Scorers, Collectors and
Custom Queries
Mikhail Khludnev
Custom Queries
Match Spotting

http://nlp.stanford.edu/IR-book/
Custom Queries
...hm, what for?
denim dress
qf=STYLE TYPE

DisjunctionMaxQuery((
  (STYLE:denim OR TYPE:denim) |
  (STYLE:dress OR TYPE:dress)
))

(
  DisjunctionMaxQuery((STYLE:denim | TYPE:denim))
) OR (
  DisjunctionMaxQuery((STYLE:dress | TYPE:dress))
)
Custom
Queries
Inverted Index
T[0] = "it is what it is"
T[1] = "what is it"
T[2] = "it is a banana"

term dictionary    postings list
"a":               {2}
"banana":          {2}
"is":              {0, 1, 2}
"it":              {0, 1, 2}
"what":            {0, 1}

index/_1.tis       index/_1.frq
"a"                {2}
"banana"           {2}
"is"               {0, 1, 2}
"it"               {0, 1, 2}
"what"             {0, 1}
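
A minimal sketch of how the term dictionary and postings lists above could be built in memory. The class name `TinyInvertedIndex` and the whitespace tokenization are assumptions made for illustration; Lucene's real index files are far more elaborate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical toy index, not a Lucene class.
final class TinyInvertedIndex {
    /** Maps each term to the sorted list of docIDs that contain it. */
    static Map<String, List<Integer>> build(String[] docs) {
        Map<String, List<Integer>> index = new TreeMap<>(); // sorted like a term dictionary
        for (int docId = 0; docId < docs.length; docId++) {
            for (String term : docs[docId].split("\\s+")) {
                List<Integer> postings = index.computeIfAbsent(term, t -> new ArrayList<>());
                // docIDs arrive in increasing order, so checking the tail is enough to skip repeats.
                if (postings.isEmpty() || postings.get(postings.size() - 1) != docId) {
                    postings.add(docId);
                }
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String[] docs = {"it is what it is", "what is it", "it is a banana"};
        build(docs).forEach((term, postings) -> System.out.println(term + ": " + postings));
    }
}
```

Running `main` on the three sample texts prints one line per term with its postings, in the same order as the slide.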
http://www.lib.rochester.edu/index.cfm?PAGE=489
What is a Scorer?
"a":      {2}
"banana": {2}
"is":     {0, 1, 2}
"it":     {0, 1, 2}
"what":   {0, 1}
int doc;
while ((doc = nextDoc()) != NO_MORE_DOCS) {
    // each iteration lands on the next matching document
    System.out.println("found " + doc + " with score " + score());
}
2783 issues
Note: Weight is omitted for the sake of compactness
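
To make the iteration contract concrete, here is a toy scorer over a single postings list. `ToyTermScorer` and its constant score of 1 are assumptions for illustration only; a real Lucene `Scorer` is created by a `Weight` and computes an actual similarity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a term scorer, not a Lucene class.
final class ToyTermScorer {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] postings; // sorted docIDs for one term
    private int pos = -1;         // -1 means "not positioned yet"

    ToyTermScorer(int... postings) { this.postings = postings; }

    /** Moves to the next matching document, or returns NO_MORE_DOCS when exhausted. */
    int nextDoc() {
        pos++;
        return docID();
    }

    int docID() {
        if (pos < 0) return -1;
        return pos < postings.length ? postings[pos] : NO_MORE_DOCS;
    }

    /** Toy score: every match counts as 1. */
    float score() { return 1f; }

    /** Drains the scorer with the same loop shape as on the slide. */
    static List<String> drain(ToyTermScorer scorer) {
        List<String> found = new ArrayList<>();
        int doc;
        while ((doc = scorer.nextDoc()) != NO_MORE_DOCS) {
            found.add("found " + doc + " with score " + scorer.score());
        }
        return found;
    }

    public static void main(String[] args) {
        drain(new ToyTermScorer(0, 1)).forEach(System.out::println); // postings of "what"
    }
}
```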
Custom
Queries

http://nlp.stanford.edu/IR-book/
Doc-at-time search
what OR is OR a OR banana

"a":      {2}
"banana": {2}
"is":     {0, 1, 2}
"it":     {0, 1, 2}
"what":   {0, 1}

Collector entries are docID×score.

collect(0), score(): 2    then Collector: 0×2
collect(1), score(): 2    then Collector: 0×2, 1×2
collect(2), score(): 3
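
The walk-through above can be sketched as a doc-at-a-time disjunction driven by a priority queue of postings cursors. This is an illustrative toy, assuming the score is simply the number of matching terms; `DocAtATimeOr` and `Cursor` are made-up names, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of doc-at-a-time OR scoring.
final class DocAtATimeOr {
    /** Cursor over one term's sorted, non-empty postings list. */
    static final class Cursor {
        final int[] postings;
        int pos = 0;
        Cursor(int[] postings) { this.postings = postings; }
        int doc() { return postings[pos]; }
        boolean advance() { return ++pos < postings.length; }
    }

    /** Returns {docID, score} pairs in docID order; score = number of matching terms. */
    static List<int[]> search(int[][] postingsPerTerm) {
        PriorityQueue<Cursor> heap = new PriorityQueue<>(Comparator.comparingInt(Cursor::doc));
        for (int[] postings : postingsPerTerm) {
            if (postings.length > 0) heap.add(new Cursor(postings));
        }
        List<int[]> collected = new ArrayList<>();
        while (!heap.isEmpty()) {
            int doc = heap.peek().doc();
            int score = 0;
            // Pop every cursor positioned on the current doc, then move it forward.
            while (!heap.isEmpty() && heap.peek().doc() == doc) {
                Cursor c = heap.poll();
                score++;
                if (c.advance()) heap.add(c); // re-insertion costs O(log q)
            }
            collected.add(new int[] {doc, score}); // plays the role of collect(doc) + score()
        }
        return collected;
    }

    public static void main(String[] args) {
        // what OR is OR a OR banana
        int[][] query = {{0, 1}, {0, 1, 2}, {2}, {2}};
        for (int[] hit : search(query)) System.out.println("doc " + hit[0] + " score " + hit[1]);
    }
}
```

Only one entry per term sits in the queue at any time, which is where the small memory footprint of this strategy comes from.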
Term-at-time search
"lorem"
"ipsum"
"dolor"
"sit"
"amet"
"consectetur"

what OR is OR a OR banana

"a":      {2}
"banana": {2}
"is":     {0, 1, 2}
"it":     {0, 1, 2}
"what":   {0, 1}

Accumulator: ... 0×1 ... 1×1 ...
Accumulator: ... 0×2 ... 1×2 ... 2×1 ...
Accumulator: ... 0×2 ... 1×2 ... 2×2 ...
Accumulator: ... 0×2 ... 1×2 ... 2×3 ...

Collector: 2×3 0×2 1×2
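
A hedged sketch of the accumulator idea: each postings list is consumed in full before the next one is touched, and scores build up in one slot per document. `TermAtATimeOr` is a hypothetical name, and counting matches stands in for real scoring.

```java
import java.util.Arrays;

// Hypothetical sketch of term-at-a-time OR scoring.
final class TermAtATimeOr {
    /** Walks one postings list at a time, adding 1 to the slot of every docID seen. */
    static int[] accumulate(int maxDoc, int[][] postingsPerTerm) {
        int[] accumulator = new int[maxDoc]; // one slot per document in the index
        for (int[] postings : postingsPerTerm) {
            for (int doc : postings) {
                accumulator[doc]++;
            }
        }
        return accumulator;
    }

    public static void main(String[] args) {
        // what OR is OR a OR banana over a 3-document index
        int[][] query = {{0, 1}, {0, 1, 2}, {2}, {2}};
        System.out.println(Arrays.toString(accumulate(3, query)));
    }
}
```

The accumulator only becomes final after the last term has been processed, so collecting has to wait until then.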
O(n)

"lorem"
"ipsum"
"dolor"
"sit"
"amet"
"consectetur"

http://nlp.stanford.edu/IR-book/
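Top k of n collected docs (docID×score):
1×9  7×9  2×7  2×5  9×5  6×4 | ... ... ≤4 ... ...

http://en.wikipedia.org/wiki/Binary_heap

Binary heap of k entries, log k per update:
       6×4
   9×5     2×4
2×7  7×9  1×9
n docs in total, the rest ... ... ≤4 ... ...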
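
The heap picture can be sketched as a bounded min-heap that keeps only the k best hits, which is where the n log k term comes from. `TopKCollector` is an illustrative name, integer scores are a simplification, and the sketch assumes k >= 1; tie-breaking between equal scores is left unspecified.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical top-k collector, not Lucene's TopDocsCollector.
final class TopKCollector {
    private final int k;
    // Min-heap on score: the root is the weakest of the current top k.
    private final PriorityQueue<int[]> heap =
            new PriorityQueue<>(Comparator.comparingInt((int[] hit) -> hit[1]));

    TopKCollector(int k) { this.k = k; }

    /** Offers a {docID, score} pair; costs O(log k) when the heap changes. */
    void collect(int doc, int score) {
        if (heap.size() < k) {
            heap.add(new int[] {doc, score});
        } else if (score > heap.peek()[1]) {
            heap.poll();                      // evict the weakest hit
            heap.add(new int[] {doc, score});
        }                                     // otherwise the hit is not competitive
    }

    /** Returns the retained hits, best score first. */
    List<int[]> topDocs() {
        List<int[]> result = new ArrayList<>(heap);
        result.sort((a, b) -> Integer.compare(b[1], a[1]));
        return result;
    }
}
```

Most hits in a large result set fail the comparison against the heap root and are dropped in constant time.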
"a":

{2}

"banana": {2}
"is":

{0, 1, 2}

"it":

{0, 1, 2}

"what":

{0, 1}

p

what OR is OR a OR banana

q
"a":

{2}

"banana": {2}

q

"is":

1

{0, 1, 2}

1
2

"what":

{0, 1}

2
              doc at time             term at time
complexity    O(p log q + n log k)    O(p + n log k)
memory        q + k                   n
BooleanScorer
org.apache.lucene.search.BooleanScorer
"a":      {2}
"banana": {2}
"is":     {0, 1, 2}
"it":     {0, 1, 2}
"what":   {0, 1}

chunk Hashtable[2], slots 0 and 1

first chunk:  ×1 ×1, then ×2 ×2      Collector: 0×2 1×2
next chunk:   ×1, then ×2, then ×3   Collector: 2×3 0×2 1×2
Linked Open Hash [2K]

×2
0

×3

×1

×1

×5

×2

1

2

3

4

5

6

7
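
A rough sketch of the chunked idea: term-at-a-time accumulation restricted to a window of docIDs, with the bucket table reused for every window. `ChunkedTermAtATime` is a made-up name, postings are assumed sorted with docIDs below `maxDoc`, and the sketch flushes each window in docID order for simplicity, which the real scorer is not required to do.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of windowed term-at-a-time scoring for a pure OR query.
final class ChunkedTermAtATime {
    /** Scores window by window; returns {docID, score} pairs. */
    static List<int[]> search(int maxDoc, int chunkSize, int[][] postingsPerTerm) {
        List<int[]> collected = new ArrayList<>();
        int[] cursor = new int[postingsPerTerm.length]; // read position per term
        int[] buckets = new int[chunkSize];             // reused for every window
        for (int base = 0; base < maxDoc; base += chunkSize) {
            int end = Math.min(base + chunkSize, maxDoc);
            // Term-at-a-time inside the window [base, end).
            for (int t = 0; t < postingsPerTerm.length; t++) {
                int[] postings = postingsPerTerm[t];
                while (cursor[t] < postings.length && postings[cursor[t]] < end) {
                    buckets[postings[cursor[t]] - base]++;
                    cursor[t]++;
                }
            }
            // Flush non-empty buckets to the collector and reset them.
            for (int slot = 0; slot < end - base; slot++) {
                if (buckets[slot] > 0) {
                    collected.add(new int[] {base + slot, buckets[slot]});
                    buckets[slot] = 0;
                }
            }
        }
        return collected;
    }
}
```

With `chunkSize = 2` and the sample index, the first window yields docs 0 and 1 with score 2 each, and the second window builds doc 2 up to score 3, matching the slides. Memory stays bounded by the chunk size rather than by the index size.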
if (collector.acceptsDocsOutOfOrder()
        && topScorer
        && required.size() == 0
        && minNrShouldMatch == 1) {
    new BooleanScorer...   // term-at-time
} else {
    new BooleanScorer2...  // doc-at-time
}
q=village operations years disaster visit
q=village operations years disaster visit etc
map seventieth peneplains tussock sir
memory character campaign author publi...
q=+village +operations +years +disaster +visit
Conjunction
(+, MUST)

what AND is AND a AND it

"a":      {2, 3}
"banana": {2, 3}
"is":     {0, 1, 2, 3}
"it":     {0, 1, 3}
"what":   {0, 1, 3}

Collector: 3×4
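
A minimal leapfrog sketch for the conjunction above: whichever list jumps past the current candidate supplies the next candidate. `LeapfrogAnd` is a hypothetical name, and `advance` is implemented with a plain binary search here, an assumption made to keep the sketch short rather than a description of Lucene's skipping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of leapfrog intersection over sorted postings lists.
final class LeapfrogAnd {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Returns the first docID >= target in a sorted postings list, or NO_MORE_DOCS. */
    static int advance(int[] postings, int target) {
        int i = Arrays.binarySearch(postings, target);
        if (i < 0) i = -i - 1; // convert "not found" into the insertion point
        return i < postings.length ? postings[i] : NO_MORE_DOCS;
    }

    /** Intersects all lists by letting each one leap to the current candidate. */
    static List<Integer> intersect(int[][] postingsPerTerm) {
        List<Integer> matches = new ArrayList<>();
        if (postingsPerTerm.length == 0) return matches;
        int candidate = advance(postingsPerTerm[0], 0);
        while (candidate != NO_MORE_DOCS) {
            boolean agreed = true;
            for (int[] postings : postingsPerTerm) {
                int doc = advance(postings, candidate);
                if (doc != candidate) { // this list jumped past the candidate
                    candidate = doc;    // so its doc becomes the new candidate
                    agreed = false;
                    break;
                }
            }
            if (agreed) {
                matches.add(candidate);
                candidate = advance(postingsPerTerm[0], candidate + 1);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // what AND is AND a AND it
        int[][] query = {{0, 1, 3}, {0, 1, 2, 3}, {2, 3}, {0, 1, 3}};
        System.out.println(intersect(query));
    }
}
```

Docs 0 and 1 are never scored: the list for "a" leaps straight to 2, and "what" then leaps to 3, the only document all four lists agree on.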
http://www.flickr.com/photos/fatniu/184615348/
Ω(n q + n log k)
Wrap-up
● doc-at-time vs term-at-time
● conjunction and leapfrog
complexity

O(n)

memory

O(const)
Custom
Queries

http://nlp.stanford.edu/IR-book/
Custom Queries
● Sample Coverage Query
● Deeply Branched vs Flat
● minShouldMatch
● Filtering
● Performance Problem
silver jeans dress

"silver" "jeans" "dress"
"silver jeans dress"
"silver jeans" "dress"
"silver" "jeans dress"
"silver" "d...

Note: "foo bar" is not a phrase query, just a string
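boolean verifyMatch() {
    int sumLength = 0;
    for (Scorer child : getChildren()) {
        if (child.docID() == docID()) {   // child matches the current doc
            TermQuery tq = child.weight.query;
            sumLength += tq.term.text.length;
        }
    }
    return sumLength >= expectedLength;
}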
Deeply Branched vs Flat
(+"silver jeans" +"dress")
ORmax
(+"silver jeans dress")
ORmax
(+"silver" +(
    (+"jeans" +"dress")
    ORmax
    +"jeans dress"
))

("silver jeans" "dress")
ORmax
("silver jeans dress")
ORmax
("silver" (
    ("jeans" "dress")
    ORmax
    "jeans dress"
))

ORmax is DisjunctionMaxQuery
+

B:"silver jeans" ORmax
T:"silver jeans" ORmax
S:"silver jeans"

+

B:"dress" ORmax
T:"dress" ORmax
S:"dress"

B - BRAND...
B:"silver"

T:"silver"

S:"silver"

B:"jeans"

T:"jeans"

S:"jeans"

B:"dress"

T:"dress"

S:"dress"

B:"silver jeans"

T:...
Steadiness problem
AFAIK 3.x only.
{1, 3, 7, 10, 27,30,..}
{3, 5, 10, 27,32,..}
{2,3, 27,31,..}
{..., 20, 27,32,..}
{..., 30, 31,32,..}
{..., 30,37,..}
3
3 2...
{3, 5, 10, 27,32,..}
{1, 3, 7, 10, 27,30,..}
{2,3, 27,31,..}
{..., 20, 27,32,..}
{..., 30, 31,32,..}
{..., 30,37,..}
docID...
minShouldMatch
straight silver jeans

minShouldMatch=2
straight jeans
silver jeans
silver jeans straight
jeans
silver
org.apache.lucene.search.DisjunctionSumScorer
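int nextDoc() {
  while (true) {
    while (subScorers[0].docID() == doc) {
      if (subScorers[0].nextDoc() != NO_MORE_DOCS) { heapAdjust(0); } else { .... }
    }
    ...
    if (nrMatchers >= minimumNrMatchers) { break; }
  }
  return doc;
}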
Let’s filter!
btw, what is it?
RANDOM_ACCESS_FILTER_STRATEGY
LEAP_FROG_FILTER_FIRST_STRATEGY
LEAP_FROG_QUERY_FIRST_STRATEGY
QUERY_FIRST_FILTER_STRATEGY
http://localhost:8983/solr/collection1/select
?q=village operations years disaster visit etc map
seventieth peneplains tus...
http://localhost:8983/solr/collection1/select
?q=village operations years disaster visit etc map
seventieth peneplains tus...
http://localhost:8983/solr/collection1/select
?q=village operations years disaster visit etc map
seventieth peneplains tus...
{1, 3, 7, 10, 27, 30, ..}
{3, 5, 10, 27, 32, ..}
{20, 27, 31, ..}
{30, 37, ..}
mm=3
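
To show what mm=3 selects from lists like these, here is a small sketch that counts matching clauses per document. It uses a map instead of the scorer heap, so it illustrates the matching rule only, not how DisjunctionSumScorer iterates; `MinShouldMatch` is a made-up name and the postings are the visible prefixes from the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the minShouldMatch rule.
final class MinShouldMatch {
    /** Returns docIDs (ascending) that occur in at least mm of the given postings lists. */
    static List<Integer> search(int mm, int[][] postingsPerClause) {
        Map<Integer, Integer> matchers = new TreeMap<>(); // docID -> number of matching clauses
        for (int[] postings : postingsPerClause) {
            for (int doc : postings) {
                matchers.merge(doc, 1, Integer::sum);
            }
        }
        List<Integer> hits = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : matchers.entrySet()) {
            if (e.getValue() >= mm) hits.add(e.getKey()); // same test as nrMatchers >= minimumNrMatchers
        }
        return hits;
    }

    public static void main(String[] args) {
        int[][] clauses = {{1, 3, 7, 10, 27, 30}, {3, 5, 10, 27, 32}, {20, 27, 31}, {30, 37}};
        System.out.println(search(3, clauses));
    }
}
```

On these prefixes only doc 27 appears in three lists. A disjunction that steps through every candidate still visits 1, 3, 5, 7, ... on the way there, which is the cost the slides point at.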
Custom
Queries

Match Spotting

http://nlp.stanford.edu/IR-book/
BRAND:"silver jeans"
BRAND:"alfani"

TYPE:"dress"

TYPE:"dress"

BRAND:"chaloree"

TYPE:"dress"

STYLE:"white"

STYLE:"sil...
BRAND:"silver jeans"
BRAND:"alfani"

TYPE:"dress" STYLE:"white"

TYPE:"dress"

BRAND:"chaloree"

TYPE:"dress"

STYLE:"silv...
BRAND:"silver jeans"

TYPE:"dress"

TYPE:"dress"

STYLE:"silver","jeans"

TYPE:"jeans dress"
BRAND:"silver jeans"

TYPE:"d...
BRAND:"silver jeans"

TYPE:"dress" (4)

TYPE:"dress"

STYLE:"silver","jeans"

TYPE:"jeans dress"

TYPE:"dress"

STYLE:"sil...
BRAND:"silver jeans"

TYPE:"dress" (4)

TYPE:"dress" STYLE:"silver","jeans" (3)

TYPE:"jeans dress"

STYLE:"silver"

TYPE:...
BRAND:"silver jeans"

TYPE:"dress" (4)

TYPE:"dress" STYLE:"silver","jeans" (3)

TYPE:"jeans dress"

STYLE:"silver" (2)
silver jeans dress
BRAND:"silver jeans" TYPE:"dress"

(4)

TYPE:"dress" STYLE:"silver","jeans" (3)
TYPE:"jeans dress" STYL...
http://goo.gl/7LJFi

Scorers, Collectors and
Custom Queries
http://google.com/+MikhailKhludnev
Appendixes
● Drill Sideways Facets
● Collectors
Appendix D

Drill Sideways Facets
+CATEGORY: Denim
+FIT: Straight
+WASH: Dark&B
+CATEGORY: Denim
+WASH: Dark&B

+CATEGORY: Denim
+FIT: Straight
+WASH: Dark&B
+CATEGORY: Denim
+WASH: Dark&B

+CATEGORY: Denim
+FIT: Straight
+WASH: Dark&B

+CATEGORY: Denim
+FIT: Straight
+CATEGORY: Denim
FIT: Straight
WASH: Dark&Black
...
/minShouldMatch=Ndrilldowns-1
FIT: Straight
+CAT: Denim

WASH: Dark
FIT: Straight
near miss
2
totalHits
3
near miss
2
WASH: Dark

+CAT: Denim
Doc at time
base query is highly selective
+CAT:D..  {1, 7, 9, 15}
FIT:S..   {2, 7, 8, 9, 10, 12}
WASH:D..  {2, 7, 11, 13, 15}
...

TopDocsCollector
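
A hedged sketch of the doc-at-a-time variant on these postings: every document of the base query is checked against each drill-down, and documents failing exactly one dimension are kept as near misses for that dimension. `DrillSidewaysSketch` and its fields are illustrative names, not the Lucene facet API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of drill-sideways classification.
final class DrillSidewaysSketch {
    final List<Integer> hits = new ArrayList<>();
    final Map<String, List<Integer>> nearMisses = new LinkedHashMap<>(); // keyed by the failed dimension

    static boolean contains(int[] sortedPostings, int doc) {
        return Arrays.binarySearch(sortedPostings, doc) >= 0;
    }

    /** Classifies every doc of the base query by how many drill-down dimensions it fails. */
    static DrillSidewaysSketch run(int[] base, String[] dims, int[][] drillDowns) {
        DrillSidewaysSketch out = new DrillSidewaysSketch();
        for (String dim : dims) out.nearMisses.put(dim, new ArrayList<>());
        for (int doc : base) {
            int failed = 0;
            String failedDim = null;
            for (int d = 0; d < dims.length; d++) {
                if (!contains(drillDowns[d], doc)) {
                    failed++;
                    failedDim = dims[d];
                }
            }
            if (failed == 0) {
                out.hits.add(doc);                      // matches base and all drill-downs
            } else if (failed == 1) {
                out.nearMisses.get(failedDim).add(doc); // counts toward that dimension's facet only
            }                                           // two or more failures: ignored
        }
        return out;
    }
}
```

For the sample lists, doc 7 is the only full hit, doc 15 is a near miss on FIT, doc 9 is a near miss on WASH, and doc 1 fails both drill-downs and is dropped. Driving the loop from the base query keeps the work small when that query is highly selective.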
Term at time
drilldown queries are highly selective
+CAT:D..{1, 7, 9, 15 }
FIT:S.. {2, 7, 8, 9, 10,12}
WASH:D..{2, 7, 11,13,15}
...

hits
1
miss
Fit

1

2

...

hits
1
miss
F...
+CAT:D..{1, 7, 9, 15 }
FIT:S.. {2, 7, 8, 9, 10,12}
WASH:D..{2, 7, 11,13,15}
...

hits
2
miss
no

1

2

...

hits hits hits...
+CAT:D..{1, 7, 9, 15 }
FIT:S.. {2, 7, 8, 9, 10,12}
WASH:D..{2, 7, 11,13,15}
...

hits
2
miss
Cat

1

2

...

hits hits hit...
hits
2
miss
Cat

1

2

...

hits hits hits hits hits hits hits hits
3
1
1
1
1
2
2
1
miss miss miss miss miss miss miss mis...
TopDocsCollector

hits
3
miss

...
1

2

no
7

hits
2
miss
Fit

hits
2
miss
Wash

8

9

10

11

12

13 15
Collector

DocSetCollector

TopDocsCollector

TopFieldCollector
TopScoreDocCollector
DocSet or DocList?
long [952045] = { 0, 0, 0, 0, 2050, 0, 0, 8, 0, 0, 0,... }

int [2079] = {4, 12, 45, 67, 103, 673, 5890...
                          DocList/TopDoc         DocSet
Size                      k (numHits or rows)    N (maxDocs)
Ordered by                score or field         docID
Out-of-order collecting   allows*                almost could allow (No)
?×4

6×4
9×5 2×4
2×7 7×9 1×9
http://www.flickr.com/photos/jbagley/4303976811/sizes/o/
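class OutOfOrderTopScoreDocCollector {
    boolean acceptsDocsOutOfOrder() { return true; }
    ..
    void collect(int doc) {
        float score = scorer.score();
        ...
        // equal scores: compare against the docID at the top of the queue
        if (score == pqTop.score && doc > pqTop.doc) {
            ...
        }
    }
}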
UML
http://www.flickr.com/photos/kristykay/2922670979/lightbox/
Lucene Search Essentials: Scorers, Collectors and Custom Queries
Presented by Mikhail Khludnev, Principal Engineer, Grid Dynamics

My team is building a next-generation eCommerce search platform for a major online retailer with quite challenging business requirements. It turns out that the default Lucene toolbox is not an ideal fit for those challenges, so the team had to hack deep into the Lucene core to achieve our goals. Along the way we accumulated quite a deep understanding of Lucene search internals and want to share our experience. We will start with an API overview, then look at essential search algorithms and their implementations in Lucene. Finally, we will review a few cases of query customization, pitfalls, and common performance problems.

Published in: Technology, Business
Transcript of "Lucene Search Essentials: Scorers, Collectors and Custom Queries"

  1. 1. Scorers, Collectors and Custom Queries Mikhail Khludnev
  2. 2. Custom Queries
  3. 3. Custom Queries
  4. 4. Custom Queries http://nlp.stanford.edu/IR-book/
  5. 5. Custom Queries http://nlp.stanford.edu/IR-book/
  6. 6. Custom Queries Match Spotting http://nlp.stanford.edu/IR-book/
  7. 7. Custom Queries ..hm what for ?
  8. 8. denim dress qf=STYLE TYPE
  9. 9. denim dress qf=STYLE TYPE DisjunctionMaxQuery(( (STYLE:denim OR TYPE:denim) | (STYLE:dress OR TYPE:dress) ))
  10. 10. denim dress qf=STYLE TYPE ( DisjunctionMaxQuery(( STYLE:denim | TYPE:denim )) )OR( DisjunctionMaxQuery(( STYLE:dress | TYPE:dress )) )
  11. 11. Custom Queries
  12. 12. Inverted Index
  13. 13. T[0] = "it is what it is" T[1] = "what is it" T[2] = "it is a banana"
  14. 14. "a": "banana": "is": "it": "what": {2} {2} {0, 1, 2} {0, 1, 2} {0, 1} T[0] = "it is what it is" T[1] = "what is it" T[2] = "it is a banana"
  15. 15. "a": "banana": "is": "it": "what": {2} {2} {0, 1, 2} {0, 1, 2} {0, 1} term dictionary postings list
  16. 16. index/_1.tis "a" "banana" "is" →"t" "what" index/_1.frq {2} {2} {0, 1, 2} {0, 1, 2} {0, 1}
  17. 17. http://www.lib.rochester.edu/index.cfm?PAGE=489
  18. 18. What is a Scorer?
  19. 19. "a": "banana": "is": "it": "what": {2} {2} {0, 1, 2} {0, 1, 2} {0, 1}
  20. 20. "a": "banana": "is": "it": "what": {2} {2} {0, 1, 2} {0, 1, 2} {0, 1}
  21. 21. "a": "banana": "is": "it": "what": {2} {2} {0, 1, 2} {0, 1, 2} {0, 1}
  22. 22. while( (doc = nextDoc())!=NO_MORE_DOCS){ println("found "+ doc + " with score "+score()); }
  23. 23. 2783 issues
  24. 24. Note: Weight is omitted for sake of compactness
  25. 25. Custom Queries http://nlp.stanford.edu/IR-book/
  26. 26. Doc-at-time search
  27. 27. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} what OR is OR a OR banana
  28. 28. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} what OR is OR a OR banana
  29. 29. "is": {0, 1, 2} "what": {0, 1} "a": {2} "banana": {2} "it": {0, 1, 2}
  30. 30. "is": {0, 1, 2} "what": {0, 1} "a": {2} "banana": {2} collect(0) score():2 Collector
  31. 31. "is": {0, 1, 2} "what": {0, 1} "a": {2} "banana": {2} docID×score 0×2
  32. 32. "is": {0, 1, 2} "what": {0, 1} "a": {2} "banana": {2} collect(1) score():2 Collector 0×2
  33. 33. "is": {0, 1, 2} "what": {0, 1} "a": {2} "banana": {2} Collector 0×2 1×2
  34. 34. "is": {0, 1, 2} "a": {2} "banana": {2} "what": {0, 1} collect(2) score():3 Collector 0×2 1×2
  35. 35. Term-at-time search "lorem" "ipsum" "dolor" "sit" "amet" "consectetur"
  36. 36. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} what OR is OR a OR banana
  37. 37. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Accumulator ... 0×1 ... 1×1 ...
  38. 38. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Accumulator ... 0×2 ... 1×2 ... 2×1 ...
  39. 39. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Accumulator ... 0×2 ... 1×2 ... 2×2 ...
  40. 40. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Accumulator ... 0x2 ... 1x2 ... 2x3 ...
  41. 41. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Accumulator ... 0×2 ... 1×2 ... 2×3 ... Collector 2×3 0×2 1×2
  42. 42. O(n) "lorem" "ipsum" "dolor" "sit" "amet" "consectetur" http://nlp.stanford.edu/IR-book/
  43. 43. k 1×9 7×9 2×7 2×5 9×5 6×4 ... ... ≤4 ... ... n
  44. 44. http://en.wikipedia.org/wiki/Binary_heap
  45. 45. 6×4 log k 9×5 2×4 2×7 7×9 1×9 n ... ... ≤4 ... ...
  46. 46. "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} p what OR is OR a OR banana q
  47. 47. doc at time complexity memory term at time
  48. 48. doc at time complexity memory term at time O(p + n log k)
  49. 49. "a": {2} "banana": {2} q "is": 1 {0, 1, 2} 1 2 "what": {0, 1} 2
  50. 50. doc at time complexity memory term at time O(p log q + n log k) O(p + n log k)
  51. 51. doc at time complexity memory term at time O(p log q + n log k) O(p + n log k) q + k
  52. 52. doc at time complexity memory term at time O(p log q + n log k) O(p + n log k) q + k n
  53. 53. BooleanScorer
  54. 54. org.apache.lucene.search.BooleanScorer "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} chunk Hashtable[2] ×1 ×1 0 1
  55. 55. org.apache.lucene.search.BooleanScorer "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} chunk x2 x2 0 1
  56. 56. org.apache.lucene.search "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} ×2 ×2 0 1 Collector 0×2 1×2
  57. 57. org.apache.lucene.search "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Collector 0×2 1×2 ×1 0 1
  58. 58. org.apache.lucene.search "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Collector 0×2 1×2 ×2 0 1
  59. 59. org.apache.lucene.search "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} Collector 0×2 1×2 ×3 0 1
  60. 60. org.apache.lucene.search "a": {2} "banana": {2} "is": {0, 1, 2} "it": {0, 1, 2} "what": {0, 1} ×3 0 1 Collector 2×3 0×2 1×2
  61. 61. Linked Open Hash [2K] ×2 0 ×3 ×1 ×1 ×5 ×2 1 2 3 4 5 6 7
  62. 62. if ( collector.acceptsDocsOutOfOrder() && topScorer && required.size() == 0 && minNrShouldMatch == 1) { new BooleanScorer else //term-at-time new BooleanScorer2 //doc-at-time
  63. 63. q=village operations years disaster visit
  64. 64. q=village operations years disaster visit etc map seventieth peneplains tussock sir memory character campaign author public wonder forker middy vocalize enable race object signal symptom deputy where typhous rectifiable polygamous originally look generation ultimately reasonably ratio numb apposing enroll manhood problem suddenly definitely corp event material affair diploma would dimout speech notion engine artist hotel text field hashed rottener impeding i cricket virtually valley sunday rock come observes gallnuts vibrantly prize involve
  65. 65. q=+village +operations +years +disaster +visit
  66. 66. Conjunction (+, MUST)
  67. 67. "a": {2,3} "banana": {2,3} "is": {0, 1, 2, 3} "it": {0, 1, 3} "what": {0, 1, 3} what AND is AND a AND it
  68. 68. "a": {2,3} "banana": {2,3} "is": {0, 1, 2, 3} "it": {0, 1, 3} "what": {0, 1, 3}
  69. 69. "a": {2,3} "banana": {2,3} "is": {0, 1, 2, 3} "it": {0, 1, 3} "what": {0, 1, 3}
  70. 70. "a": {2,3} "banana": {2,3} "is": {0, 1, 2, 3} "it": {0, 1, 3} "what": {0, 1, 3}
  71. 71. "a": {2,3} "banana": {2,3} "is": {0, 1, 2, 3} "it": {0, 1, 3} "what": {0, 1, 3}
  72. 72. "a": {2,3} "banana": {2,3} "is": {0, 1, 2, 3} "it": {0, 1, 3} "what": {0, 1, 3} Collector 3x4
  73. 73. http://www.flickr.com/photos/fatniu/184615348/
  74. 74. Ω(n q + n log k)
  75. 75. Wrap-up ● doc-at-time vs term-at-time ● conjunction and leapfrog
  76. 76. complexity O(n) memory O(const)
  77. 77. Custom Queries http://nlp.stanford.edu/IR-book/
  78. 78. Custom Queries ● Sample Coverage Query ● Deeply Branched vs Flat ● minShouldMatch ● Filtering ● Performance Problem
  79. 79. silver jeans dress "silver" "jeans" Note: "foo bar" is not a phrase query, just a string "dress"
  80. 80. silver jeans dress "silver" "jeans" "dress" "silver jeans dress"
  81. 81. silver jeans dress "silver" "jeans" "dress" "silver jeans dress" "silver jeans" "dress" "silver" "jeans dress"
  82. 82. silver jeans dress "silver" "jeans" "dress" "silver jeans dress" "silver jeans" "dress" "silver" "jeans dress" "silver" "dress" "silver jeans" "jeans" "silver jeans" "jeans" "dress" Note: "foo bar" is not a phrase query, just a string
  83. 83. boolean verifyMatch(){ int sumLength=0; for(Scorer child:getChildren()){ if(child.docID()==docID()){ TermQuery tq=child.weight.query; sumLength += tq.term.text.length; } } return sumLength>=expectedLength; }
  84. 84. Deeply Branched vs Flat
  85. 85. (+"silver jeans" +"dress") ORmax (+"silver jeans dress") ORmax (+"silver" +( (+"jeans" +"dress") ORmax +"jeans dress" ) ) ORmax is DisjunctionMaxQuery
  86. 86. (+"silver jeans" +"dress") ORmax (+"silver jeans dress") ORmax (+"silver" +( (+"jeans" +"dress") ORmax +"jeans dress" ) ) ORmax is DisjunctionMaxQuery
  87. 87. (+"silver jeans" +"dress") ORmax (+"silver jeans dress") ORmax (+"silver" +( (+"jeans" +"dress") ORmax +"jeans dress" ) ) ORmax is DisjunctionMaxQuery
  88. 88. ("silver jeans" "dress") ORmax ("silver jeans dress") ORmax ("silver" ( ("jeans" "dress") ORmax "jeans dress" ) ) ORmax is DisjunctionMaxQuery
  89. 89. + B:"silver jeans" ORmax T:"silver jeans" ORmax S:"silver jeans" + B:"dress" ORmax T:"dress" ORmax S:"dress" B - BRAND T - TYPE S - STYLE ORmax B:"silver jeans dress" ORmax T:"silver jeans dress" ORmax S:"silver jeans dress" ORmax + B:"silver" ORmax T:"silver" ORmax S:"silver" + + B:"jeans" ORmax T:"jeans" ORmax S:"jeans" + B:"dress" ORmax T:"dress" ORmax S:"dress" ORmax B:"jeans dress" ORmax T:"jeans dress" ORmax S:"jeans dress"
  90. 90. B:"silver" T:"silver" S:"silver" B:"jeans" T:"jeans" S:"jeans" B:"dress" T:"dress" S:"dress" B:"silver jeans" T:"silver jeans" S:"silver jeans" B:"silver jeans dress" T:"silver jeans dress" S:"silver jeans dress" B:"jeans dress" T:"jeans dress" S:"jeans dress"
  91. 91. Steadiness problem AFAIK 3.x only.
  92. 92. {1, 3, 7, 10, 27,30,..} {3, 5, 10, 27,32,..} {2,3, 27,31,..} {..., 20, 27,32,..} {..., 30, 31,32,..} {..., 30,37,..} 3 3 20 3 30 30
  93. 93. {3, 5, 10, 27,32,..} {1, 3, 7, 10, 27,30,..} {2,3, 27,31,..} {..., 20, 27,32,..} {..., 30, 31,32,..} {..., 30,37,..} docID= 3 5 7 20 27 30 30 3.x
  94. 94. minShouldMatch
  95. 95. straight silver jeans minShouldMatch=2 straight jeans silver jeans silver jeans straight jeans silver
  96. 96. org.apache.lucene.search.DisjunctionSumScorer int nextDoc() { while(true) { while (subScorers[0].docID() == doc) { if (subScorers[0].nextDoc() != NO_DOCS) { heapAdjust(0); } else { .... } } ... if (nrMatchers >= minimumNrMatchers) { break; } } return doc; }
  97. 97. Let’s filter! btw, what is it?
  98. 98. RANDOM_ACCESS_FILTER_STRATEGY LEAP_FROG_FILTER_FIRST_STRATEGY LEAP_FROG_QUERY_FIRST_STRATEGY QUERY_FIRST_FILTER_STRATEGY
  99. 99. http://localhost:8983/solr/collection1/select ?q=village operations years disaster visit etc map seventieth peneplains tussock sir memory character campaign author public wonder forker middy vocalize enable race object signal symptom deputy where generation ultimately reasonably ratio numb apposing enroll manhood problem suddenly definitely corp event gallnuts vibrantly prize involve explanation module& qf=text_all&defType=edismax&
  100. 100. http://localhost:8983/solr/collection1/select ?q=village operations years disaster visit etc map seventieth peneplains tussock sir memory character campaign author public wonder forker middy vocalize enable race object signal symptom deputy where generation ultimately reasonably ratio numb apposing enroll manhood problem suddenly definitely corp event gallnuts vibrantly prize involve explanation module& qf=text_all&defType=edismax& fq= id:yes_49912894 id:nurse_30134968&
  101. 101. http://localhost:8983/solr/collection1/select ?q=village operations years disaster visit etc map seventieth peneplains tussock sir memory character campaign author public wonder forker middy vocalize enable race object signal symptom deputy where generation ultimately reasonably ratio numb apposing enroll manhood problem suddenly definitely corp event gallnuts vibrantly prize involve explanation module& qf=text_all&defType=edismax& fq= id:yes_49912894 id:nurse_30134968& mm=32&
  102. 102. {1, 3, 7, 10, 27,30,..} {3, 5, 10, 27,32,..} { 20,27,31,..} mm=3 { 30,37,..}
  103. 103. {1, 3, 7, 10, 27,30,..} {3, 5, 10, 27,32,..} { 20,27,31,..} mm=3 { 30,37,..}
  104. 104. {1, 3, 7, 10, 27,30,..} {3, 5, 10, 27,32,..} { 20,27,31,..} mm=3 { 30,37,..}
  105. 105. {1, 3, 7, 10, 27,30,..} {3, 5, 10, 27,32,..} { 20,27,31,..} mm=3 { 30,37,..}
  106. 106. {1, 3, 7, 10, 27,30,..} {3, 5, 10, 27,32,..} { 20,27,31,..} mm=3 { 30,37,..}
  107. 107. Custom Queries Match Spotting http://nlp.stanford.edu/IR-book/
  108. 108. BRAND:"silver jeans" BRAND:"alfani" TYPE:"dress" TYPE:"dress" BRAND:"chaloree" TYPE:"dress" STYLE:"white" STYLE:"silver","jeans" STYLE:"silver" BRAND:"style&co" TYPE:"jeans dress" STYLE:"silver" BRAND:"silver jeans" TYPE:"dress" STYLE:"black" BRAND:"silver jeans" TYPE:"dress" STYLE:"white" BRAND:"silver jeans" TYPE:"jacket" STYLE: "black" BRAND:"angie" TYPE:"dress" STYLE:"silver","jeans" BRAND:"chaloree" TYPE:"jeans dress" STYLE:"silver" BRAND:"silver jeans" BRAND:"dotty" BRAND:"chaloree" TYPE:"dress" TYPE:"dress" STYLE:"blue" STYLE:"silver","jeans" STYLE:"jeans" "dress"
  109. 109. BRAND:"silver jeans" BRAND:"alfani" TYPE:"dress" TYPE:"dress" BRAND:"chaloree" TYPE:"dress" STYLE:"white" STYLE:"silver","jeans" STYLE:"silver" BRAND:"style&co" TYPE:"jeans dress" STYLE:"silver" BRAND:"silver jeans" TYPE:"dress" STYLE:"black" BRAND:"silver jeans" TYPE:"dress" silver jeans dress STYLE:"white" BRAND:"silver jeans" STYLE: "black" BRAND:"angie" TYPE:"jacket" TYPE:"dress" STYLE:"silver","jeans" BRAND:"chaloree" TYPE:"jeans dress" STYLE:"silver" BRAND:"silver jeans" BRAND:"dotty" BRAND:"chaloree" TYPE:"dress" TYPE:"dress" STYLE:"blue" STYLE:"silver","jeans" STYLE:"jeans" "dress"
  110. 110. BRAND:"silver jeans" BRAND:"alfani" TYPE:"dress" STYLE:"white" TYPE:"dress" BRAND:"chaloree" TYPE:"dress" STYLE:"silver","jeans" STYLE:"silver" BRAND:"style&co" TYPE:"jeans dress" STYLE:"silver" BRAND:"silver jeans" TYPE:"dress" STYLE:"black" BRAND:"silver jeans" TYPE:"dress" STYLE:"white" BRAND:"silver jeans" BRAND:"angie" TYPE:"jacket" TYPE:"dress" STYLE: "black" STYLE:"silver","jeans" BRAND:"chaloree" TYPE:"jeans dress" BRAND:"silver jeans" BRAND:"dotty" BRAND:"chaloree" STYLE:"silver" TYPE:"dress" STYLE:"blue" TYPE:"dress" STYLE:"silver","jeans" STYLE:"jeans" "dress"
  111. 111. BRAND:"silver jeans" TYPE:"dress" TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" BRAND:"silver jeans" TYPE:"dress" BRAND:"silver jeans" STYLE:"silver" TYPE:"dress" TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" BRAND:"silver jeans" STYLE:"silver" TYPE:"dress" TYPE:"dress" STYLE:"silver","jeans"
  112. 112. BRAND:"silver jeans" TYPE:"dress" TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" BRAND:"silver jeans" TYPE:"dress" BRAND:"silver jeans" STYLE:"silver" TYPE:"dress" TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" BRAND:"silver jeans" STYLE:"silver" TYPE:"dress" TYPE:"dress" STYLE:"silver","jeans"
  113. 113. BRAND:"silver jeans" TYPE:"dress" (4) TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" TYPE:"dress" STYLE:"silver" STYLE:"silver" STYLE:"silver","jeans"
  114. 114. BRAND:"silver jeans" TYPE:"dress" (4) TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" TYPE:"dress" STYLE:"silver","jeans" TYPE:"jeans dress" TYPE:"dress" STYLE:"silver" STYLE:"silver" STYLE:"silver","jeans"
  115. 115. BRAND:"silver jeans" TYPE:"dress" (4) TYPE:"dress" STYLE:"silver","jeans" (3) TYPE:"jeans dress" STYLE:"silver" TYPE:"jeans dress" STYLE:"silver"
  116. 116. BRAND:"silver jeans" TYPE:"dress" (4) TYPE:"dress" STYLE:"silver","jeans" (3) TYPE:"jeans dress" STYLE:"silver" (2)
  117. 117. silver jeans dress BRAND:"silver jeans" TYPE:"dress" (4) TYPE:"dress" STYLE:"silver","jeans" (3) TYPE:"jeans dress" STYLE:"silver" (2)
  118. 118. silver jeans dress BRAND:"silver jeans" TYPE:"dress" (4) TYPE:"dress" STYLE:"silver","jeans" (3) TYPE:"jeans dress" STYLE:"silver" (2)
  119. 119. http://goo.gl/7LJFi Scorers, Collectors and Custom Queries http://google.com/+MikhailKhludnev
  120. 120. Appendixes ● Drill Sideways Facets ● Collectors
  121. 121. Appendix D Drill Sideways Facets
  122. 122. +CATEGORY: Denim +FIT: Straight +WASH: Dark&B
  123. 123. +CATEGORY: Denim +WASH: Dark&B +CATEGORY: Denim +FIT: Straight +WASH: Dark&B
  124. 124. +CATEGORY: Denim +WASH: Dark&B +CATEGORY: Denim +FIT: Straight +WASH: Dark&B +CATEGORY: Denim +FIT: Straight
  125. 125. +CATEGORY: Denim FIT: Straight WASH: Dark&Black ... /minShouldMatch=Ndrilldowns-1
  126. 126. FIT: Straight +CAT: Denim WASH: Dark
  127. 127. FIT: Straight near miss 2 totalHits 3 near miss 2 WASH: Dark +CAT: Denim
  128. 128. FIT: Straight near miss 2 totalHits 3 near miss 2 WASH: Dark +CAT: Denim
  129. 129. FIT: Straight near miss 2 totalHits 3 near miss 2 WASH: Dark +CAT: Denim
  130. 130. Doc at time base query is highly selective
  131. 131. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ...
  132. 132. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ...
  133. 133. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ...
  134. 134. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... TopDocsCollector
  135. 135. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... TopDocsCollector
  136. 136. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... TopDocsCollector
  137. 137. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... TopDocsCollector
  138. 138. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... TopDocsCollector
  139. 139. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... TopDocsCollector
  140. 140. Term at time drilldown queries are highly selective
  141. 141. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... hits 1 miss Fit 1 2 ... hits 1 miss Fit 7 hits 1 miss Fit 8 9 10 11 hits hits 1 1 miss miss Fit Fit 12 13 15
  142. 142. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... hits 2 miss no 1 2 ... hits hits hits hits hits hits hits hits 2 1 1 1 1 1 1 1 miss miss miss miss miss miss miss miss no Wash Wash Wash Fit Wash Fit Fit 7 8 9 10 11 12 13 15
  143. 143. +CAT:D..{1, 7, 9, 15 } FIT:S.. {2, 7, 8, 9, 10,12} WASH:D..{2, 7, 11,13,15} ... hits 2 miss Cat 1 2 ... hits hits hits hits hits hits hits hits 3 1 1 1 1 2 2 1 miss miss miss miss miss miss miss miss Wash Wash Fit Wash Wash Fit Cat Fit Cat Cat Cat Cat 7 8 9 10 11 12 13 15
  144. 144. hits 2 miss Cat 1 2 ... hits hits hits hits hits hits hits hits 3 1 1 1 1 2 2 1 miss miss miss miss miss miss miss miss Fit no Wash Wash Wash Cat Wash Fit Cat Fit Cat Cat Cat 7 8 9 10 11 12 13 15
  145. 145. TopDocsCollector hits 3 miss ... 1 2 no 7 hits 2 miss Fit hits 2 miss Wash 8 9 10 11 12 13 15
  146. 146. TopDocsCollector hits 3 miss ... 1 2 no 7 hits 2 miss Fit hits 2 miss Wash 8 9 10 11 12 13 15
  147. 147. TopDocsCollector hits 3 miss ... 1 2 no 7 hits 2 miss Fit hits 2 miss Wash 8 9 10 11 12 13 15
  148. 148. Collector DocSetCollector TopDocsCollector TopFieldCollector TopScoreDocsCollector
  149. 149. DocSet or DocList? long [952045] = { 0, 0, 0, 0, 2050, 0, 0, 8, 0, 0, 0,... } int [2079] = {4, 12, 45, 67, 103, 673, 5890, 34103,...} int [100] = {8947, 7498,1, 230, 2356, 9812, 167,....}
  150. 150. DocList/ TopDoc DocSet Size k (numHits or rows) N (maxDocs) Ordered by score or field docID allows* almost could allow (No) Out-of-order collecting
  151. 151. ?×4 6×4 9×5 2×4 2×7 7×9 1×9
  152. 152. http://www.flickr.com/photos/jbagley/4303976811/sizes/o/
  153. 153. class OutOfOrderTopScoreDocCollector boolean acceptsDocsOutOfOrder(){ return true; } .. void collect(int doc) { float score = scorer.score(); ... if (score == pqTop.score && doc > pqTop.doc) { ... }
  154. 154. UML http://www.flickr.com/photos/kristykay/2922670979/lightbox/