October 13-16, 2016 • Austin, TX
Faceting optimizations for Solr
Toke Eskildsen
Search Engineer / Solr Hacker
State and University Library, Denmark
@TokeEskildsen / te@statsbiblioteket.dk
3/55
Overview
l  Web scale at the State and University Library,
Denmark
l  Field faceting 101
l  Optimizations
-  Reuse
-  Tracking
-  Caching
-  Alternative counters
4/55
Web scale for a small web
l  Denmark
-  Consolidation circa 10th century
-  5.6 million people
l  Danish Net Archive (http://netarkivet.dk)
-  Constitution 2005
-  20 billion items / 590TB+ raw data
5/55
Indexing 20 billion web items / 590TB into Solr
l  Solr index size is 1/9th of real data = 70TB
l  Each shard holds 200M documents / 900GB
-  Shards build chronologically by dedicated machine
-  Projected 80 shards
-  Current build time per shard: 4 days
-  Total build time is 20 CPU-core years
-  So far only 7.4 billion documents / 27TB in index
6/55
Searching a 7.4 billion document / 27TB Solr index
l  SolrCloud with 2 machines, each having
-  16 HT-cores, 256GB RAM, 25 * 930GB SSD
-  25 shards @ 900GB
-  1 Solr/shard/SSD, Xmx=8g, Solr 4.10
-  Disk cache 100GB or < 1% of index size
7/55
8/55
String faceting 101 (single shard)
counter = new int[ordinals]
for docID: result.getDocIDs()
  for ordinal: getOrdinals(docID)
    counter[ordinal]++
for ordinal = 0 ; ordinal < counter.length ; ordinal++
  priorityQueue.add(ordinal, counter[ordinal])
for entry: priorityQueue
  result.add(resolveTerm(entry.ordinal), entry.count)
ord term counter
0 A 0
1 B 3
2 C 0
3 D 1006
4 E 1
5 F 1
6 G 0
7 H 0
8 I 3
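
Spelled out as runnable code, the loop above looks roughly like this. A minimal, self-contained Java sketch (not Solr's actual classes): ordinalsPerDoc stands in for getOrdinals(docID), the terms array stands in for resolveTerm(), and a bounded PriorityQueue keeps only the top-X entries instead of all of them.

    import java.util.*;

    // Minimal sketch of the "faceting 101" loop above. ordinalsPerDoc stands in for
    // getOrdinals(docID) (in Solr the ordinals come from DocValues) and terms stands
    // in for the ordinal-to-term lookup.
    public class Facet101 {
        static List<Map.Entry<String, Integer>> facet(
                int[] resultDocIDs, int[][] ordinalsPerDoc, String[] terms, int topX) {
            int[] counter = new int[terms.length];               // one slot per ordinal
            for (int docID : resultDocIDs) {
                for (int ordinal : ordinalsPerDoc[docID]) {
                    counter[ordinal]++;
                }
            }
            // Keep the top-X (ordinal, count) pairs: smallest count at the head,
            // evicted whenever a larger count arrives.
            PriorityQueue<int[]> pq =
                new PriorityQueue<>(Comparator.comparingInt((int[] e) -> e[1]));
            for (int ordinal = 0; ordinal < counter.length; ordinal++) {
                pq.add(new int[]{ordinal, counter[ordinal]});
                if (pq.size() > topX) {
                    pq.poll();
                }
            }
            List<Map.Entry<String, Integer>> result = new ArrayList<>();
            for (int[] entry : pq) {
                result.add(new AbstractMap.SimpleEntry<>(terms[entry[0]], entry[1]));
            }
            result.sort((a, b) -> b.getValue() - a.getValue());  // highest count first
            return result;
        }

        public static void main(String[] args) {
            String[] terms = {"A", "B", "C", "D", "E", "F", "G", "H", "I"};
            int[][] ordinalsPerDoc = {{3}, {1, 3}, {3, 8}, {1, 4}};   // toy data
            System.out.println(facet(new int[]{0, 1, 2, 3}, ordinalsPerDoc, terms, 3));
        }
    }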
9/55
Test setup 1 (easy start)
l  Solr setup
-  16 HT-cores, 256GB RAM, SSD
-  Single shard 250M documents / 900GB
l  URL field
-  Single String value
-  200M unique terms
l  3 concurrent “users”
l  Random search terms
10/55
Vanilla Solr, single shard, 250M documents, 200M values, 3 users
11/55
Allocating and dereferencing 800MB arrays
12/55
Reuse the counter
counter = new int[ordinals]
for docID: result.getDocIDs()
  for ordinal: getOrdinals(docID)
    counter[ordinal]++
for ordinal = 0 ; ordinal < counter.length ; ordinal++
  priorityQueue.add(ordinal, counter[ordinal])
<counter is no longer referenced and will be garbage collected at some point>
13/55
Reuse the counter
counter = pool.getCounter()
for docID: result.getDocIDs()
  for ordinal: getOrdinals(docID)
    counter[ordinal]++
for ordinal = 0 ; ordinal < counter.length ; ordinal++
  priorityQueue.add(ordinal, counter[ordinal])
pool.release(counter)
Note: The JSON Facet API in Solr 5 already supports reuse of counters
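
A minimal sketch of what such a pool could look like (illustrative only; this is neither Solr's nor the JSON Facet API's implementation): getCounter() hands out a ready-to-use int[], and release() clears it on a background thread and re-pools it, so the allocation and zeroing cost is paid off the query path.

    import java.util.concurrent.*;

    // Illustrative counter pool: getCounter() hands out a cleared int[],
    // release() zeroes it on a background thread and puts it back in the pool.
    public class CounterPool {
        private final int ordinals;
        private final BlockingQueue<int[]> ready;
        private final ExecutorService cleaner = Executors.newSingleThreadExecutor();

        public CounterPool(int ordinals, int poolSize) {
            this.ordinals = ordinals;
            this.ready = new ArrayBlockingQueue<>(poolSize);
        }

        public int[] getCounter() {
            int[] counter = ready.poll();                 // reuse one if available
            return counter != null ? counter : new int[ordinals];
        }

        public void release(int[] counter) {
            cleaner.submit(() -> {
                java.util.Arrays.fill(counter, 0);        // clear off the query path
                ready.offer(counter);                     // drop it if the pool is full
            });
        }
    }

With 200M ordinals this keeps the 800MB allocation and the resulting garbage collection out of the per-request cost; the price is that the pooled arrays stay resident on the heap.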
14/55
Using and clearing 800MB arrays
15/55
Reusing counters vs. not doing so
16/55
Reusing counters, now with readable visualization
17/55
Reusing counters, now with readable visualization
Why does it always take more than 500ms?
18/55
Iteration is not free
counter = pool.getCounter()
for docID: result.getDocIDs()
  for ordinal: getOrdinals(docID)
    counter[ordinal]++
for ordinal = 0 ; ordinal < counter.length ; ordinal++
  priorityQueue.add(ordinal, counter[ordinal])
pool.release(counter)

200M unique terms × 4 bytes/int = 800MB to clear and iterate on every call
19/55
Tracking updated counters

ord:     0    1    2    3    4    5    6    7    8
counter: 0    0    0    0    0    0    0    0    0

tracker: N/A  N/A  N/A  N/A  N/A  N/A  N/A  N/A  N/A
20/55
Tracking updated counters

counter[3]++

ord:     0    1    2    3    4    5    6    7    8
counter: 0    0    0    1    0    0    0    0    0

tracker: 3    N/A  N/A  N/A  N/A  N/A  N/A  N/A  N/A
21/55
Tracking updated counters

counter[3]++  counter[1]++

ord:     0    1    2    3    4    5    6    7    8
counter: 0    1    0    1    0    0    0    0    0

tracker: 3    1    N/A  N/A  N/A  N/A  N/A  N/A  N/A
22/55
Tracking updated counters

counter[3]++  counter[1]++  counter[1]++  counter[1]++

ord:     0    1    2    3    4    5    6    7    8
counter: 0    3    0    1    0    0    0    0    0

tracker: 3    1    N/A  N/A  N/A  N/A  N/A  N/A  N/A
23/55
Tracking updated counters

counter[3]++  counter[1]++  counter[1]++  counter[1]++  counter[8]++  counter[8]++
counter[4]++  counter[8]++  counter[5]++  counter[1]++  counter[1]++  …  counter[1]++

ord:     0    1    2    3     4    5    6    7    8
counter: 0    3    0    1006  1    1    0    0    3

tracker: 3    1    8    4    5    N/A  N/A  N/A  N/A
24/55
Tracking updated counters
counter = pool.getCounter()
for docID: result.getDocIDs()
  for ordinal: getOrdinals(docID)
    if counter[ordinal]++ == 0 && tracked < maxTracked
      tracker[tracked++] = ordinal
if tracked < maxTracked
  for i = 0 ; i < tracked ; i++
    priorityQueue.add(tracker[i], counter[tracker[i]])
else
  for ordinal = 0 ; ordinal < counter.length ; ordinal++
    priorityQueue.add(ordinal, counter[ordinal])

ord:     0    1    2    3     4    5    6    7    8
counter: 0    3    0    1006  1    1    0    0    3

tracker: 3    1    8    4    5    N/A  N/A  N/A  N/A
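
The same idea as a self-contained Java sketch (class and method names are illustrative, not Solr's): increment() remembers first-touched ordinals, and result extraction only falls back to the full sweep over all ordinals when the tracker overflowed.

    // Illustrative tracked counter: ordinals touched for the first time are recorded
    // in tracker[], so extraction can skip the full sweep when few ordinals were hit.
    public class TrackedCounter {
        private final int[] counter;
        private final int[] tracker;
        private int tracked = 0;

        public TrackedCounter(int ordinals, int maxTracked) {
            this.counter = new int[ordinals];
            this.tracker = new int[maxTracked];
        }

        public void increment(int ordinal) {
            if (counter[ordinal]++ == 0 && tracked < tracker.length) {
                tracker[tracked++] = ordinal;   // first touch: remember the ordinal
            }
        }

        /** Visits every updated (ordinal, count) pair; falls back to a full sweep
         *  when more ordinals were updated than the tracker could hold. */
        public void forEachUpdated(java.util.function.BiConsumer<Integer, Integer> visitor) {
            if (tracked < tracker.length) {
                for (int i = 0; i < tracked; i++) {
                    visitor.accept(tracker[i], counter[tracker[i]]);
                }
            } else {
                for (int ordinal = 0; ordinal < counter.length; ordinal++) {
                    if (counter[ordinal] > 0) {
                        visitor.accept(ordinal, counter[ordinal]);
                    }
                }
            }
        }
    }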
25/55
Tracking updated counters
26/55
Distributed faceting
Phase 1) All shards perform faceting.
         The Merger calculates the top-X terms.
Phase 2) The term counts are requested from the shards that did not return them in phase 1.
         The Merger calculates the final counts for the top-X terms.

for term: fineCountRequest.getTerms()
  result.add(term,
    searcher.numDocs(query(field:term), base.getDocIDs()))
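
What the phase-2 fine count boils down to, sketched against SolrIndexSearcher.numDocs(Query, DocSet) (that method exists in Solr; the surrounding class and variable names here are illustrative): one postings-versus-result intersection per requested term.

    import java.io.IOException;
    import java.util.*;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.TermQuery;
    import org.apache.solr.search.DocSet;
    import org.apache.solr.search.SolrIndexSearcher;

    // Rough sketch of the phase-2 fine count: each requested term is intersected
    // with the base result, one term at a time.
    public class FineCount {
        static Map<String, Integer> fineCount(SolrIndexSearcher searcher, DocSet baseDocs,
                                              String field, List<String> terms) throws IOException {
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String term : terms) {
                counts.put(term,
                           searcher.numDocs(new TermQuery(new Term(field, term)), baseDocs));
            }
            return counts;
        }
    }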
27/55
Test setup 2 (more shards, smaller field)
l  Solr setup
-  16 HT-cores, 256GB RAM, SSD
-  9 shards @ 250M documents / 900GB
l  domain field
-  Single String value
-  1.1M unique terms per shard
l  1 concurrent “user”
l  Random search terms
28/55
Pit of Pain™ (or maybe “Horrible Hill”?)
29/55
Fine counting can be slow
Phase 1: Standard faceting

Phase 2:
for term: fineCountRequest.getTerms()
  result.add(term,
    searcher.numDocs(query(field:term), base.getDocIDs()))
30/55
Alternative fine counting
counter = pool.getCounter()
for docID: result.getDocIDs()
  for ordinal: getOrdinals(docID)
    counter.increment(ordinal)
for term: fineCountRequest.getTerms()
  result.add(term, counter.get(getOrdinal(term)))

The counting loop is the same as phase 1, which yields:

ord:     0    1    2    3     4    5    6    7    8
counter: 0    3    0    1006  1    1    0    0    3
31/55
Using cached counters from phase 1 in phase 2
counter = pool.getCounter(key)
for term: query.getTerms()
  result.add(term, counter.get(getOrdinal(term)))
pool.release(counter)
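
An illustrative sketch of such a keyed pool (not Solr's API): phase 1 parks its finished counter under a key derived from the query, and the phase-2 fine-count request on the same shard picks it up instead of recounting. The key derivation and the LRU eviction are assumptions made for the sketch.

    import java.util.*;

    // Illustrative keyed counter cache: phase 1 calls put(key, counter) when it is done,
    // phase 2 calls get(key) and only recounts if the entry has been evicted.
    public class KeyedCounterCache {
        private final Map<String, int[]> cache;

        public KeyedCounterCache(final int maxEntries) {
            // Access-ordered LinkedHashMap gives a simple LRU eviction policy.
            this.cache = new LinkedHashMap<String, int[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, int[]> eldest) {
                    return size() > maxEntries;
                }
            };
        }

        public synchronized void put(String key, int[] counter) {
            cache.put(key, counter);
        }

        /** Returns the cached counter for the key, or null if it was never cached or
         *  has already been evicted (in which case phase 2 must count again). */
        public synchronized int[] get(String key) {
            return cache.remove(key);   // single use: ownership goes back to the caller
        }
    }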
32/55
Pit of Pain™ practically eliminated
33/55
Pit of Pain™ practically eliminated
Stick figure CC BY-NC 2.5 Randall Munroe xkcd.com
34/55
Test setup 3 (more shards, more fields)
l  Solr setup
-  16 HT-cores, 256GB RAM, SSD
-  23 shards @ 250M documents / 900GB
l  Faceting on 6 fields
-  url: ~200M unique terms / shard
-  domain & host: ~1M unique terms each / shard
-  type, suffix, year: < 1000 unique terms / shard
35/55
1 machine, 7 billion documents / 23TB total index, 6 facet fields
36/55
High-cardinality can mean different things
Single shard / 250,000,000 docs / 900GB
Field    References      Max docs/term   Unique terms
domain   250,000,000     3,000,000       1,100,000
url      250,000,000     56,000          200,000,000
links    5,800,000,000   5,000,000       610,000,000

2440 MB / counter (plain int[] for the links field)
37/55
Different distributions
[Chart: docs-per-term distributions for domain (1.1M terms), url (200M terms) and links (600M terms): high vs. low maximum count, short vs. very long tail]
38/55
Theoretical lower limit per counter: log2(max_count)
[Diagram: counters sized for different maximum counts: max=1, max=3, max=7, max=63, max=2047]
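
A quick sanity check of the log2 bound (a tiny illustrative helper, not part of Solr): the number of bits needed to hold counts from 0 to max is floor(log2(max)) + 1.

    public class BitsRequired {
        // Bits needed to hold counts from 0 up to max.
        static int bitsRequired(long max) {
            return max == 0 ? 1 : 64 - Long.numberOfLeadingZeros(max);
        }

        public static void main(String[] args) {
            for (long max : new long[]{1, 3, 7, 63, 2047}) {
                System.out.println("max=" + max + " -> " + bitsRequired(max) + " bits");
            }
            // Prints 1, 2, 3, 6 and 11 bits respectively.
        }
    }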
39/55
int vs. PackedInts
Counter structure             domain       url           links
int[ordinals]                 4 MB         780 MB        2350 MB
PackedInts(ordinals, maxBPV)  3 MB (72%)   420 MB (53%)  1760 MB (75%)
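
A sketch of the PackedInts(ordinals, maxBPV) column above, using Lucene's org.apache.lucene.util.packed.PackedInts (the calls used here are from the Lucene 4/5-era API; exact method names may differ in later versions). Each counter is sized to the field's maximum docs-per-term; the cost is that an increment becomes a get plus a set.

    import org.apache.lucene.util.packed.PackedInts;

    // Sketch of a packed counter sized by the field's maximum docs-per-term
    // (maxBPV = maximum bits per value). Smaller heap footprint, slower increments.
    public class PackedCounter {
        private final PackedInts.Mutable counter;

        public PackedCounter(int ordinals, long maxDocsPerTerm) {
            int bitsPerValue = PackedInts.bitsRequired(maxDocsPerTerm);
            this.counter = PackedInts.getMutable(ordinals, bitsPerValue, PackedInts.COMPACT);
        }

        public void increment(int ordinal) {
            counter.set(ordinal, counter.get(ordinal) + 1);   // get + set instead of ++
        }

        public long get(int ordinal) {
            return counter.get(ordinal);
        }

        public void clear() {
            counter.clear();   // zero all values so the counter can be reused
        }
    }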
40/55
n-plane-z counters
[Diagram: platonic ideal vs. harsh reality of splitting counters across bit planes a-d]
41/55
Planes a-d; encoded values so far: L: 0 ≣ 000000
42/55
Planes a-d; encoded values so far: L: 0 ≣ 000000, L: 1 ≣ 000001
43/55
Planes a-d; encoded values so far: L: 0 ≣ 000000, L: 1 ≣ 000001, L: 2 ≣ 000011
44/55
Planes a-d; encoded values so far: L: 0 ≣ 000000, L: 1 ≣ 000001, L: 2 ≣ 000011, L: 3 ≣ 000101
45/55
Planes a-d; encoded values so far: L: 0 ≣ 000000, L: 1 ≣ 000001, L: 2 ≣ 000011, L: 3 ≣ 000101, L: 4 ≣ 000111, L: 5 ≣ 001001, L: 6 ≣ 001011, L: 7 ≣ 001101, ... L: 12 ≣ 010111
46/55
Comparison of counter structures
Counter structure             domain       url           links
int[ordinals]                 4 MB         780 MB        2350 MB
PackedInts(ordinals, maxBPV)  3 MB (72%)   420 MB (53%)  1760 MB (75%)
n-plane-z                     1 MB (30%)   66 MB ( 8%)   311 MB (13%)
47/55
Speed comparison
48/55
I could go on about
l  Threaded counting
l  Heuristic faceting
l  Fine count skipping
l  Counter capping
l  Monotonically increasing tracker for n-plane-z
l  Regexp filtering
49/55
What about huge result sets?
l  Rare for explorative term-based searches
l  Common for batch extractions
l  Threading works poorly as #shards > #CPUs
l  But how bad is it really?
50/55
Really bad! 8 minutes
51/55
Heuristic faceting
l  Use sampling to guess top-X terms
-  Re-use the existing tracked counters
-  1:1000 sampling seems usable for the field links,
which has 5 billion references per shard
l  Fine-count the guessed terms
52/55
Over-provisioning helps validity
53/55
10 seconds < 8 minutes
54/55
Web scale for a small web
l  Denmark
-  Consolidation circa 10th century
-  5.6 million people
l  Danish Net Archive (http://netarkivet.dk)
-  Constitution 2005
-  20 billion items / 590TB+ raw data
55/55
Never enough time, but talk to me about
l  Threaded counting
l  Monotonically increasing tracker for n-plane-z
l  Regexp filtering
l  Fine count skipping
l  Counter capping