This talk is about designing a Solr schema for different use cases in the e-commerce world. It shares our experience with the Solr schema and how it evolved while we built a multi-tenant search platform.
3. Solr Schema
• Text Search
Dump everything to text
Store structured fields if needed
• E-commerce
Multiple applications per field
Facets, n-gram search, natural text
4. Schema in e-commerce world
• Brand
One Base field – Facet
Multiple Copy Fields – text, n-gram, hierarchical paths, etc.
Field Name    Stored  Indexed  Type
Brand         True    True     Facet (=String)
Brand_text    False   True     text
Brand_ngram   False   True     ngram
• Facet = String = Stored
• Store a copy of every field for debugging
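The base-field-plus-copy-fields pattern for Brand can be sketched as a small generator of schema entries. This is a minimal illustration, not the talk's actual tooling: the helper names and the `_text` / `_ngram` suffix list are assumptions modelled on the table above, and the field types `string`, `text` and `ngram` are assumed to be defined elsewhere in schema.xml.

```python
import xml.etree.ElementTree as ET

# Assumption: (suffix, field type) pairs mirror the Brand table above.
COPY_VARIANTS = [("text", "text"), ("ngram", "ngram")]


def field_definitions(base):
    """Build <field> and <copyField> elements for one base facet field."""
    # The base field is the facet: a stored, indexed string.
    elements = [
        ET.Element("field", name=base, type="string",
                   indexed="true", stored="true")
    ]
    for suffix, field_type in COPY_VARIANTS:
        dest = f"{base}_{suffix}"
        # Copies are indexed for search but not stored.
        elements.append(
            ET.Element("field", name=dest, type=field_type,
                       indexed="true", stored="false")
        )
        elements.append(ET.Element("copyField", source=base, dest=dest))
    return elements


def render(base):
    """Serialize the generated elements as XML strings."""
    return [ET.tostring(e, encoding="unicode") for e in field_definitions(base)]


for line in render("Brand"):
    print(line)
```

Generating the entries from one list keeps every base field's variants consistent, which matters once the same pattern is repeated for color, title and other fields.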
5. Multi Tenant
• More customers = more fields
• Common fields + merchant Fields
Field names
• brand, color, title
• debshops_*, e.g. color_group
Operational overhead for indexer and deployment
6. Common Schema
• One schema.xml to rule them all
• Optimization cost vs. simplicity in the early stages
• Indexer, ranking etc.
• solrcore.properties – solrconfig.xml
7. Customers
• Common schema is not scalable
f_* = MultiValued + String
• Customer requests for Filters
f_oven_type
• Different Use cases and 100+ fields
Range queries = String vs Number, e.g. Range
Sort = single-valued vs multi-valued, e.g. Ratings
Stored ??
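Why a multi-valued string field falls short for range queries and sorting can be shown with plain Python comparisons, which order strings lexicographically much like a string-typed field does. The values and document ids below are made up for illustration and do not come from the talk.

```python
# Illustrative values only (hypothetical item lengths).
lengths_as_strings = ["9", "10", "100", "25"]
lengths_as_numbers = [int(v) for v in lengths_as_strings]


def in_range(values, low, high):
    """Keep values with low <= v <= high, using the values' own ordering."""
    return [v for v in values if low <= v <= high]


# Lexicographic comparison: "100" sorts between "10" and "30".
string_hits = in_range(lengths_as_strings, "10", "30")
# Numeric comparison gives the range a shopper expects.
number_hits = in_range(lengths_as_numbers, 10, 30)

# Sorting on a multi-valued field is ambiguous: which value is the key?
ratings = {"doc_a": [4, 2], "doc_b": [3]}
by_min = sorted(ratings, key=lambda d: min(ratings[d]))
by_max = sorted(ratings, key=lambda d: max(ratings[d]))

print(string_hits)  # ['10', '100', '25']
print(number_hits)  # [10, 25]
print(by_min)       # ['doc_a', 'doc_b']
print(by_max)       # ['doc_b', 'doc_a']
```

The same documents come back in a different order depending on which value is chosen as the sort key, which is why sortable attributes such as ratings need a single-valued field.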
8. Dynamic Fields
• Ranking cannot depend on dynamic fields
• Indexer and Search-Consumers – common lib
DYN_S_S      DYN_S_M    DYN_N_S      DYN_N_M
Sort, facet  facet      Sort, range  Range
dept         oven_type  ratings      item_length
• Dynamic Fields – DYN_<S/N>_<S/M>
• Datatype – String (S), Number(N)
• Single Valued(S), Multi-Valued (M)
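The DYN_<S/N>_<S/M> convention lends itself to a small shared helper of the kind a common indexer and search-consumer library might expose. The sketch below is hypothetical: the function names are invented, and joining the prefix to the attribute name with an underscore is an assumption, since the slides only give the prefix pattern. The capability table is copied from the slide.

```python
# Operations each dynamic field family supports, as listed on the slide.
CAPABILITIES = {
    "DYN_S_S": {"sort", "facet"},
    "DYN_S_M": {"facet"},
    "DYN_N_S": {"sort", "range"},
    "DYN_N_M": {"range"},
}


def dynamic_prefix(is_number, multi_valued):
    """Build DYN_<S/N>_<S/M> from the datatype and multiplicity."""
    datatype = "N" if is_number else "S"
    multiplicity = "M" if multi_valued else "S"
    return f"DYN_{datatype}_{multiplicity}"


def dynamic_field_name(attribute, is_number, multi_valued):
    """Assumption: the field name is prefix + '_' + attribute name."""
    return f"{dynamic_prefix(is_number, multi_valued)}_{attribute}"


def supports(field_name, operation):
    """Check whether a dynamic field's family allows an operation."""
    prefix = field_name[:7]  # every DYN_x_y prefix is 7 characters long
    return operation in CAPABILITIES.get(prefix, set())


print(dynamic_field_name("ratings", True, False))   # DYN_N_S_ratings
print(dynamic_field_name("oven_type", False, True)) # DYN_S_M_oven_type
print(supports("DYN_S_M_oven_type", "sort"))        # False
```

Keeping the naming and capability rules in one place means the indexer and the search consumers cannot disagree about which field an attribute lives in.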