
Entity extraction for product search

1,651 views

This talk was given during Activate Conference 2018.
A user looking for “awesome smartphone 2018” is most likely after “+review:awesome +category:smartphone +release_date:2018”. A clever use of (e)dismax might get us pretty close, but it’s not real query understanding. There are other ways, of course, such as training a model that guesses, based on the keyword, which field that keyword refers to. In this session, we’ll discuss some of these approaches, their pros and cons, and how you’d implement them on top of Solr. We’ll look specifically at existing open-source tools that you can reuse to build such a system.
Learn more on https://sematext.com/.
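
To make the (e)dismax limitation concrete, here is a minimal Python sketch of how a query gets spread across fields. It is a simplified, flat illustration only, not the output of Solr's actual parser; the function name `expand_edismax` and the field boosts are assumptions chosen to match the example above.

```python
def expand_edismax(query, qf):
    """Spread every query term over every qf field, carrying the field boost.

    Simplified illustration: real edismax builds per-term disjunction-max
    queries, but the effect is the same in spirit: each term is tried
    against each field, with no notion of which field the user meant.
    """
    clauses = []
    for field, boost in qf.items():
        for term in query.split():
            clause = f"{field}:{term}"
            if boost != 1:
                clause += f"^{boost}"  # Lucene boost syntax
            clauses.append(clause)
    return " ".join(clauses)


# Hypothetical fields and boosts, taken from the example query above.
qf = {"review": 100, "category": 10, "release_date": 1}
print(expand_edismax("awesome smartphone 2018", qf))
```

The expansion contains clauses such as `review:2018^100` and `release_date:awesome`, which is exactly the noise that entity extraction tries to avoid by deciding up front that “2018” belongs to `release_date`.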

Published in: Engineering

  1. sony headphones mdr1000x title^100 cat^10 model /select defType=edismax title:sony^100 title:headphones^100 title:mdr1000x^100 cat:sony^10 cat:headphones^10 cat:mdr1000x^10 model:sony model:headphones model:mdr1000x
  2. index add_field_type: name: tag postingsFormat: FST50 add_field: name: name_tag type: tag {"name_tag":"Thomas Jefferson"} Thomas Jefferson (April 13, [O.S. April 2] 1743 – July 4, 1826) was an American Founding Father who was the principal author of the Declaration of Independence ... thomas jefferson startOffset: 100 endOffset: 116 startOffset: 215 endOffset: 231 /tagger
  3. Let's dive in! https://github.com/sematext/activate
  4. doc index query
  5. Let's dive in!
  6. {"Sony headphones", {'entities': [(0,4,'Manufacturer')]}} {"Beats headphones", {'entities': [(0,5,'Manufacturer')]}} Text Label Gradient Model Label predict
  7. trained model spaCy load sony headphones mdr1000x manufacturer: sony q=manufacturer:sony AND (headphones mdr1000x)
  8. Let's dive in!
  9. Let's dive in!
  10. ⇒ MDR1000X is nice WH1000X is nice similar
  11. Let's dive in!
  12. expr=significantTerms sony 0,7981 philips 0,6534 beats 0,5342 features recalculate scores q=...&rq={!ltr model=manufacturerTrainingModel reRankDocs=20} features model = + weights Model store
  13. Let's dive in!
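
The query rewrite on the spaCy slide (from “sony headphones mdr1000x” to `q=manufacturer:sony AND (headphones mdr1000x)`) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the speakers' implementation: the function name `rewrite_query` is hypothetical, entity spans are passed in by hand in the same `(start, end, label)` form as the training examples, and the Solr field name is assumed to be the lowercased entity label.

```python
def rewrite_query(text, entities):
    """Turn extracted entity spans into field clauses; keep the rest as free text.

    `entities` is a list of (start, end, label) character spans,
    e.g. (0, 4, "Manufacturer"). Special-character escaping is not handled.
    """
    filters = []
    remainder = []
    cursor = 0
    for start, end, label in sorted(entities):
        remainder.append(text[cursor:start])
        value = text[start:end].lower()
        if " " in value:
            value = f'"{value}"'  # phrase-quote multi-word entities
        # Assumption: the field name is the lowercased entity label.
        filters.append(f"{label.lower()}:{value}")
        cursor = end
    remainder.append(text[cursor:])
    # Collapse the leftover pieces into a single whitespace-normalized string.
    free_text = " ".join(" ".join(remainder).split())
    if not filters:
        return free_text
    if not free_text:
        return " AND ".join(filters)
    return " AND ".join(filters) + f" AND ({free_text})"


print(rewrite_query("sony headphones mdr1000x", [(0, 4, "Manufacturer")]))
```

With a trained spaCy model, the spans would presumably come from `doc.ents` (each entity exposes `start_char`, `end_char` and `label_`) rather than being supplied by hand.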
