OCTOBER 11-14, 2016 • BOSTON, MA
Anyone can build a Recsys w/ Solr!
Doug Turnbull
Relevance Consultant, OpenSource Connections
I’m now available in
book form!
https://www.manning.com/books/relevant-search
Discount code: relsearch (38% off)
http://opensourceconnections.com/about-us/doug-turnbull/
Me The company...
field Body
  term laser
    doc 2
      <metadata>
    doc 4
      <metadata>
  term light
    doc 2
      <metadata>
  term lightsaber
    doc 0
How do search engines work?
The answer can be found in your textbook…
Book Index:
•  Topics -> page no
•  Very efficient tool – compare to
scanning the whole book!
Lucene uses an index:
•  Tokens => document ids:
laser => [2, 4]
light => [2, 5]
lightsaber => [0, 1, 5, 7]
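A minimal sketch of the token => document ids idea, not Lucene's actual data structures. The document texts are hypothetical, chosen only so the postings come out like the ones on the slide.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each token to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return {token: sorted(ids) for token, ids in index.items()}


# Hypothetical documents (assumption): only the resulting postings
# mirror the slide, the texts themselves are made up.
docs = {
    0: "lightsaber",
    1: "lightsaber",
    2: "laser light",
    4: "laser",
    5: "light lightsaber",
    7: "lightsaber",
}

index = build_inverted_index(docs)
for token in ("laser", "light", "lightsaber"):
    print(token, "=>", index[token])
```

Looking up a token is a single dictionary access, which is the "book index" shortcut compared to scanning every document.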
What's the point?
Solr:
-  A general-purpose system for looking up content based on the features that
describe it
Tokens aren't really words!
doc0: "I like the bananas"
Analysis => [I] [lik] [banan]
Search: "liked banana?"
Analysis => [lik] [banan]
term I:
doc 0
term lik:
doc 0
term banan:
doc 0
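A toy sketch of why analysis matters: both the document and the query pass through the same normalization, so "liked banana?" can match "I like the bananas". The stopword list and suffix rules below are assumptions made up for this example. They are not Solr's analyzers or a real stemmer.

```python
import re

STOPWORDS = {"the"}                     # assumption: a tiny stopword list
SUFFIXES = ("ed", "as", "e", "a", "s")  # assumption: toy suffix stripping


def analyze(text):
    """Split text into word tokens, drop stopwords, strip one toy suffix."""
    tokens = []
    for tok in re.findall(r"[A-Za-z]+", text):
        if tok.lower() in STOPWORDS:
            continue
        for suffix in SUFFIXES:
            # Keep at least three characters so short words survive intact.
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        tokens.append(tok)
    return tokens


doc_tokens = analyze("I like the bananas")
query_tokens = analyze("liked banana?")
print(doc_tokens)
print(query_tokens)

# The query matches when every query token appears in the document.
matches = set(query_tokens) <= set(doc_tokens)
print(matches)
```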
TF*IDF -- measuring feature
weight
term I:
doc 0:
freq: 5
doc 1:
freq: 7
doc 3:
freq: 4
term banan:
doc 0:
freq: 2
"Banana-ness" is pretty special
"I-ness" is not special
term I in doc0:
tf==5
df==3
(raw) TF*IDF = tf/df = 5/3 = 1.6667
term banan in doc0:
tf==2
df==1
(raw) TF*IDF = tf/df = 2/1 = 2.0
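The raw arithmetic above can be sketched as follows. This is only the simplified tf/df ratio from the slide. Lucene's real similarities dampen both terms, so actual scores will differ.

```python
# Postings from the slide: term -> {doc id: term frequency}.
postings = {
    "I": {0: 5, 1: 7, 3: 4},
    "banan": {0: 2},
}


def raw_tf_idf(term, doc_id):
    """Raw weight: term frequency in the doc divided by document frequency."""
    tf = postings[term][doc_id]
    df = len(postings[term])  # number of docs containing the term
    return tf / df


i_weight = raw_tf_idf("I", 0)
banan_weight = raw_tf_idf("banan", 0)
print(round(i_weight, 4), banan_weight)
```

The rarer term wins even though it occurs less often in doc0, which is the "banana-ness is special" point.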
Search is really
distributed feature
matching and
similarity
(text-oriented)
Search often stands in for human interactions
I have a craving for a nice
juicy cut of meat. What
might you recommend?
I have JUST the thing!
Searching the market
q=(juiciness:juicy meatiness:meaty)
Modeling arbitrary feature
strength
term juicy:
steak:
juiciness: 5
grapefruit:
juiciness: 7
orange:
juiciness: 4
term meaty:
burger:
meatiness: 2
What you want:
{
item: "steak",
juiciness: ["juicy", "juicy", "juicy"],
meatiness: ["meaty"]
}
Use term frequency as feature
strength:
{
item: "grapefruit",
juiciness: ["juicy", "juicy", "juicy", "juicy", "juicy"],
meatiness: [""]
}
(remember,
Solr doesn't
need to store
this)
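A small sketch of the trick above: repeat a token N times so that term frequency carries the feature strength. The helper names are hypothetical, and this assumes the target field is configured to keep term frequencies.

```python
def encode_strength(token, strength):
    """Repeat a token so its term frequency equals the feature strength."""
    return [token] * strength


def to_feature_doc(item, features):
    """Build a document where each field holds a repeated feature token.

    `features` maps field name -> (token, strength). Hypothetical helper.
    """
    doc = {"item": item}
    for field, (token, strength) in features.items():
        doc[field] = encode_strength(token, strength)
    return doc


steak = to_feature_doc(
    "steak",
    {"juiciness": ("juicy", 3), "meatiness": ("meaty", 1)},
)
print(steak)
```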
TF*IDF -- measuring feature
weight
term juicy:
doc 0:
freq: 5
doc 1:
freq: 7
doc 3:
freq: 4
term meaty:
doc 0:
freq: 2
"meaty-ness" is pretty special
"juicy-ness" is pretty non-special
term juicy in doc0:
tf==5
df==3
(raw) TF*IDF = tf/df = 5/3 = 1.6667
term meaty in doc0:
tf==2
df==1
(raw) TF*IDF = tf/df = 2/1 = 2.0
Search is really
distributed feature
matching and
similarity
Requesting something from my grocer
More juicy Less juicy
More meaty Less meaty
q=meaty juicy
Results: 1.
2.
3.
Recsys also stands in for human interactions
Hi Jane,
Recommend me
something?
Hmm…
<Tom likes limes, what is
similar to limes?>
recommendations
Use existing properties
of thing to recommend
similar things
juicy
citrus
More like this for
unstructured data
What features/tokens are
most representative of this
thing?
http://solr.quepid.com/solr/tmdb/select?q={!mlt%20qf=overview}97&fl=title,id,overview (movies like
juicy
citrus
(search)
Here's some ideas...
{
item: "lime",
juiciness: ["juicy", "juicy", "juicy"],
citrusness: ["citrus", "citrus", "citrus"],
meatiness: [""],
partyness: ["party"]
}
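One way to picture content-based "more like this" is to compare items by their feature tokens. The sketch below uses plain cosine similarity over token counts as a stand-in. It is not how Solr's MLT parser works internally. The lime, grapefruit and steak documents are the ones shown on the slides.

```python
from collections import Counter
from math import sqrt


def feature_bag(doc):
    """Count every non-empty feature token in a document, ignoring the item name."""
    bag = Counter()
    for field, tokens in doc.items():
        if field == "item":
            continue
        bag.update(t for t in tokens if t)
    return bag


def cosine(a, b):
    """Cosine similarity between two token-count bags."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


lime = {"item": "lime", "juiciness": ["juicy"] * 3, "citrusness": ["citrus"] * 3,
        "meatiness": [""], "partyness": ["party"]}
candidates = [
    {"item": "grapefruit", "juiciness": ["juicy"] * 5, "meatiness": [""]},
    {"item": "steak", "juiciness": ["juicy"] * 3, "meatiness": ["meaty"]},
]

target = feature_bag(lime)
ranked = sorted(candidates, key=lambda d: cosine(target, feature_bag(d)), reverse=True)
ranked_items = [d["item"] for d in ranked]
print(ranked_items)
```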
"Content Based" more-like-these
Use existing properties
of thing to recommend
similar things
juicy
meaty
citrus
http://solr.quepid.com/solr/tmdb/select?q={!mlt%20qf=overview}97&fl=title,id,overview (movies like
Here's some ideas...
Jane knows a few more things that Tom likes...
Personalization metadata
Index extra data alongside your
products
{
item: "hamburger",
preferred_by_genders: ["m", …],
preferred_by_ages: ["30_40"]
}
age:30_40
gender:m
http://solr.quepid.com/solr/tmdb/select?q={!mlt%20qf=overview}97&fl=title,id,overview (movies like
Here's some ideas...
Jane knows a few things about Tom
(30 yr old male)
But, Jane's intuition transcends
words!
age:30_40
gender:m
Currently we're stuck with predefined labels:
citrus juicy
meaty
We're curating using
known vocabularies
(can we describe everything?)
What we like often transcends words
There are emergent properties of our world that don't have names
Relative flarglewharbliness
More flarglewharbily / Less flarglewharbily
Diet Coke
What's a flarglewharble?
More flarglewharbily / Less flarglewharbily
fruit orange lemon banana mentos diet coke
tom X
sue X X X
charlie X X
clare X X
hal x x
Goes together
Diet Coke
Can search find the flargles?
q=(flarglewharbliness:very)
term flarglewharble:
  diet-coke:
    flargleness: 4
  mentos:
    flargleness: 3
  banana:
    flargleness: 1
Can we somehow build?
Diet Coke
person/food orange lemon banana mentos diet coke
tom X X
sue X X X X
charlie X X
clare X X
hal x x X
Goes together
flarglewharble!
Babies often use made-up words based
on emergent patterns in their universe
They are less committed to our
language
What's the point?
Collaborative filtering
Latent vocabulary
(the flarglewharbles)
Pure Search
Content-based Recs
Predefined vocabulary
Can Solr discover the latent/
emergent vocabularies?
Well first let's tell Solr about our users
{
user: "Sue",
foods_bought: ["lemon", "banana", "mentos", "diet coke"]
}
{
user: "Charlie",
foods_bought: ["banana", "mentos", "diet coke"]
}
Faceting?
We need a way to look across users and look for patterns
(analyze all the baskets that contain mentos)
q=foods_bought:mentos&facet=true&facet.field=foods_bought
facets:
mentos: 3
diet-coke: 3
banana: 2
Hmm:
-  Bananas are globally popular
-  Diet-coke is probably what matters
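The facet query above amounts to "count every food in the baskets that contain mentos". A plain-Python sketch of that counting follows. Sue's and Charlie's baskets come from the slide. Tom's, Clare's and Hal's baskets are assumptions, filled in so the counts line up with the facet output shown.

```python
from collections import Counter

baskets = {
    "tom": ["orange", "banana", "lemon"],               # assumption
    "sue": ["lemon", "banana", "mentos", "diet coke"],  # from the slide
    "charlie": ["banana", "mentos", "diet coke"],       # from the slide
    "clare": ["banana", "lemon"],                       # assumption
    "hal": ["mentos", "diet coke"],                     # assumption
}


def facet_counts(baskets, contains):
    """Count foods across only the baskets that contain the given food."""
    counts = Counter()
    for foods in baskets.values():
        if contains in foods:
            counts.update(set(foods))
    return counts


counts = facet_counts(baskets, "mentos")
print(counts.most_common())
```

Raw counts rank banana fairly high simply because bananas show up in most baskets.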
Counts don't work: importance of
significance
q=foods_bought:mentos&facet=true&facet.field=foods_bought
facets:
mentos: 3
diet-coke: 3
banana: 2
Diet Coke:
Global popularity: diet coke (3)
Local popularity: 3
Score: 3/3 = 1
Banana:
Global popularity: banana
(4)
Local popularity: 2
Score: 2/4 = 0.5
by-significance:
diet-coke: 1
banana: 0.5
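The local/global ratio can be sketched in a few lines. As before, only Sue's and Charlie's baskets are from the slides. The other baskets are assumptions chosen to reproduce the global counts quoted above (banana 4, diet coke 3).

```python
from collections import Counter

baskets = {
    "tom": ["orange", "banana", "lemon"],               # assumption
    "sue": ["lemon", "banana", "mentos", "diet coke"],  # from the slide
    "charlie": ["banana", "mentos", "diet coke"],       # from the slide
    "clare": ["banana", "lemon"],                       # assumption
    "hal": ["mentos", "diet coke"],                     # assumption
}


def significance(baskets, seed):
    """Score each co-occurring food as local popularity / global popularity."""
    global_counts = Counter(f for foods in baskets.values() for f in set(foods))
    local_counts = Counter(
        f for foods in baskets.values() if seed in foods for f in set(foods)
    )
    return {
        food: local_counts[food] / global_counts[food]
        for food in local_counts
        if food != seed
    }


scores = significance(baskets, "mentos")
for food, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(food, round(score, 3))
```

Dividing by global popularity pushes diet coke above banana, even though both appear often alongside mentos.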
Streaming Expressions
/select?q=*:*&facet=true&facet.field=liked_movies
But there's a new sheriff in town!
One option: gather the global doc freqs ourselves and compare them with the
local counts.
The Terms component is another option… or plugins...
Streaming expressions -- distributed stream
computation system on top of Solr Cloud
You must ALWAYS cross the streams!
Streaming Expressions
/stream?expr=scoreNodes(facet(...)...)
facet(movielens,
q="*:*",
buckets="liked_movies",
bucketSorts="count(*) desc",
bucketSizeLimit="100",
count(*))
Faceting with Streaming Expressions:
Output:
{
"result-set":
{"docs":[
{
"count(*)":55807,
"liked_movies":"318"},
{
"count(*)":52352,
"liked_movies":"296"},
{
"count(*)":50114,
"liked_movies":"593"}
Nodes to be transformed
Significance with streaming expr
/stream?expr=scoreNodes(facet(...)...)
scoreNodes(
select(
facet(movielens,
q="liked_movies:2571 OR liked_movies:4993",
buckets="liked_movies",
bucketSorts="count(*) desc",
bucketSizeLimit="100",
count(*)),
liked_movies as node,
"count(*)",
replace(collection, null, withValue=movielens),
replace(field, null, withValue=liked_movies))
)
1.  facet (just like above, just with streaming expr)
2.  select to format data for scoreNodes
3.  scoreNodes to score using TF*IDF
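The three steps above can be assembled into a request URL from client code. The sketch below only builds the expression string and the /stream URL. It does not contact Solr. The base URL `http://localhost:8983/solr` and the helper names are assumptions. The expression itself mirrors the one on the slide.

```python
from urllib.parse import urlencode


def score_nodes_expr(collection, field, query, limit=100):
    """Build the scoreNodes(select(facet(...))) expression from the slide."""
    return (
        f'scoreNodes(select(facet({collection}, q="{query}", '
        f'buckets="{field}", bucketSorts="count(*) desc", '
        f'bucketSizeLimit="{limit}", count(*)), '
        f'{field} as node, "count(*)", '
        f'replace(collection, null, withValue={collection}), '
        f'replace(field, null, withValue={field})))'
    )


def stream_url(base_url, collection, expr):
    """URL-encode the expression as the expr parameter of the /stream handler."""
    return f"{base_url}/{collection}/stream?" + urlencode({"expr": expr})


expr = score_nodes_expr(
    "movielens", "liked_movies", "liked_movies:2571 OR liked_movies:4993"
)
# Assumption: a local Solr on the default port.
url = stream_url("http://localhost:8983/solr", "movielens", expr)
print(url)
```

Swapping the `query` argument (for example to a preference field or a date range) changes which users form the foreground set, while the rest of the expression stays the same.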
Banana occurs in 2 documents here, 4 globally --
2/4 = 0.5
Diet coke occurs in 2 documents here, 2 globally --
2/2 = 1.0
Thinking back on my
shoppers' behaviors, here are
some other items you might
like:
(thanks Joel Bernstein!)
Diet Coke
Lots of power here
/stream?expr=scoreNodes(facet(...)...)
scoreNodes(
select(
facet(movielens,
q="juiciness_pref:juicy",
buckets="liked_movies",
bucketSorts="count(*) desc",
bucketSizeLimit="100",
count(*)),
liked_movies as node,
"count(*)",
replace(collection, null, withValue=movielens),
replace(field, null, withValue=liked_movies))
)
Find users that like juicy things, what do they like?
Perhaps bucket over the aisle they like?
Construct our query to focus on a date range?
Many insights
(thanks Joel Bernstein!)
Only glimpsing the underlying
pattern...
We're not enumerating the flarglewharbles, and the schlumblefumbles
More flarglewharbily / Less flarglewharbily
Diet Coke
More schlumblewumbly / Less schlumblewumbly
Diet Coke
Coming soon (Solr 6.3)
http://yonik.com/solr-6-3/
https://issues.apache.org/jira/browse/SOLR-9258
-  Models for training classifiers
-  Then in turn updating documents
Progress is being made!
-  Clustering?
Questions?
The Flarglewharbles

Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise Search
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond
 

Anyone Can Build A Recommendation Engine With Solr: Presented by Doug Turnbull, OpenSource Connections

  • 1. OCTOBER 11-14, 2016 • BOSTON, MA
  • 2. Anyone can build a Recsys w/ Solr! Doug Turnbull, Relevance Consultant, OpenSource Connections
  • 3. I’m now available in book form! https://www.manning.com/books/relevant-search Discount code: relsearch (38% off) http://opensourceconnections.com/about-us/doug-turnbull/ Me The company...
  • 4. field Body: term laser: doc 2 <metadata>, doc 4 <metadata>; term light: doc 2 <metadata>; term lightsaber: doc 0. How do search engines work? The answer can be found in your textbook… Book Index: • Topics -> page no • Very efficient tool - compare to scanning the whole book! Lucene uses an index: • Tokens => document ids: laser => [2, 4] light => [2, 5] lightsaber => [0, 1, 5, 7]
  • 5. What's the point? Solr: a general-purpose system for looking up content based on the features that describe it. Tokens aren't really words! doc0: "I like the bananas" -> Analysis -> [I] [lik] [banan], indexed as term I: doc 0, term lik: doc 0, term banan: doc 0. Search: "liked banana?" -> Analysis -> [lik] [banan]
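The index lookup and analysis step on slides 4 and 5 can be sketched in a few lines of Python. This is a toy under stated assumptions: the `analyze` function, its stopword set, and its suffix list are invented for illustration and are not Lucene's analysis chain. It also lowercases, so the slide's `I` token becomes `i`.

```python
from collections import defaultdict

# Toy analyzer: lowercase, drop one stopword, strip a few suffixes.
# A stand-in for a real analysis chain, chosen only to reproduce the slide.
STOPWORDS = {"the"}
SUFFIXES = ("as", "ed", "e", "s", "a")


def analyze(text):
    """Turn raw text into index tokens (not necessarily real words)."""
    tokens = []
    for raw in text.lower().split():
        word = "".join(ch for ch in raw if ch.isalnum())
        if not word or word in STOPWORDS:
            continue
        for suffix in SUFFIXES:
            # Only strip when a reasonably long stem remains.
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                word = word[: -len(suffix)]
                break
        tokens.append(word)
    return tokens


def build_index(docs):
    """Map each token to the sorted list of doc ids containing it."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for token in sorted(set(analyze(text))):
            index[token].append(doc_id)
    return dict(index)


def search(index, query):
    """Return ids of docs matching any analyzed query token."""
    hits = set()
    for token in analyze(query):
        hits.update(index.get(token, []))
    return sorted(hits)


index = build_index(["I like the bananas"])
print(index)
print(search(index, "liked banana?"))
```

Because the query goes through the same analysis as the document, "liked banana?" matches doc 0 even though neither word appears verbatim in it.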
  • 6. TF*IDF -- measuring feature weight. term I: doc 0: freq: 5, doc 1: freq: 7, doc 3: freq: 4. term banan: doc 0: freq: 2. "Banana-ness" is pretty special; "I-ness" is not special. Term I in doc0: tf==5, df==3, (raw) TF*IDF = 5/3 = 1.6667. Term banan in doc0: tf==2, df==1, (raw) TF*IDF = 2/1 = 2.0. Search is really distributed feature matching and similarity (text-oriented)
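The raw TF*IDF arithmetic on this slide (term frequency divided by document frequency) can be reproduced as below. The `postings` dictionary layout is an assumption made for the sketch. Real Lucene similarities dampen both factors, so treat this as the slide's simplified formula and not as Solr's scoring.

```python
# Postings from the slide: term -> {doc_id: term frequency in that doc}.
postings = {
    "I": {0: 5, 1: 7, 3: 4},
    "banan": {0: 2},
}


def raw_tf_idf(postings, term, doc_id):
    """Slide-style raw score: tf in this doc divided by the term's doc frequency."""
    docs = postings[term]
    tf = docs.get(doc_id, 0)
    df = len(docs)  # number of documents containing the term
    return tf / df


print(round(raw_tf_idf(postings, "I", 0), 4))
print(raw_tf_idf(postings, "banan", 0))
```

The rarer term wins for doc 0 even though "I" occurs more often there, which is the point of the slide.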
  • 7. Search often stands in for human interactions I have a craving for a nice juicy cut of meat. What might you recommend? I have JUST the thing!
  • 9. Modeling arbitrary feature strength. term juicy: steak: juiciness: 5, grapefruit: juiciness: 7, orange: juiciness: 4. term meaty: burger: meatiness: 2. What you want: { item: "steak", juiciness: ["juicy", "juicy", "juicy"], meatiness: ["meaty"] } Use term frequency as feature strength: { item: "grapefruit", juiciness: ["juicy", "juicy", "juicy", "juicy", "juicy"], meatiness: [""] } (remember, Solr doesn't need to store this)
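One way to produce documents like the ones on this slide is to repeat a token once per unit of strength, so that term frequency carries the feature weight. The helper names `feature_tokens` and `to_solr_doc` are hypothetical, and the sketch assumes the target fields accept multiple values.

```python
def feature_tokens(token, strength):
    """Repeat a token so its term frequency equals the feature strength."""
    return [token] * strength


def to_solr_doc(item, features):
    """Build a document dict; `features` maps field -> (token, strength)."""
    doc = {"item": item}
    for field, (token, strength) in features.items():
        doc[field] = feature_tokens(token, strength)
    return doc


steak = to_solr_doc("steak", {"juiciness": ("juicy", 3), "meatiness": ("meaty", 1)})
grapefruit = to_solr_doc("grapefruit", {"juiciness": ("juicy", 5)})
print(steak)
print(grapefruit)
```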
  • 10. TF*IDF -- measuring feature weight. term juicy: doc 0: freq: 5, doc 1: freq: 7, doc 3: freq: 4. term meaty: doc 0: freq: 2. "meaty-ness" is pretty special; "juicy-ness" is pretty non-special. Term juicy in doc0: tf==5, df==3, (raw) TF*IDF = 5/3 = 1.6667. Term meaty in doc0: tf==2, df==1, (raw) TF*IDF = 2/1 = 2.0. Search is really distributed feature matching and similarity
  • 11. Requesting something from my grocer More juicy Less juicy More meaty Less meaty q=meaty juicy Results: 1. 2. 3.
  • 12. Recsys also stands in for human interactions Hi Jane, Recommend me something? Hmm… <Tom likes limes, what is similar to limes?>
  • 13. recommendations Use existing properties of thing to recommend similar things juicy citrus More like this for unstructured data What features/tokens are most representative of this thing? http://solr.quepid.com/solr/tmdb/select?q={!mlt%20qf=overview}97&fl=title,id,overview (movies like juicy citrus (search) Here's some ideas... { item: "lime", juiciness: ["juicy", "juicy", "juicy"], citrusness: ["citrus", "citrus", "citrus"], meatiness: [""], partyness: ["party"] }
  • 14. "Content Based" more-like-these Use existing properties of thing to recommend similar things juicy meaty citrus http://solr.quepid.com/solr/tmdb/select?q={!mlt%20qf=overview}97&fl=title,id,overview (movies like Here's some ideas... Jane knows a few more things that Tom likes...
  • 15. Personalization metadata Index extra data alongside your products { item: "hamburger", preferred_by_genders: ["m", …], preferred_by_ages: ["30_40"] } age:30_40 gender:m http://solr.quepid.com/solr/tmdb/select?q={!mlt%20qf=overview}97&fl=title,id,overview (movies like Here's some ideas... Jane knows a few things about Tom (30 yr old male)
  • 16. But, Jane's intuition transcends words! age:30_40 gender:m Currently we're stuck with predefined labels: citrus juicy meaty We're curating using known vocabularies (can we describe everything?)
  • 17. What we like often transcends words. There are emergent properties of our world that don't have names. Relative flarglewharbliness: More flarglewharbily / Less flarglewharbily. Diet Coke
  • 18. What's a flarglewharble? More flarglewharbily / Less flarglewharbily. fruit orange lemon banana mentos diet coke tom X sue X X X charlie X X clare X X hal x x Goes together Diet Coke
  • 19. Can search find the flargles? q=(flargliwharbliness:very) term flarglewharble: diet-coke: flargleness: 4, mentos: flargleness: 3, banana: flargleness: 1. Can we somehow build? Diet Coke
  • 20. personfood orange lemon banana mentos diet coke tom X X sue X X X X charlie X X clare X X hal x x X Goes together flarglewharble! Babies often use made-up words based on emergent patterns in their universe They are less committed to our language
  • 21. What's the point? Collaborative filtering Latent vocabulary (the flarglewharbles) Pure Search Content-based Recs Predefined vocabulary Can Solr discover the latent/ emergent vocabularies?
  • 22. Can Solr discover the latent/emergent vocabularies? Well, first let's tell Solr about our users: { user: "Sue", foods_bought: ["lemon", "banana", "mentos", "diet coke"] } { user: "Charlie", foods_bought: ["banana", "mentos", "diet coke"] }
  • 23. Faceting? We need a way to look across users and look for patterns (analyze all the baskets that contain mentos) q=foods_bought:mentos&facet=true&facet.field=foods_bought facets: mentos: 3 diet-coke: 3 banana: 2 Hmm: -  Bananas are globally popular -  Diet-coke is probably what matters
  • 24. Counts don't work: importance of significance q=foods_bought:mentos&facet=true&facet.field=foods_bought facets: mentos: 3 diet-coke: 3 banana: 2 Diet Coke: Global popularity: diet coke (3) Local popularity: 3 Score: 3/3 = 1 Banana: Global popularity: banana (4) Local popularity: 2 Score: 2/4 = 0.5 by-significance: diet-coke: 1 banana: 0.5
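The difference between raw facet counts and significance can be checked with a small in-memory sketch. Sue's and Charlie's baskets come from slide 22. The baskets for hal, tom, and clare are hypothetical, picked only so the totals match the counts on slides 23 and 24. The function names are likewise invented for illustration.

```python
from collections import Counter

# Sue and Charlie are from the slides; the other three baskets are
# made up so that the counts agree with the facet output shown above.
baskets = {
    "sue": ["lemon", "banana", "mentos", "diet coke"],
    "charlie": ["banana", "mentos", "diet coke"],
    "hal": ["mentos", "diet coke"],
    "tom": ["orange", "banana"],
    "clare": ["lemon", "banana"],
}


def facet_counts(baskets, must_contain=None):
    """Count baskets per food, optionally only baskets holding `must_contain`."""
    counts = Counter()
    for foods in baskets.values():
        if must_contain is None or must_contain in foods:
            counts.update(set(foods))
    return counts


def significance(baskets, seed):
    """Score each co-occurring food as local count / global count."""
    local = facet_counts(baskets, must_contain=seed)
    overall = facet_counts(baskets)
    scores = {food: local[food] / overall[food] for food in local if food != seed}
    # Highest score first; ties broken alphabetically for stable output.
    return dict(sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])))


print(facet_counts(baskets, must_contain="mentos"))
print(significance(baskets, "mentos"))
```

Banana appears in two of the three mentos baskets, but it also appears in four baskets overall, so its score drops to 0.5 while diet coke stays at 1.0.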
  • 25. Streaming Expressions /select?q=*:*&facet=true&facet.field=liked_movies But there's a new sheriff in town! One option: we could go about and gather global doc freqs & compare those locally. Terms component another option… plugins... Streaming expressions -- distributed stream computation system on top of Solr Cloud You must ALWAYS cross the streams!
  • 26. Streaming Expressions /stream?expr=scoreNodes(facet(...)...) facet(movielens, q="*:*", buckets="liked_movies", bucketSorts="count(*) desc", bucketSizeLimit="100", count(*)) Faceting with Streaming Expressions: Output: { "result-set": {"docs":[ { "count(*)":55807, "liked_movies":"318"}, { "count(*)":52352, "liked_movies":"296"}, { "count(*)":50114, "liked_movies":"593"} Nodes to be transformed
  • 27. Significance with streaming expr /stream?expr=scoreNodes(facet(...)...) scoreNodes( select( facet(movielens, q="liked_movies:2571 OR liked_movies:4993", buckets="liked_movies", bucketSorts="count(*) desc", bucketSizeLimit="100", count(*)), liked_movies as node, "count(*)", replace(collection, null, withValue=movielens), replace(field, null, withValue=liked_movies)) ) 1. facet (just like above, just with streaming expr) 2. select to format data for scoreNodes 3. scoreNodes to score using TF*IDF. Banana occurs in 2 documents here, 4 globally -- 2/4 = 0.5. Diet coke occurs in 2 documents here, 2 globally -- 2/2 = 1.0. Thinking back on my shoppers' behaviors, here are some other items you might like: (thanks Joel Bernstein!) Diet Coke
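The nested expression above is easy to mistype by hand, so a small builder can help. The sketch below only assembles the expression string and a request URL and does not contact Solr. The helper names are hypothetical, and the base URL `http://localhost:8983/solr` is an assumed local install, not something stated in the slides.

```python
from urllib.parse import urlencode


def facet_expr(collection, query, field, limit=100):
    """Step 1: facet over `field` for docs matching `query`."""
    return (
        f'facet({collection}, q="{query}", buckets="{field}", '
        f'bucketSorts="count(*) desc", bucketSizeLimit="{limit}", count(*))'
    )


def score_nodes_expr(collection, query, field):
    """Steps 2 and 3: reshape the facet tuples with select, then scoreNodes."""
    return (
        "scoreNodes(select("
        f"{facet_expr(collection, query, field)}, "
        f'{field} as node, "count(*)", '
        f"replace(collection, null, withValue={collection}), "
        f"replace(field, null, withValue={field})))"
    )


def stream_url(base_url, collection, expr):
    """URL-encode the expression for the collection's /stream handler."""
    return f"{base_url}/{collection}/stream?{urlencode({'expr': expr})}"


expr = score_nodes_expr(
    "movielens", "liked_movies:2571 OR liked_movies:4993", "liked_movies"
)
# Assumed host and port; adjust for a real cluster.
url = stream_url("http://localhost:8983/solr", "movielens", expr)
print(expr)
print(url)
```

Swapping the `query` argument (for example to `juiciness_pref:juicy`) yields the variant on the next slide without retyping the wrapper.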
  • 28. Lots of power here /stream?expr=scoreNodes(facet(...)...) scoreNodes( select( facet(movielens, q="juiciness_pref:juicy", buckets="liked_movies", bucketSorts="count(*) desc", bucketSizeLimit="100", count(*)), liked_movies as node, "count(*)", replace(collection, null, withValue=movielens), replace(field, null, withValue=liked_movies)) ) Find users that like juicy things, what do they like? Perhaps bucket over the aisle they like? Construct our query to focus on a date range? Many insights (thanks Joel Bernstein!)
  • 29. Only glimpsing the underlying pattern... We're not enumerating the flarglewharbles, and the schlumblefumbles. More flarglewharbily / Less flarglewharbily: Diet Coke. More schlumblewumbly / Less schlumblewumbly: Diet Coke
  • 30. Coming soon (Solr 6.3) http://yonik.com/solr-6-3/ https://issues.apache.org/jira/browse/SOLR-9258 -  Models for training classifiers -  Then in turn updating documents Progress is being made! -  Clustering?