SlideShare a Scribd company logo

Balancing the Dimensions of User Intent

The first step in returning relevant search results is successfully interpreting the user’s intent. This requires combining a holistic understanding of your content, your users, and your domain. Traditional keyword search focuses on the content understanding dimension. Knowledge graphs are then typically built and leveraged to represent an understanding of your domain. Finally, Collaborative recommendations and user profile learning are typically the tools of choice for generating and modeling an understanding of the preferences of each user. While these systems (search, recommendations, and knowledge graphs) are often built and used in isolation, combining them together is the key to truly understanding a user’s query intent. For example, combining traditional keyword search with your knowledge graph leads to semantic search capabilities, and combining traditional keyword search with recommendations leads to personalized search experiences. Combining all of these dimensions together in an appropriately balanced way will ultimately lead to the most accurate interpretation of a user’s query, resulting in a better query to the core search engine and ultimately a better, more relevant search experience. In this talk, we’ll demonstrate strategies for delivering and combining each of these dimensions of user intent, and we’ll walk through concrete examples of how to balance the nuances of each so that you also don’t over-personalize, over-contextualize, or under appreciate the nuances of your user’s intent.

1 of 107
Download to read offline
Trey Grainger
Chief Algorithms Officer
Balancing the Dimensions of User Intent
October 28, 2019
Trey Grainger
Chief Algorithms Officer
• Previously: SVP of Engineering @ Lucidworks; Director of Engineering @ CareerBuilder
• Georgia Tech – MBA, Management of Technology
• Furman University – BA, Computer Science, Business, & Philosophy
• Stanford University – Information Retrieval & Web Search
Other fun projects:
• Co-author of Solr in Action, plus numerous research publications
• Advisor to Presearch, the decentralized search engine
• Lucene / Solr contributor
About Me
• About Lucidworks
• What is AI-powered Search?
• The Dimensions of User Intent
• Content Understanding:
• Keyword Search
• User Understanding:
• Collaborative Recommendations
• Content Understanding + User Understanding:
• Personalized Search
• Domain Understanding:
• Knowledge Graphs
• Domain Understanding + User Understanding:
• Domain-aware Matching
• Content Understanding + Domain Understanding:
• Semantic Search
• Balancing Approaches:
• Keyword vs. Vector vs. Knowledge Graph Search
• Vector Search
• Knowledge Graph Search
• Combining it all together
Agenda
Who are we?
300+ CUSTOMERS ACROSS THE
FORTUNE 1000
400+EMPLOYEES
OFFICES IN
San Francisco, CA (HQ)
Raleigh-Durham, NC
Cambridge, UK
Bangalore, India
Hong Kong
The Search & AI Conference
COMPANY BEHIND
D E V E L O P M E N T,
H O S T I N G ,
& S U P P O R T
Proudly built with open-source
tech at its core: Apache Solr &
Apache Spark
Personalizes search
with applied
machine learning
Proven on the
world’s biggest
information systems
AI-Powered Search
What is
?

Recommended

Thought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchThought Vectors and Knowledge Graphs in AI-powered Search
Thought Vectors and Knowledge Graphs in AI-powered SearchTrey Grainger
 
BrightonSEO March 2021 | Dan Taylor, Image Entity Tags
BrightonSEO March 2021 | Dan Taylor, Image Entity TagsBrightonSEO March 2021 | Dan Taylor, Image Entity Tags
BrightonSEO March 2021 | Dan Taylor, Image Entity TagsDan Taylor
 
Leveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineLeveraging Lucene/Solr as a Knowledge Graph and Intent Engine
Leveraging Lucene/Solr as a Knowledge Graph and Intent EngineTrey Grainger
 
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Natural Language Processing and Search Intent Understanding C3 Conductor 2019...
Natural Language Processing and Search Intent Understanding C3 Conductor 2019...Dawn Anderson MSc DigM
 
Passage indexing is likely more important than you think
Passage indexing is likely more important than you thinkPassage indexing is likely more important than you think
Passage indexing is likely more important than you thinkDawn Anderson MSc DigM
 
Semantic Content Networks - Ranking Websites on Google with Semantic SEO
Semantic Content Networks - Ranking Websites on Google with Semantic SEOSemantic Content Networks - Ranking Websites on Google with Semantic SEO
Semantic Content Networks - Ranking Websites on Google with Semantic SEOKoray Tugberk GUBUR
 
Search Query Processing: The Secret Life of Queries, Parsing, Rewriting & SEO
Search Query Processing: The Secret Life of Queries, Parsing, Rewriting & SEOSearch Query Processing: The Secret Life of Queries, Parsing, Rewriting & SEO
Search Query Processing: The Secret Life of Queries, Parsing, Rewriting & SEOKoray Tugberk GUBUR
 

More Related Content

What's hot

Keyword Research and Topic Modeling in a Semantic Web
Keyword Research and Topic Modeling in a Semantic WebKeyword Research and Topic Modeling in a Semantic Web
Keyword Research and Topic Modeling in a Semantic WebBill Slawski
 
How to Build a Semantic Search System
How to Build a Semantic Search SystemHow to Build a Semantic Search System
How to Build a Semantic Search SystemTrey Grainger
 
Coronavirus and Future of SEO: Digital Marketing and Remote Culture
Coronavirus and Future of SEO: Digital Marketing and Remote CultureCoronavirus and Future of SEO: Digital Marketing and Remote Culture
Coronavirus and Future of SEO: Digital Marketing and Remote CultureKoray Tugberk GUBUR
 
Vector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdfVector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdfConnorShorten2
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphTrey Grainger
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge GraphTrey Grainger
 
Training Week: Create a Knowledge Graph: A Simple ML Approach
Training Week: Create a Knowledge Graph: A Simple ML Approach Training Week: Create a Knowledge Graph: A Simple ML Approach
Training Week: Create a Knowledge Graph: A Simple ML Approach Neo4j
 
William slawski-google-patents- how-do-they-influence-search
William slawski-google-patents- how-do-they-influence-searchWilliam slawski-google-patents- how-do-they-influence-search
William slawski-google-patents- how-do-they-influence-searchBill Slawski
 
10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and how10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and howRecruitingDaily.com LLC
 
The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?
The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?
The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?Koray Tugberk GUBUR
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsAndrzej Michałowski
 
Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...
Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...
Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...Dan Taylor
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsXavier Amatriain
 
Lexical Semantics, Semantic Similarity and Relevance for SEO
Lexical Semantics, Semantic Similarity and Relevance for SEOLexical Semantics, Semantic Similarity and Relevance for SEO
Lexical Semantics, Semantic Similarity and Relevance for SEOKoray Tugberk GUBUR
 
AI-powered Semantic SEO by Koray GUBUR
AI-powered Semantic SEO by Koray GUBURAI-powered Semantic SEO by Koray GUBUR
AI-powered Semantic SEO by Koray GUBURAnton Shulke
 
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j
 
Neo4j GraphTalk Helsinki - Introduction and Graph Use Cases
Neo4j GraphTalk Helsinki - Introduction and Graph Use CasesNeo4j GraphTalk Helsinki - Introduction and Graph Use Cases
Neo4j GraphTalk Helsinki - Introduction and Graph Use CasesNeo4j
 
stackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with Weaviatestackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with WeaviateNETWAYS
 
The Python Cheat Sheet for the Busy Marketer
The Python Cheat Sheet for the Busy MarketerThe Python Cheat Sheet for the Busy Marketer
The Python Cheat Sheet for the Busy MarketerHamlet Batista
 

What's hot (20)

Keyword Research and Topic Modeling in a Semantic Web
Keyword Research and Topic Modeling in a Semantic WebKeyword Research and Topic Modeling in a Semantic Web
Keyword Research and Topic Modeling in a Semantic Web
 
How to Build a Semantic Search System
How to Build a Semantic Search SystemHow to Build a Semantic Search System
How to Build a Semantic Search System
 
Coronavirus and Future of SEO: Digital Marketing and Remote Culture
Coronavirus and Future of SEO: Digital Marketing and Remote CultureCoronavirus and Future of SEO: Digital Marketing and Remote Culture
Coronavirus and Future of SEO: Digital Marketing and Remote Culture
 
Vector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdfVector Search for Data Scientists.pdf
Vector Search for Data Scientists.pdf
 
The Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge GraphThe Apache Solr Semantic Knowledge Graph
The Apache Solr Semantic Knowledge Graph
 
40 Deep #SEO Insights for 2023
40 Deep #SEO Insights for 202340 Deep #SEO Insights for 2023
40 Deep #SEO Insights for 2023
 
The Semantic Knowledge Graph
The Semantic Knowledge GraphThe Semantic Knowledge Graph
The Semantic Knowledge Graph
 
Training Week: Create a Knowledge Graph: A Simple ML Approach
Training Week: Create a Knowledge Graph: A Simple ML Approach Training Week: Create a Knowledge Graph: A Simple ML Approach
Training Week: Create a Knowledge Graph: A Simple ML Approach
 
William slawski-google-patents- how-do-they-influence-search
William slawski-google-patents- how-do-they-influence-searchWilliam slawski-google-patents- how-do-they-influence-search
William slawski-google-patents- how-do-they-influence-search
 
10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and how10 search engines every recruiter should be using and how
10 search engines every recruiter should be using and how
 
The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?
The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?
The Reason Behind Semantic SEO: Why does Google Avoid the Word PageRank?
 
Feature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systemsFeature store: Solving anti-patterns in ML-systems
Feature store: Solving anti-patterns in ML-systems
 
Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...
Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...
Using Search Data To Create Better User Forecasts (& Then Fulfil Experiences)...
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
 
Lexical Semantics, Semantic Similarity and Relevance for SEO
Lexical Semantics, Semantic Similarity and Relevance for SEOLexical Semantics, Semantic Similarity and Relevance for SEO
Lexical Semantics, Semantic Similarity and Relevance for SEO
 
AI-powered Semantic SEO by Koray GUBUR
AI-powered Semantic SEO by Koray GUBURAI-powered Semantic SEO by Koray GUBUR
AI-powered Semantic SEO by Koray GUBUR
 
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4jNeo4j Graph Use Cases, Bruno Ungermann, Neo4j
Neo4j Graph Use Cases, Bruno Ungermann, Neo4j
 
Neo4j GraphTalk Helsinki - Introduction and Graph Use Cases
Neo4j GraphTalk Helsinki - Introduction and Graph Use CasesNeo4j GraphTalk Helsinki - Introduction and Graph Use Cases
Neo4j GraphTalk Helsinki - Introduction and Graph Use Cases
 
stackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with Weaviatestackconf 2022: Introduction to Vector Search with Weaviate
stackconf 2022: Introduction to Vector Search with Weaviate
 
The Python Cheat Sheet for the Busy Marketer
The Python Cheat Sheet for the Busy MarketerThe Python Cheat Sheet for the Busy Marketer
The Python Cheat Sheet for the Busy Marketer
 

Similar to Balancing the Dimensions of User Intent

The Next Generation of AI-powered Search
The Next Generation of AI-powered SearchThe Next Generation of AI-powered Search
The Next Generation of AI-powered SearchTrey Grainger
 
AI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementAI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementTrey Grainger
 
The Next Generation of AI-Powered Search
The Next Generation of AI-Powered SearchThe Next Generation of AI-Powered Search
The Next Generation of AI-Powered SearchLucidworks
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrTrey Grainger
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 CareerBuilder.com
 
Measuring Impact: Towards a data citation metric
Measuring Impact: Towards a data citation metricMeasuring Impact: Towards a data citation metric
Measuring Impact: Towards a data citation metricEdward Baker
 
8 Information Architecture Better Practices
8 Information Architecture Better Practices8 Information Architecture Better Practices
8 Information Architecture Better PracticesLouis Rosenfeld
 
Before Code: How to Plan & Visualize Your Project
Before Code: How to Plan & Visualize Your ProjectBefore Code: How to Plan & Visualize Your Project
Before Code: How to Plan & Visualize Your Projectmelmiller
 
SEO Do's and Dont's - Search in 2018
SEO Do's and Dont's - Search in 2018SEO Do's and Dont's - Search in 2018
SEO Do's and Dont's - Search in 2018Linus Logren
 
A Journey With Microsoft Cognitive Services II
A Journey With Microsoft Cognitive Services IIA Journey With Microsoft Cognitive Services II
A Journey With Microsoft Cognitive Services IIMarvin Heng
 
Using sharepoint to solve business problems #spsnairobi2014
Using sharepoint to solve business problems #spsnairobi2014Using sharepoint to solve business problems #spsnairobi2014
Using sharepoint to solve business problems #spsnairobi2014Amos Wachanga
 
Crowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and HadoopCrowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and HadoopGrant Ingersoll
 
Adaptable Information Workshop slides
Adaptable Information Workshop slidesAdaptable Information Workshop slides
Adaptable Information Workshop slidesLouis Rosenfeld
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDBMongoDB
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systemsTrey Grainger
 
Ordering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect dataOrdering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect dataAndy Stretton
 
Open Source Needs Design
Open Source Needs DesignOpen Source Needs Design
Open Source Needs DesignAll Things Open
 
Navigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePointNavigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePointJoanne Klein
 

Similar to Balancing the Dimensions of User Intent (20)

The Next Generation of AI-powered Search
The Next Generation of AI-powered SearchThe Next Generation of AI-powered Search
The Next Generation of AI-powered Search
 
AI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge ManagementAI, Search, and the Disruption of Knowledge Management
AI, Search, and the Disruption of Knowledge Management
 
The Next Generation of AI-Powered Search
The Next Generation of AI-Powered SearchThe Next Generation of AI-Powered Search
The Next Generation of AI-Powered Search
 
Scaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solrScaling Recommendations, Semantic Search, & Data Analytics with solr
Scaling Recommendations, Semantic Search, & Data Analytics with solr
 
SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018 SDSC18 and DSATL Meetup March 2018
SDSC18 and DSATL Meetup March 2018
 
Beyond User Research
Beyond User ResearchBeyond User Research
Beyond User Research
 
Everything You Wish You Knew About Search
Everything You Wish You Knew About SearchEverything You Wish You Knew About Search
Everything You Wish You Knew About Search
 
Measuring Impact: Towards a data citation metric
Measuring Impact: Towards a data citation metricMeasuring Impact: Towards a data citation metric
Measuring Impact: Towards a data citation metric
 
8 Information Architecture Better Practices
8 Information Architecture Better Practices8 Information Architecture Better Practices
8 Information Architecture Better Practices
 
Before Code: How to Plan & Visualize Your Project
Before Code: How to Plan & Visualize Your ProjectBefore Code: How to Plan & Visualize Your Project
Before Code: How to Plan & Visualize Your Project
 
SEO Do's and Dont's - Search in 2018
SEO Do's and Dont's - Search in 2018SEO Do's and Dont's - Search in 2018
SEO Do's and Dont's - Search in 2018
 
A Journey With Microsoft Cognitive Services II
A Journey With Microsoft Cognitive Services IIA Journey With Microsoft Cognitive Services II
A Journey With Microsoft Cognitive Services II
 
Using sharepoint to solve business problems #spsnairobi2014
Using sharepoint to solve business problems #spsnairobi2014Using sharepoint to solve business problems #spsnairobi2014
Using sharepoint to solve business problems #spsnairobi2014
 
Crowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and HadoopCrowd Sourced Reflected Intelligence for Solr and Hadoop
Crowd Sourced Reflected Intelligence for Solr and Hadoop
 
Adaptable Information Workshop slides
Adaptable Information Workshop slidesAdaptable Information Workshop slides
Adaptable Information Workshop slides
 
Webinar: Scaling MongoDB
Webinar: Scaling MongoDBWebinar: Scaling MongoDB
Webinar: Scaling MongoDB
 
Reflected intelligence evolving self-learning data systems
Reflected intelligence  evolving self-learning data systemsReflected intelligence  evolving self-learning data systems
Reflected intelligence evolving self-learning data systems
 
Ordering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect dataOrdering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect data
 
Open Source Needs Design
Open Source Needs DesignOpen Source Needs Design
Open Source Needs Design
 
Navigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePointNavigating the Mess of a Shared drive Migration to SharePoint
Navigating the Mess of a Shared drive Migration to SharePoint
 

More from Trey Grainger

Reflected Intelligence: Real world AI in Digital Transformation
Reflected Intelligence: Real world AI in Digital TransformationReflected Intelligence: Real world AI in Digital Transformation
Reflected Intelligence: Real world AI in Digital TransformationTrey Grainger
 
Natural Language Search with Knowledge Graphs (Chicago Meetup)
Natural Language Search with Knowledge Graphs (Chicago Meetup)Natural Language Search with Knowledge Graphs (Chicago Meetup)
Natural Language Search with Knowledge Graphs (Chicago Meetup)Trey Grainger
 
Natural Language Search with Knowledge Graphs (Activate 2019)
Natural Language Search with Knowledge Graphs (Activate 2019)Natural Language Search with Knowledge Graphs (Activate 2019)
Natural Language Search with Knowledge Graphs (Activate 2019)Trey Grainger
 
Measuring Relevance in the Negative Space
Measuring Relevance in the Negative SpaceMeasuring Relevance in the Negative Space
Measuring Relevance in the Negative SpaceTrey Grainger
 
Natural Language Search with Knowledge Graphs (Haystack 2019)
Natural Language Search with Knowledge Graphs (Haystack 2019)Natural Language Search with Knowledge Graphs (Haystack 2019)
Natural Language Search with Knowledge Graphs (Haystack 2019)Trey Grainger
 
The Future of Search and AI
The Future of Search and AIThe Future of Search and AI
The Future of Search and AITrey Grainger
 
The Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphThe Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphTrey Grainger
 
Searching for Meaning
Searching for MeaningSearching for Meaning
Searching for MeaningTrey Grainger
 
The Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesThe Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesTrey Grainger
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation EnginesTrey Grainger
 
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsIntent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsTrey Grainger
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrTrey Grainger
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemTrey Grainger
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelTrey Grainger
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemTrey Grainger
 
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Trey Grainger
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrTrey Grainger
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Trey Grainger
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchTrey Grainger
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solrTrey Grainger
 

More from Trey Grainger (20)

Reflected Intelligence: Real world AI in Digital Transformation
Reflected Intelligence: Real world AI in Digital TransformationReflected Intelligence: Real world AI in Digital Transformation
Reflected Intelligence: Real world AI in Digital Transformation
 
Natural Language Search with Knowledge Graphs (Chicago Meetup)
Natural Language Search with Knowledge Graphs (Chicago Meetup)Natural Language Search with Knowledge Graphs (Chicago Meetup)
Natural Language Search with Knowledge Graphs (Chicago Meetup)
 
Natural Language Search with Knowledge Graphs (Activate 2019)
Natural Language Search with Knowledge Graphs (Activate 2019)Natural Language Search with Knowledge Graphs (Activate 2019)
Natural Language Search with Knowledge Graphs (Activate 2019)
 
Measuring Relevance in the Negative Space
Measuring Relevance in the Negative SpaceMeasuring Relevance in the Negative Space
Measuring Relevance in the Negative Space
 
Natural Language Search with Knowledge Graphs (Haystack 2019)
Natural Language Search with Knowledge Graphs (Haystack 2019)Natural Language Search with Knowledge Graphs (Haystack 2019)
Natural Language Search with Knowledge Graphs (Haystack 2019)
 
The Future of Search and AI
The Future of Search and AIThe Future of Search and AI
The Future of Search and AI
 
The Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphThe Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge Graph
 
Searching for Meaning
Searching for MeaningSearching for Meaning
Searching for Meaning
 
The Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation EnginesThe Intent Algorithms of Search & Recommendation Engines
The Intent Algorithms of Search & Recommendation Engines
 
Building Search & Recommendation Engines
Building Search & Recommendation EnginesBuilding Search & Recommendation Engines
Building Search & Recommendation Engines
 
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval SystemsIntent Algorithms: The Data Science of Smart Information Retrieval Systems
Intent Algorithms: The Data Science of Smart Information Retrieval Systems
 
Self-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache SolrSelf-learned Relevancy with Apache Solr
Self-learned Relevancy with Apache Solr
 
The Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data EcosystemThe Apache Solr Smart Data Ecosystem
The Apache Solr Smart Data Ecosystem
 
South Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis PanelSouth Big Data Hub: Text Data Analysis Panel
South Big Data Hub: Text Data Analysis Panel
 
Reflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data systemReflected Intelligence: Lucene/Solr as a self-learning data system
Reflected Intelligence: Lucene/Solr as a self-learning data system
 
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
Searching on Intent: Knowledge Graphs, Personalization, and Contextual Disamb...
 
Semantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/SolrSemantic & Multilingual Strategies in Lucene/Solr
Semantic & Multilingual Strategies in Lucene/Solr
 
Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...Crowdsourced query augmentation through the semantic discovery of domain spec...
Crowdsourced query augmentation through the semantic discovery of domain spec...
 
Enhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic searchEnhancing relevancy through personalization & semantic search
Enhancing relevancy through personalization & semantic search
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solr
 

Recently uploaded

"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions..."How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...Fwdays
 
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...ISPMAIndia
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...DianaGray10
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceVijayananda Mohire
 
LF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIELF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIEDanBrown980551
 
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...htrindia
 
From Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+PluginsFrom Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+PluginsInflectra
 
Bit N Build Poland
Bit N Build PolandBit N Build Poland
Bit N Build PolandGDSC PJATK
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVARobert McDermott
 
Enterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewEnterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewAshraf Fouad
 
Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...Product School
 
The Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product SchoolThe Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product SchoolProduct School
 
Confoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceConfoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceSusan Ibach
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxInfosec
 
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docxLeveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docxVotarikari Shravan
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...Neo4j
 
"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys Vasyliev"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys VasylievFwdays
 
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...MarcovanHurne2
 
Apex Replay Debugger and Salesforce Platform Events.pptx
Apex Replay Debugger and Salesforce Platform Events.pptxApex Replay Debugger and Salesforce Platform Events.pptx
Apex Replay Debugger and Salesforce Platform Events.pptxmohayyudin7826
 

Recently uploaded (20)

In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...In sharing we trust. Taking advantage of a diverse consortium to build a tran...
In sharing we trust. Taking advantage of a diverse consortium to build a tran...
 
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions..."How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
"How we created an SRE team in Temabit as a part of FOZZY Group in conditions...
 
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
AI MODELS USAGE IN FINTECH PRODUCTS: PM APPROACH & BEST PRACTICES by Kasthuri...
 
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
Automation Ops Series: Session 1 - Introduction and setup DevOps for UiPath p...
 
My Journey towards Artificial Intelligence
My Journey towards Artificial IntelligenceMy Journey towards Artificial Intelligence
My Journey towards Artificial Intelligence
 
LF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIELF Energy Webinar: Introduction to TROLIE
LF Energy Webinar: Introduction to TROLIE
 
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
HBR SERIES METAL HOUSED RESISTORS POWER ELECTRICAL ABSORBS HIGH CURRENT DURIN...
 
From Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+PluginsFrom Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
From Challenger to Champion: How SpiraPlan Outperforms JIRA+Plugins
 
Bit N Build Poland
Bit N Build PolandBit N Build Poland
Bit N Build Poland
 
Introduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVAIntroduction to Multimodal LLMs with LLaVA
Introduction to Multimodal LLMs with LLaVA
 
Enterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book ReviewEnterprise Architecture As Strategy - Book Review
Enterprise Architecture As Strategy - Book Review
 
Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...Relationship Counselling: From Disjointed Features to Product-First Thinking ...
Relationship Counselling: From Disjointed Features to Product-First Thinking ...
 
The Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product SchoolThe Future of Product, by Founder & CEO, Product School
The Future of Product, by Founder & CEO, Product School
 
Confoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data scienceConfoo 2024 Gettings started with OpenAI and data science
Confoo 2024 Gettings started with OpenAI and data science
 
How AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptxHow AI and ChatGPT are changing cybersecurity forever.pptx
How AI and ChatGPT are changing cybersecurity forever.pptx
 
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docxLeveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
Leveraging SLF4j for Effective Logging in IBM App Connect Enterprise.docx
 
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
ASTRAZENECA. Knowledge Graphs Powering a Fast-moving Global Life Sciences Org...
 
"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys Vasyliev"AIRe - AI Reliability Engineering", Denys Vasyliev
"AIRe - AI Reliability Engineering", Denys Vasyliev
 
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
Digital Transformation Strategy & Plan Templates - www.beyondthecloud.digital...
 
Apex Replay Debugger and Salesforce Platform Events.pptx
Apex Replay Debugger and Salesforce Platform Events.pptxApex Replay Debugger and Salesforce Platform Events.pptx
Apex Replay Debugger and Salesforce Platform Events.pptx
 

Balancing the Dimensions of User Intent

  • 1. Trey Grainger Chief Algorithms Officer Balancing the Dimensions of User Intent October 28, 2019
  • 2. Trey Grainger Chief Algorithms Officer • Previously: SVP of Engineering @ Lucidworks; Director of Engineering @ CareerBuilder • Georgia Tech – MBA, Management of Technology • Furman University – BA, Computer Science, Business, & Philosophy • Stanford University – Information Retrieval & Web Search Other fun projects: • Co-author of Solr in Action, plus numerous research publications • Advisor to Presearch, the decentralized search engine • Lucene / Solr contributor About Me
  • 3. • About Lucidworks • What is AI-powered Search? • The Dimensions of User Intent • Content Understanding: • Keyword Search • User Understanding: • Collaborative Recommendations • Content Understanding + User Understanding: • Personalized Search • Domain Understanding: • Knowledge Graphs • Domain Understanding + User Understanding: • Domain-aware Matching • Content Understanding + Domain Understanding: • Semantic Search • Balancing Approaches: • Keyword vs. Vector vs. Knowledge Graph Search • Vector Search • Knowledge Graph Search • Combining it all together Agenda
  • 4. Who are we? 300+ CUSTOMERS ACROSS THE FORTUNE 1000 400+EMPLOYEES OFFICES IN San Francisco, CA (HQ) Raleigh-Durham, NC Cambridge, UK Bangalore, India Hong Kong The Search & AI Conference COMPANY BEHIND D E V E L O P M E N T, H O S T I N G , & S U P P O R T
  • 5. Proudly built with open-source tech at its core: Apache Solr & Apache Spark Personalizes search with applied machine learning Proven on the world’s biggest information systems
  • 7. http://aiPoweredSearch.com ... is my new book! (Haystack discount code: ctwhay19)
  • 10. AI-powered Search Question / Answer Systems Virtual Assistants • Signals Boosting Models • Learning to Rank • Semantic Search • Collaborative Filtering • Personalized Search • Content Clustering • NLP / Entity Resolution • Semantic Knowledge Graphs • Document Classification • etc. • Neural Search • Word Embeddings • Vector Search • Image / Voice Search • etc. • Question / Answer Systems • Virtual Assistants • Chatbots • Rules-based Relevancy • etc.
  • 11. We have a big toolbox - great!
  • 12. But how do we properly apply those tools?
  • 13. Dimensions of User Intent Content Understanding Domain Understanding User Understanding User Intent
  • 14. Keyword Search Dimensions of User Intent Content Understanding Domain Understanding User Understanding User Intent
  • 15. /solr/collection/select/?q=apache solr Term Documents … … apache doc1, doc3, doc4, doc5 … lucene doc2, doc4, doc6 … … solr doc1, doc3, doc4, doc7, doc8 … … doc5 doc7 doc8 doc1 doc3 doc4 solr apache apache solr Matching queries to documents
  • 16. BM25 (Relevance Scoring between Query and Documents) Score(q, d) = ∑ idf(t) · ( tf(t in d) · (k + 1) ) / ( tf(t in d) + k · (1 – b + b · |d| / avgdl ) t in q Where: t = term; d = document; q = query; i = index tf(t in d) = numTermOccurrencesInDocument ½ idf(t) = 1 + log (numDocs / (docFreq + 1)) |d| = ∑ 1 t in d avgdl = = ( ∑ |d| ) / ( ∑ 1 ) ) d in i d in i k = Free parameter. Usually ~1.2 to 2.0. Increases term frequency saturation point. b = Free parameter. Usually ~0.75. Increases impact of document normalization.
  • 17. ipad
  • 18. Keyword Search Dimensions of User Intent Content Understanding Domain Understanding Collaborative Recommendations User Understanding User Intent
  • 20. Collaborative Filtering (Recommendations) User Searches User Sees Results User takes an action Users’ actions inform system improvements User Query Results Alonzo ipad doc10, doc22, doc12, … Elena printer doc84, doc2, doc17, … Ming ipad doc10, doc22, doc12, … … … … User Action Document Alonzo click doc22 Elena click doc17 Ming click doc12 Alonzo purchase doc22 Ming click doc22 Ming purchase doc12 Elena click doc2 … … … User Item Weight Alonzo doc22 1.0 Alonzo doc12 0.4 … … … Ming doc12 0.9 Ming doc22 0.6 … … … ipad ⌕ Matrix Factorization Recommendations for Alonzo: • doc22: “iPad Pro” • doc12: “Kindle Fire” …
  • 21. Recommendations (User-Item, Item-Item, Query-Item) User Item Weight Alonzo doc22 1.0 Alonzo doc12 0.4 … … … Ming doc12 0.9 Ming doc22 0.6 … … … Recommendations for Alonzo: • doc22: “iPad Pro” • doc12: “Kindle Fire” … Item Item Weight doc22 doc22 1.0 doc22 doc12 0.85 … … … doc12 doc12 1.0 doc12 doc22 0.83 … … … Query Item Weight ipad doc22 0.98 ipad doc12 0.6 … … … kindle doc12 0.96 apple doc22 0.90 … … … Recommendations for Doc22: • doc22: “iPad Pro” • doc12: “Kindle Fire” … Recommendations for “ipad”: • doc22: “iPad Pro” • doc12: “Kindle Fire” … Matrix Factorization
  • 22. ipad
  • 24. ipad
  • 25. Keyword Search Knowledge Graph Dimensions of User Intent Content Understanding Domain Understanding Collaborative Recommendations User Understanding User Intent
  • 26. What is a Knowledge Graph? (vs. Ontology vs. Taxonomy vs. Synonyms, etc.)
  • 27. Overly Simplistic Definitions Alternative Labels: Substitute words with identical meanings [ CTO => Chief Technology Officer; specialise => specialize ] Synonyms List: Provides substitute words that can be used to represent the same or very similar things [ human => homo sapien, mankind; food => sustenance, meal ] Taxonomy: Classifies things into Categories [ john is Human; Human is Mammal; Mammal is Animal ] Ontology: Defines relationships between types of things [ animal eats food; human is animal ] Knowledge Graph: Instantiation of an Ontology (contains the things that are related) [ john is human; john eats food ] A Knowledge Graph subsumes the other types.
  • 30. Keyword Search Knowledge Graph User Intent Personalized Search Dimensions of User Intent Content Understanding Domain Understanding Collaborative Recommendations User Understanding
  • 32. Keyword Search (Completely User-specified) User-guided Recommendations (Mostly driven by user profile, partially user-specified) Traditional Recommendations (Completely driven by user behavior) Keyword Search (Completely User-specified) Personalized Queries (Mostly user-specified, partially driven by user profile)
  • 33. Personalized Queries (Mostly user-specified, partially driven by user profile) Keyword Search (Completely User-specified) User-guided Recommendations (Mostly driven by user profile, partially user-specified) Traditional Recommendations (Completely driven by user behavior) Personalized Search
  • 37. Regular Search Results: Personalized Search Results: User:
  • 38. Nice - personalization is awesome! Let’s roll it out everywhere!
  • 40. Keyword Search Knowledge Graph User Intent Personalized Search Domain-aware Matching Dimensions of User Intent Content Understanding Domain Understanding Collaborative Recommendations User Understanding
  • 41. Knowledge Graph (Understanding conceptual and logical relationships between domain-specific entities) Collaborative Recommendations (Completely driven by user behavior)
  • 42. Personas / User Profiles (User attributes and preferences in knowledge graph) Multimodal Recommendations (Recommendations combining collaborative filtering plus user-based profile attribute matching/ranking) Knowledge Graph (Understanding conceptual and logical relationships between domain-specific entities) Collaborative Recommendations (Completely driven by user behavior)
  • 43. Personas / User Profiles (User attributes and preferences in knowledge graph) Multimodal Recommendations (Recommendations combining collaborative filtering plus user-based profile attribute matching/ranking) Knowledge Graph (Understanding conceptual and logical relationships between domain-specific entities) Collaborative Recommendations (Completely driven by user behavior) Domain-aware Matching
  • 48. http://localhost:8983/solr/jobs/select/? fl=jobtitle,city,state,salary& q=( jobtitle:"nurse educator"^25 OR jobtitle:(nurse educator)^10 ) AND ( (city:"Boston" AND state:"MA")^15 OR state:"MA") AND _val_:"map(salary, 40000, 60000,10, 0)" AND similar_users:{!terms}u99,u1,u50,u2311,u253,u70,u99 *Example derived from chapter 16 of Solr in Action Multimodal Recommendations Jane is a nurse educator in Boston seeking between $40K and $60K She has interacted with the same content as the following users: u99,u1,u50,u2311,u253,u70,u99
  • 49. Keyword Search Knowledge Graph User Intent Personalized Search Semantic Search Domain-aware Matching Dimensions of User Intent Content Understanding Domain Understanding Collaborative Recommendations User Understanding
  • 50. Keyword Search (Finding and Ranking Keyword) Knowledge Graph (Understanding conceptual and logical relationships between domain-specific entities)
  • 51. Language Understanding (Understanding syntax and query structure) Keyword Search (Finding and Ranking Keyword) Terminology Understanding (Understanding domain-specific terms and conceptual meaning) Knowledge Graph (Understanding conceptual and logical relationships between domain-specific entities)
  • 52. Language Understanding (Understanding syntax and query structure) Terminology Understanding (Understanding domain-specific terms and conceptual meaning) Keyword Search (Finding and Ranking Keyword) Knowledge Graph (Understanding conceptual and logical relationships between domain-specific entities) Semantic Search
  • 55. Sentence Embeddings: [ 2, 3, 2, 4, 2, 1, 5, 3 ] [ 5, 3, 2, 3, 4, 0, 3, 4 ] . . . Document Embedding: [ 4, 1, 4, 2, 1, 2, 4, 3 ] Word Embeddings: [ 5, 1, 3, 4, 2, 1, 5, 3 ] [ 4, 1, 3, 0, 1, 1, 4, 2 ] . . . Paragraph Embeddings: [ 5, 1, 4, 1, 0, 2, 4, 0 ] [ 1, 1, 4, 2, 1, 0, 0, 0 ] . . . Thought Vectors
  • 56. apple caffeine cheese coffee drink donut food juice pizza tea water … term N cappuccino 0 0 0 0 0 0 0 0 0 0 0 ... apple 1 0 0 0 0 0 0 0 0 0 0 ... juice 0 0 0 0 0 0 0 1 0 0 0 ... cheese 0 0 1 0 0 0 0 0 0 0 0 ... pizza 0 0 0 0 0 0 0 0 1 0 0 ... donut 0 0 0 0 0 1 0 0 0 0 0 ... green 0 0 0 0 0 0 0 0 0 0 0 ... tea 0 0 0 0 0 0 0 0 0 1 0 ... bread 0 0 1 0 0 0 0 0 0 0 0 ... sticks 0 0 0 0 0 0 0 0 0 0 0 ... exact term lookup in inverted indexquery Single Term Searches (as a Vector)
  • 57. Combined Vector query Multi-term Query Vectors juice 0 0 0 0 0 0 0 1 0 0 0 ... apple 1 0 0 0 0 0 0 0 0 0 0 ... + apple juice 1 0 0 0 0 0 0 1 0 0 0 ...
  • 58. apple caffeine cheese coffee drink donut food juice pizza tea water … term N latte 0 0 0 0 0 0 0 0 0 0 0 ... cappuccino 0 0 0 0 0 0 0 0 0 0 0 ... apple juice 1 0 0 0 0 0 0 1 0 0 0 ... cheese pizza 0 0 1 0 0 0 0 0 1 0 0 ... donut 0 0 0 0 0 1 0 0 0 0 0 ... soda 0 0 0 0 0 0 0 0 0 0 0 ... green tea 0 0 0 0 0 0 0 0 0 1 0 ... water 0 0 0 0 0 0 0 0 0 0 1 ... cheese bread sticks 0 0 1 0 0 0 0 0 0 0 0 ... cinnamon sticks 0 0 0 0 0 0 0 0 0 0 0 ... exact term lookup in inverted indexquery Multi-term Searches
  • 59. food drink dairy bread caffeine sweet calories healthy apple juice 0 5 0 0 0 4 4 3 cappuccino 0 5 3 0 4 1 2 3 cheese bread sticks 5 0 4 5 0 1 4 2 cheese pizza 5 0 4 4 0 1 5 2 cinnamon bread sticks 5 0 1 5 0 3 4 2 donut 5 0 1 5 0 4 5 1 green tea 0 5 0 0 2 1 1 5 latte 0 5 4 0 4 1 3 3 soda 0 5 0 0 3 5 5 0 water 0 5 0 0 0 0 0 5 Dimensionality Reduction
  • 60. Phrase: Vector: apple juice: [ 0, 5, 0, 0, 0, 4, 4, 3 ] cappuccino: [ 0, 5, 3, 0, 4, 1, 2, 3 ] cheese bread sticks: [ 5, 0, 4, 5, 0, 1, 4, 2 ] cheese pizza: [ 5, 0, 4, 4, 0, 1, 5, 2 ] cinnamon bread sticks: [ 5, 0, 4, 5, 0, 1, 4, 2 ] donut: [ 5, 0, 1, 5, 0, 4, 5, 1 ] green tea: [ 0, 5, 0, 0, 2, 1, 1, 5 ] latte: [ 0, 5, 4, 0, 4, 1, 3, 3 ] soda: [ 0, 5, 0, 0, 3, 5, 5, 0 ] water: [ 0, 5, 0, 0, 0, 0, 0, 5 ] Ranked Results: Green Tea 0.94 water 0.85 cappuccino 0.80 latte 0.78 apple juice 0.60 soda … … 0.19 donut Vector Similarity Scores: Vector Similarity (a, b): cos(θ) = a · b |a| × |b| Ranked Results: Cheese Pizza 0.99 cheese bread sticks 0.91 cinnamon bread sticks 0.89 donut 0.47 latte 0.46 apple juice … … 0.19 water Vector Similarity Scoring
  • 61. Vector Similarity Scores: Performance Considerations Problem: Vector Scoring is Slow • Unlike keyword search, which looks up pre-indexed answers to queries, Vector Search must instead calculate similarities between the query vector and every document’s vectors to determine best matches, which is slow at scale. Solution: Quantized Vectors • “Quantization” is the process for mapping vectors features to discrete values. • Creating “tokens” which map to a similar vector space, enables matching on those tokens to perform an ANN (Approximate Nearest Neighbor) search • This enables converting vector scoring into a search problem (term lookup and scoring), which is fast again, at the expense of some recall and scoring accuracy Recommended Approach: Quantized Vector Search + Vector Similarity Reranking • Combine the best of both worlds by running an initial ANN search on a quantized vector representation, and then re-rank the top-N results using full Vector similarity scoring.
  • 63. Option 1: Streaming Expressions
  • 64. curl -X POST -H "Content-Type: application/json" http://localhost:8983/solr/food/update?commit=true --data-binary ' [ {"id": "1", "name_s":"donut", "vector_fs":[5.0,0.0,1.0,5.0,0.0,4.0,5.0,1.0]}, {"id": "2", "name_s":"apple juice", "vector_fs":[1.0,5.0,0.0,0.0,0.0,4.0,4.0,3.0]}, {"id": "3", "name_s":"cappuccino", "vector_fs":[0.0,5.0,3.0,0.0,4.0,1.0,2.0,3.0]}, {"id": "4", "name_s":"cheese pizza", "vector_fs":[5.0,0.0,4.0,4.0,0.0,1.0,5.0,2.0]}, {"id": "5", "name_s":"green tea", "vector_fs":[0.0,5.0,0.0,0.0,2.0,1.0,1.0,5.0]}, {"id": "6", "name_s":"latte", "vector_fs":[0.0,5.0,4.0,0.0,4.0,1.0,3.0,3.0]}, {"id": "7", "name_s":"soda", "vector_fs":[0.0,5.0,0.0,0.0,3.0,5.0,5.0,0.0]}, {"id": "8", "name_s":"cheese bread sticks", "vector_fs":[5.0,0.0,4.0,5.0,0.0,1.0,4.0,2.0]}, {"id": "9", "name_s":"water", "vector_fs":[0.0,5.0,0.0,0.0,0.0,0.0,0.0,5.0]}, {"id": "10", "name_s":"cinnamon bread sticks", "vector_fs":[5.0,0.0,1.0,5.0,0.0,3.0,4.0,2.0]} ] ' Send Documents to Solr: Streaming Expressions
  • 65. 8983
  • 67. http://localhost:8983/solr/food/select?q=*:*&fl=id,name_s& fq={!streaming_expression}top( select( search(food, q="*:*", fl="id,vector_fs", sort="id asc"), cosineSimilarity(vector_fs, array(5.1,0.0,1.0,5.0,0.0,4.0,5.0,1.0)) as cos, id), n=5, sort="cos desc” ) { "responseHeader":{ … }, "response":{"numFound":5,"start":0,"docs":[ { "name_s":"donut", "id":"1"}, { "name_s":"apple juice", "id":"2"}, { "name_s":"cheese pizza", "id":"4"}, { "name_s":"cheese bread sticks", "id":"8"}, { "name_s":"cinnamon bread sticks", "id":"10"}] }} Request: Response: Streaming Expressions Query Parser
  • 68. Option 3: Solr Vector Scoring Plugin
  • 69. Send Documents to Solr: curl -X POST -H "Content-Type: application/json" http://localhost:8983/solr/{your-collection-name}/update?commit=true -- data-binary ‘ [ {"name":"example 0", "vector":"0|1.55 1|3.53 2|2.3 3|0.7 4|3.44 5|2.33"}, {"name":"example 1", "vector":"0|3.54 1|0.4 2|4.16 3|4.88 4|4.28 5|4.25"}, {"name":"example 2", "vector":"0|1.11 1|0.6 2|1.47 3|1.99 4|2.91 5|1.01"}, {"name":"example 3", "vector":"0|0.06 1|4.73 2|0.29 3|1.27 4|0.69 5|3.9"}, {"name":"example 4", "vector":"0|4.01 1|3.69 2|2 3|4.36 4|1.09 5|0.1"}, {"name":"example 5", "vector":"0|0.64 1|3.95 2|1.03 3|1.65 4|0.99 5|0.09"} ]' Solr Vector Scoring Plugin
  • 70. Request: Response: http://localhost:8983/solr/{your-collection-name}/query?fl=name,score,vector&q={!vp f=vector vector="0.1,4.75,0.3,1.2,0.7,4.0” } { "responseHeader":{ "status":0, "QTime":1}}, "response":{ "numFound":6,"start":0,"maxScore":0.99984086, "docs":[ { "name":["example 3"], "vector":["0|0.06 1|4.73 2|0.29 3|1.27 4|0.69 5|3.9 "], "score":0.99984086}, { "name":["example 0"], "vector":["0|1.55 1|3.53 2|2.3 3|0.7 4|3.44 5|2.33 "], "score":0.7693964}, { "name":["example 5"], "vector":["0|0.64 1|3.95 2|1.03 3|1.65 4|0.99 5|0.09 "], "score":0.76322395}, { "name":["example 4"], "vector":["0|4.01 1|3.69 2|2 3|4.36 4|1.09 5|0.1 "], "score":0.5328145}, { "name":["example 1"], "vector":["0|3.54 1|0.4 2|4.16 3|4.88 4|4.28 5|4.25 "], "score":0.48513117}, { "name":["example 2"], "vector":["0|1.11 1|0.6 2|1.47 3|1.99 4|2.91 5|1.01 "], "score":0.44909418}] }} Solr Vector Scoring Plugin
  • 71. Option 4: Solr Vector Scoring + LSH Plugin
  • 72. Send Documents to Solr: Solr Vector Scoring + LSH Plugin curl -X POST -H "Content-Type: application/json" http://localhost:8983/solr/{your-collection- name}/update?update.chain=LSH&commit=true --data-binary ‘ [ {"id":"1", "vector":"1.55,3.53,2.3,0.7,3.44,2.33"}, {"id":"2", "vector":"3.54,0.4,4.16,4.88,4.28,4.25"} ]' http://localhost:8983/solr/{your-collection-name}/query?fl=name,score,vector&q={!vp f=vector vector="1.55,3.53,2.3,0.7,3.44,2.33" lsh="true" reRankDocs="5"}&fl=name,score,vector,_vector_,_lsh_hash_ Request:
  • 73. Response: Solr Vector Scoring + LSH Plugin { "responseHeader":{ "status":0, "QTime":8, "response":{"numFound":1,"start":0,"maxScore":36.65736, "docs":[ { "id": "1", "vector":"1.55,3.53,2.3,0.7,3.44,2.33", "_vector_":"/z/GZmZAYeuFQBMzMz8zMzNAXCj2QBUeuA==", "_lsh_hash_":["0_8", "1_35", "2_7", "3_10", "4_2", "5_35", "6_16", "7_30", "8_27", "9_12", "10_7", "11_32", "12_48", "13_36", "14_10", "15_7", "16_42", "17_5", "18_3", "19_2", "20_1", "21_0", "22_24", "23_18", "24_42", "25_31", "26_35", "27_8", "28_1", "29_24", "30_47", "31_14", "32_22", "33_39", "34_0", "35_34", "36_34", "37_39", "38_27", "39_27", "40_45", "41_10", "42_21", "43_34", "44_41", "45_9", "46_31", "47_0", "48_4", "49_43"], "score":36.65736} ] } } http://localhost:8983/solr/{your-collection-name}/query?fl=name,score,vector&q={!vp f=vector vector="1.55,3.53,2.3,0.7,3.44,2.33" lsh="true" reRankDocs="5"}&fl=name,score,vector,_vector_,_lsh_hash_ Request:
  • 74. Option 5 (Work in Progress): First-class Vector Fields in Lucene/Solr
  • 76. ANN Benchmarks (Approximate Nearest Neighbor) https://github.com/erikbern/ann-benchmarks
  • 78. • Take queries, documents, sentences, paragraphs, etc. and transform them into vectors. • Usually leverage deep learning, which can discover rich language usage rules and map them to combinations of features in the vector • Popular Libraries: • Bert • Elmo • Universal Sentence Encoder • Word2Vec • Sentence2Vec • Glove • fastText • many more … Vector Encoders
  • 80. Query Type Likely Outcome Obscure keyword combinations Q. (software OR hardware) AND enginee* • Keyword search succeeds • Vector Search fails Natural Language Queries Q. Can my wife drive on my insurance? • Keyword search might get lucky, but probably fails • Vector Search succeeds Fuzzy Language Queries Q. famous french tower • Keyword search mismatch yields poor results • Vector Search succeeds Structured Relationship Queries Q. popular bbq near Activate • Keyword search fails • Vector search fails • Need a Knowledge Graph! Keyword Search vs. Vector Search
  • 81. Giant Graph of Relationships... Trey Grainger works for Lucidworks. He spoke at the Activate 2019 conference. #Activate19 (Activate) wqs held in Washington, DC September 9-12, 2019. Trey got his masters degree from Georgia Tech. Trey’s Voicemail
  • 83. id: 1 job_title: Software Engineer desc: software engineer at a great company skills: .Net, C#, java id: 2 job_title: Registered Nurse desc: a registered nurse at hospital doing hard work skills: oncology, phlebotemy id: 3 job_title: Java Developer desc: a software engineer or a java engineer doing work skills: java, scala, hibernate field doc term desc 1 a at company engineer great software 2 a at doing hard hospital nurse registered work 3 a doing engineer java or software work job_title 1 Software Engineer … … … Terms-Docs Inverted IndexDocs-Terms Forward IndexDocuments Source: Trey Grainger, Khalifeh AlJadda, Mohammed Korayem, Andries Smith.“The Semantic Knowledge Graph: A compact, auto-generated model for real-time traversal and ranking of any relationship within a domain”. DSAA 2016. Knowledge Graph field term postings list doc pos desc a 1 4 2 1 3 1, 5 at 1 3 2 4 company 1 6 doing 2 6 3 8 engineer 1 2 3 3, 7 great 1 5 hard 2 7 hospital 2 5 java 3 6 nurse 2 3 or 3 4 registered 2 2 software 1 1 3 2 work 2 10 3 9 job_title java developer 3 1 … … … …
  • 84. Related term vector (for query concept expansion) http://localhost:8983/solr/stack-exchange-health/skg
  • 85. Disambiguation by Category Example Meaning 1: Restaurant => bbq, brisket, ribs, pork, … Meaning 2: Outdoor Equipment => bbq, grill, charcoal, propane, …
  • 93. Demo!
  • 94. Demo Data Places (also includes geonames database) Entities (includes search commands) Text Content [ Web crawl of restaurant and product reviews sites ]
  • 95. Solr Knowledge Graph Traversal Query "bbq",
  • 97. Why this Semantic Nuance Matters
  • 98. popular barbeque near Activate (popular same as "good", "top", "best") Hotels near Haystack EU hotels near popular BBQ in Berlin BBQ near airports near Berlin hotels near movie theaters in Berlin … Other Knowledge Graph Search examples:
  • 99. Keyword Search Knowledge Graph User Intent Personalized Search Semantic Search Domain-aware Matching Dimensions of User Intent Content Understanding Domain Understanding Collaborative Recommendations User Understanding
  • 100. News Search : popularity and freshness drive relevance Restaurant Search: geographical proximity and price range are critical Ecommerce: likelihood of a purchase is key Movie search: More popular titles are generally more relevant Job search: category of job, salary range, and geographical proximity matter The right ranking algorithm is domain and context-dependent
  • 101. Example Combining Content + Domain + User Context News website: /select? fq={!cache=false v=$keywords}& q= {!func}scale(query($keywords),0,25) {!func}scale(geodist(),0,25) {!func}recip(rord(publicationDate),1,25,0) {!func}scale(popularity,0,25)& keywords="fall festival"& sfield=location& pt=33.748,-84.391 25% 25% 25% 25% *Example from chapter 16 of Solr in Action
  • 102. But how do we figure out the right balance of weights?
  • 103. Learning to Rank User Searches User Sees Results User takes an action Users’ actions inform system improvements User Query Re Alonzo ipad do do do Elena printer do do do Ming ipad do do do … … … User Action Document Alonzo click doc22 Elena click doc17 Ming click doc12 Alonzo purchase doc22 Ming click doc22 Ming purchase doc22 Elena click doc2 … … … Feature Weight title_match_all_terms 15.25 exact_phrase_match 10 signal_boost 9.5 content_age 9.2 user_geo_distance 6.5 personalization_cat_1 2.8 doc_popularity 2.75 … … ipad ⌕ Initial Results: 1) doc1 2) doc2 3) doc3 Build Ranking Classifier (from Implicit Relevance Judgements) Final Results: 1) doc3 2) doc1 3) doc2
  • 105. We operationalize AI for the largest businesses on the planet.
  • 107. Trey Grainger trey@lucidworks.com @treygrainger Other presentations: http://www.treygrainger.com 40% Discount code: ctwhay19 http://aiPoweredSearch.com http://solrinaction.com Books: Thank You!