Natural Language Search with Solr
Ted Sullivan
Senior Solutions Architect
lucidworks.com
The take-home word for this talk is:
CONTEXT
What I will talk about …
Why does context matter?
Phrase and contextual ambiguities in search
• Recent advances in Query Autofiltering that attack the context
problem by adding “verb/preposition” disambiguation *
Traditional ways of visualizing context in search - forging search “loops”
• Facets
• Typeahead
* https://lucidworks.com/blog/2015/11/19/query-autofiltering-chapter-4-a-novel-approach-to-natural-language-processing/
Adding metadata context to Suggestions using Facets
Using Pivot Facets to create semantically rich suggestions
Facets to bring user-centric context to suggestions
• Entitlements: Security trimming of suggestions
• User session context: Dynamic On-The-Fly Predictive Analytics!
Why Does Context Matter?
Relevance is contextual - relevant to whom under what circumstances?
Language / User Intent / Social and business factors
Ambiguities in search are often due to a failure or inability to detect context.
So, what can we do about this - or is this talk just some obvious hand-waving
BS that we’ve heard a thousand times? Hopefully, not.
But that said - maybe just a little theory first …
Contextual Relationships
Semantic Context - Language, Lexicon
User Context - Intent, Agendas, Permissions, Demographics, Location
Social Context - Popularity, Common Behaviors => Recommendations
Business Context - Rules, Organization, Domain, Security
Context == Relationships
Within and between metadata “objects”
Search is an exchange of one metadata object - the query - for others -
the results.
Things are related to other Things
Relationships provide context
• Static or known Relationships - defined by a knowledge graph
such as an Ontology
• Discovered Relationships - computed by data mining
Knowledge Graphs - connected-ness
Usage Logs (query logs, other captured events or signals) -
behavioral contexts
Clustering - unsupervised learning algorithms
Natural Language Processing - semantic contexts - noun phrases -
statements
Machine Learning - supervised learning => Feature extraction
Feature Sets - "Apple" in different contexts
• Tim Cook - iPhone Macintosh Computer Tablet Steve Jobs Lisa iTunes
• Times Square - Broadway Wall Street Empire State Building Bronx Zoo
• Granny Smith - Pie Fritters Season Sauce Cider Picking Tree McIntosh
• White Album - Records Beatles George Martin Capitol White Album
Resolving Ambiguities
Phrase or syntactic ambiguities - detecting nouns
Autophrasing - unstructured data
Query Autofiltering - structured data
Contextual or semantic ambiguities (subject-verb-object) - detecting intent
Traditional NLP - POS detection, Machine Learning
Query Autofiltering with verb/preposition disambiguation
Enough abstractions - give me some examples!
Discovery and Focus
Music Ontology: Song, Songwriter, Genre, Performer, Recording, Guitarist, Pianist, Vocalist, Producer, Record Label, Band, Album
Medical Ontology: Disease, Condition, Symptom, Drug, Treatment
Query Autofiltering
“songs Eric Clapton wrote” vs. “songs Eric Clapton performed”
Without verb support we get:
(performer_ss:"Eric Clapton" OR composer_ss:"Eric Clapton") AND composition_type:Song
for either query.
With verb support we now get:
songs Eric Clapton wrote => composer_ss:"Eric Clapton" AND composition_type:Song
songs Eric Clapton performed => performer_ss:"Eric Clapton" AND composition_type:Song
Verb/Preposition context rules
written,wrote,composed => composer_ss
performed,played,sang,recorded => performer_ss
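One way to picture how such rules could be applied is the sketch below. It is not the Query Autofiltering component's own code: the rule text just mirrors the "=>" form used on these slides, and `parse_rules` and `fields_for_query` are hypothetical helper names chosen for illustration.

```python
import re

# Rules written the way the slides show them: "verb,verb,... => field,field,..."
RULES_TEXT = """
written,wrote,composed => composer_ss
performed,played,sang,recorded => performer_ss
"""

def parse_rules(text):
    """Map each verb or preposition phrase to the list of fields it selects."""
    rules = {}
    for line in text.strip().splitlines():
        phrases, fields = line.split("=>")
        field_list = [f.strip() for f in fields.split(",")]
        for phrase in phrases.split(","):
            rules[phrase.strip().lower()] = field_list
    return rules

def fields_for_query(query, rules):
    """Return the fields chosen by the longest rule phrase found in the query."""
    q = query.lower()
    for phrase in sorted(rules, key=len, reverse=True):
        # Match whole words only, so "sang" does not fire inside "sangria".
        if re.search(r"(?<!\w)" + re.escape(phrase) + r"(?!\w)", q):
            return rules[phrase]
    return []  # no context word found: fall back to raw autofiltering

rules = parse_rules(RULES_TEXT)
print(fields_for_query("songs Eric Clapton wrote", rules))
print(fields_for_query("songs Eric Clapton performed", rules))
```

Trying longer phrases first is a design choice made here so that multi-word rules such as "was in the" would win over shorter ones such as "was in".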
Query Autofiltering
“Bands that Eric Clapton was in”
No context rules (raw autofiltering):
((name_s:Band OR musician_type_ss:Band) AND (name_s:"Eric Clapton" OR
original_performer_s:"Eric Clapton" OR composer_ss:"Eric Clapton" OR
performer_ss:"Eric Clapton" OR groupMembers_ss:"Eric Clapton"))
Add context rule
members,member,was in,is in,who's in,who's in the,is in the,was in the =>
memberOfGroup_ss,groupMembers_ss
((name_s:Band OR musician_type_ss:Band) AND groupMembers_ss:"Eric Clapton")
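The narrowing step can be sketched as follows. This is only an illustration of the idea, assuming the autofilter already knows which fields each phrase was found in; `field_clause` and `narrow` are hypothetical helpers, and unlike the slide this sketch quotes single-word values such as "Band" as well.

```python
def field_clause(fields, value):
    """OR one phrase across several fields: (f1:"v" OR f2:"v")."""
    parts = [f'{f}:"{value}"' for f in fields]
    return parts[0] if len(parts) == 1 else "(" + " OR ".join(parts) + ")"

def narrow(candidate_fields, rule_fields):
    """Keep only the candidate fields named by the context rule.
    If the rule matches none of them, keep all candidates (assumed fallback)."""
    kept = [f for f in candidate_fields if f in rule_fields]
    return kept or candidate_fields

# Fields in which each phrase was found (copied from the raw query above).
band_fields = ["name_s", "musician_type_ss"]
clapton_fields = ["name_s", "original_performer_s", "composer_ss",
                  "performer_ss", "groupMembers_ss"]
# Fields selected by the "was in" context rule.
rule_fields = ["memberOfGroup_ss", "groupMembers_ss"]

band = field_clause(band_fields, "Band")
raw = f"({band} AND {field_clause(clapton_fields, 'Eric Clapton')})"
filtered = f"({band} AND {field_clause(narrow(clapton_fields, rule_fields), 'Eric Clapton')})"

print(raw)
print(filtered)
```

The raw query ORs the name across all five fields, while the filtered query keeps only `groupMembers_ss`, the one candidate field that the rule also names.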
Verb/Preposition context rules
Query Autofiltering Verb/Preposition context rules
Who’s in The Who
raw autofiltering
((name_s:"The Who" OR original_performer_s:"The Who" OR
performer_ss:"The Who" OR memberOfGroup_ss:"The Who"))
with context rule
members,member,was in,is in,who's in,who's in the,is in the,was in the =>
memberOfGroup_ss,groupMembers_ss
query is now:
(memberOfGroup_ss:"The Who")
Query Autofiltering
Drugs that treat abdominal pain
treatment_type:Drug AND has_indication:"abdominal pain"
Drugs that cause abdominal pain
treatment_type:Drug AND has_side_effect:"abdominal pain"
vs. (without context rules, for either query)
treatment_type:Drug AND (has_indication:"abdominal pain" OR
has_side_effect:"abdominal pain")
Verb/Preposition context rules
treat,for,indicated => has_indication
cause,produce => has_side_effect
Query Autofiltering
“Beatles Songs covered” vs. “Songs Beatles covered”
covers by other artists of songs written by the Beatles
vs. covers by the Beatles of songs by other songwriters
“Robert Johnson Songs that Eric Clapton covered”
works the same as:
“Eric Clapton covers of Robert Johnson Songs”
“Insomnia Drugs” - are just indicated drugs
Noun-Noun Phrases: Robert Johnson Songs, Beatles Songs, Insomnia Drugs
covered,covers:performer_ss | version_s:Cover |
original_performer_s:_ENTITY_,recording_type_ss:Song=>original_performer_s:_ENTITY_
Facets provide Context
Visualization and the search “conversation”: Discovery and Focus
• Post-query visualization - facet display - aggregated attributes of found things
• Pre-query visualization - query suggestion or typeahead - can use facets too
(stay tuned).
• The Good, The Bad and The Ugly aspects of Facets
New and Improved: Statistics, Analytics and APIs - Oh My!
• Dashboards and Dynamic Business Intelligence
• Heatmap Faceting
• Pivot Facets and Ad-Hoc Object Hierarchies - now with stats!
• JSON Facet API
How can we use facets to improve typeahead?
Put more precision and more context into a suggester.
=> Using metadata - guide the user to more precise queries
that we can be really GOOD at!
To do this, we can build a specialized suggester collection - then
we can use facet contexts to build semantic and behavioral
relationships within and between searches.
And now for something completely different! *
* Shameless Monty Python’s Flying Circus reference
Suggester Buildware
Query Collectors or Fetchers
Gather sets of query suggestions - Interface with multiple
implementations possible
Suggester Builder
• Validates suggestions
• Adds context to suggestions using faceting
• Submits suggestion and metadata to Solr Index
Query Logs
Terms Component
Curated Lists
Pivot Facet Collector
Databases - SQL or Not
Pivot Facet Query Collector
Uses “Field Pattern Templates” to generate semantically rich suggestions
Structured data - metadata fields contain object attributes
Can combine these attributes into phrases - semantically (or not)
Machine doesn’t know semantics.
Example
first_name last_name occupation city state
Bob Jones Accountant Cincinnati Ohio
makes sense
Ohio Accountant Jones Cincinnati Bob
doesn’t
Pivot Facet Query Collector
If we create Pivot Template Patterns like this:
${musician_type} ${recording_type}s
${genre} ${musician_type}s
${performer} ${recording_type}s
${original_performer} ${recording_type}s covered by ${performer} (plus context)
${name}
We get suggestions like this:
Rolling Stones Albums
New Wave Songs
Classical Pianists
Beatles Songs covered by Joe Cocker
Stuck Inside of Mobile With The Memphis Blues Again
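A rough sketch of how a field pattern template might be filled from pivot facet results is shown below. It assumes a response shaped like Solr's `facet_pivot` output for `facet.pivot=genre_ss,musician_type_ss` (nested nodes with `field`, `value`, `count` and `pivot` keys). The `expand` function, the `FIELD_TO_VAR` mapping, the Jazz row and all counts are made up for illustration and are not the collector's actual implementation.

```python
from string import Template

# Hypothetical mapping from Solr field names to template placeholder names.
FIELD_TO_VAR = {"genre_ss": "genre", "musician_type_ss": "musician_type"}

def expand(template, nodes, field_to_var, bound=None):
    """Walk nested pivot nodes, binding each level's field to its value,
    and fill the template once the deepest level has been reached."""
    bound = bound or {}
    out = []
    for node in nodes:
        values = {**bound, field_to_var[node["field"]]: node["value"]}
        children = node.get("pivot")
        if children:
            out.extend(expand(template, children, field_to_var, values))
        else:
            out.append(Template(template).substitute(values))
    return out

# Sample data in the shape of a facet_pivot response (values and counts invented).
pivot = [
    {"field": "genre_ss", "value": "Classical", "count": 2, "pivot": [
        {"field": "musician_type_ss", "value": "Pianist", "count": 2},
    ]},
    {"field": "genre_ss", "value": "Jazz", "count": 1, "pivot": [
        {"field": "musician_type_ss", "value": "Guitarist", "count": 1},
    ]},
]

suggestions = expand("${genre} ${musician_type}s", pivot, FIELD_TO_VAR)
print(suggestions)
```

Because pivot facets only return value combinations that actually co-occur in documents, every generated phrase corresponds to at least one record in the content collection.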
Suggester Builder - validate and contextualize
• Validate - make sure that the query works
• Contextualize - use facets to acquire “aboutness” stuff
Tests the query against the content collection
“Stuck Inside of Mobile With The Memphis Blues Again”
composition_type_ss: [
"Song"
]
composer_ss: [
"Bob Dylan"
]
genre_ss: [
"Blues Rock"
"Folk Rock"
]
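The validate and contextualize steps could look roughly like this. The sketch assumes a Solr JSON response with `response.numFound` and `facet_counts.facet_fields` in the default flat `[value, count, ...]` layout. `build_suggestion_doc` and the `suggestion_s` field are hypothetical names, and the counts in the sample response are invented.

```python
def build_suggestion_doc(suggestion, solr_response):
    """Return a suggester document, or None when the query finds nothing."""
    if solr_response["response"]["numFound"] == 0:
        return None  # validation failed: never suggest a dead-end query
    doc = {"suggestion_s": suggestion}  # hypothetical field name
    facet_fields = solr_response["facet_counts"]["facet_fields"]
    for field, flat in facet_fields.items():
        # Default Solr JSON lists facets as [value, count, value, count, ...].
        values = [v for v, c in zip(flat[0::2], flat[1::2]) if c > 0]
        if values:
            doc[field] = values  # the "aboutness" context for this suggestion
    return doc

# Sample response for the suggestion shown above (counts are made up).
sample = {
    "response": {"numFound": 1},
    "facet_counts": {"facet_fields": {
        "composition_type_ss": ["Song", 1],
        "composer_ss": ["Bob Dylan", 1],
        "genre_ss": ["Blues Rock", 1, "Folk Rock", 1, "Jazz", 0],
    }},
}

doc = build_suggestion_doc(
    "Stuck Inside of Mobile With The Memphis Blues Again", sample)
print(doc)
```

The resulting document carries the suggestion text together with its facet values, ready to be submitted to the suggester collection.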
Use Cases - User Context sensitive typeahead
User Permissions: Security Trimming of Suggestions
Faceting on ACL lists of content collection - copy set of ACL values for
suggestion result set to suggester collection
=> Don’t suggest queries that return 0 results for a given user
User Behavior: Dynamic On-The-Fly Predictive Analytics
Cache context facets returned by Suggester - use as boost queries for
subsequent queries in a user session
=> System learns “what” user is looking for
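Both use cases come down to adding parameters to Solr requests, as the sketch below tries to show. `fq` and the `bq` parameter of the (e)dismax query parser are standard Solr parameters. The field names `acl_ss` and `suggestion_t`, the boost weight, and the helper functions are assumptions made for this example only.

```python
def suggest_params(prefix, user_acls):
    """Typeahead request restricted to suggestions the user is entitled to see."""
    acl_clause = " OR ".join(f'"{a}"' for a in user_acls)
    return {
        "q": f"suggestion_t:{prefix}*",    # hypothetical suggestion text field
        "fq": f"acl_ss:({acl_clause})",    # security trimming via copied ACL values
    }

def session_boosts(cached_facets, weight=2.0):
    """Turn facet values cached from earlier suggestions into boost queries."""
    return [f'{field}:"{value}"^{weight}'
            for field, values in cached_facets.items()
            for value in values]

def search_params(user_query, cached_facets):
    """Follow-up search in the same session, boosted by the cached context."""
    return {"defType": "edismax", "q": user_query,
            "bq": session_boosts(cached_facets)}

# Context facets cached after the user picked a Bob Dylan suggestion.
cached = {"composer_ss": ["Bob Dylan"], "genre_ss": ["Folk Rock"]}

print(suggest_params("stuck", ["sales", "all staff"]))
print(search_params("blues again", cached))
```

Boost queries only influence ranking, so later searches in the session lean toward the cached context without excluding anything, whereas the ACL filter query is a hard restriction.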
Data Quality - Text - Metadata
Data design and curation - solve garbage in - garbage out at the
source.
More fields with more precise values - combine for
expressiveness
The Ole Structured vs Unstructured bugaboo
Use Machine Learning / Knowledge Base Classification to add
metadata
Model Building
[Figure labels: “MODEL”, Machine Learning, Subject Matter Experts, Training Set - “Seed Crystal”, QUERY, DOCUMENTS, Yes, No, Feature Sets]
Model: Mapping of Text => Feature Sets
Detecting and Consuming Context
[Figure labels: (more) Structured Document Collection, Query Autofiltering, Query, Solr / Lucene, Result Set]
Query Autofiltering can be used as a
“normalization” layer for classification
[Figure labels: Document Classification Stages (Manual, ML, Ontology, Hybrid), Metadata Enrichment, (more) Structured Document Collection - The Model!]
=> Can Think of the Solr/Lucene Index itself as the “Model”
Thank you!
lucidworks.com
Ted Sullivan
Senior Solutions Architect

Webinar: Natural Language Search with Solr

  • 1.
  • 2. Ted Sullivan Natural Language Search with Solr lucidworks.com Senior Solutions Architect
  • 3. The take-home word for this talk is: CONTEXT
  • 4. What I will talk about … Why does context matter? Phrase and contextual ambiguities in search • Recent advances in Query Autofiltering that attack the context problem by adding “verb/preposition” disambiguation * Traditional ways of visualizing context in search - forging search “loops” • Facets • Typeahead https://lucidworks.com/blog/2015/11/19/query-autofiltering-chapter-4-a-novel-approach-to-natural-language-processing/*
  • 5. Adding metadata context to Suggestions using Facets Using Pivot Facets to create semantically rich suggestions Facets to bring user-centric context to suggestions • Entitlements: Security trimming of suggestions • User session context: Dynamic On-The-Fly Predictive Analytics! What I will talk about …
  • 6. Why Does Context Matter? Relevance is contextual - relevant to whom under what circumstances? Language / User Intent / Social and business factors Ambiguities in search are often due to a failure/inability to detect context. So, what can we do about this - or is this talk just some obvious hand-waving BS that we’ve heard a thousand times? Hopefully, not. But that said - maybe just a little theory first …
  • 7. Contextual Relationships Semantic Context - Language, Lexicon User Context - Intent, Agendas, Permissions, Demographics, Location Social Context - Popularity, Common Behaviors => Recommendations Business Context - Rules, Organization, Domain, Security Context == Relationships Within and between metadata “objects” Search is an exchange of one metadata object - the query - for others - the results.
  • 8. Things are related to other Things Relationships provide context • Static or known Relationships - defined by a knowledge graph such as an Ontology • Discovered Relationships - computed by data mining Knowledge Graphs - connected-ness Usage Logs (query logs, other captured events or signals) - behavioral contexts Clustering - unsupervised learning algorithms Natural Language Processing - semantic contexts - noun phrases - statements Machine Learning - supervised learning => Feature extraction
  • 9. Apple Tim Cook Times Square Granny Smith White Album iPhone Macintosh Computer Tablet Steve Jobs Lisa iTunes Broadway Wall Street Empire State Building Bronx Zoo Pie Fritters Season Sauce Cider Picking Tree McIntosh Records Beatles George Martin Capitol White Album Feature Sets
  • 10. Resolving Ambiguities Phrase or syntactic ambiguities - detecting nouns Autophrasing - unstructured data Query Autofiltering - structured data Contextual or semantic ambiguities (subject-verb-object) - detecting intent Traditional NLP - POS detection, Machine Learning Query Autofiltering with verb/preposition disambiguation
  • 12. Discovery and Focus Enough abstractions - give me some examples! Medical Ontology Disease Condition Symptom Drug Treatment
  • 13. Query Autofiltering “songs Eric Clapton wrote” vs. “songs Eric Clapton performed” Without Verb support we get: (performer_ss:"Eric Clapton" OR composer_ss:"Eric Clapton") AND composition_type:Song For either. With Verb support Now we get: songs Eric Clapton wrote => composer_ss:"Eric Clapton" AND composition_type:Song songs Eric Clapton performed => performer_ss:"Eric Clapton" AND composition_type:Song Verb/Preposition context rules written,wrote,composed => composer_ss performed,played,sang,recorded => performer_ss
  • 14. Query Autofiltering “Bands that Eric Clapton was in” No context rules (raw autofiltering): ((name_s:Band OR musician_type_ss:Band) AND (name_s:"Eric Clapton" OR original_performer_s:"Eric Clapton" OR composer_ss:"Eric Clapton" OR performer_ss:"Eric Clapton" OR groupMembers_ss:"Eric Clapton")) Add context rule members,member,was in,is in,who's in,who's in the,is in the,was in the => memberOfGroup_ss,groupMembers_ss ((name_s:Band OR musician_type_ss:Band) AND groupMembers_ss:"Eric Clapton") Verb/Preposition context rules
  • 15. Query Autofiltering Verb/Preposition context rules Who’s in The Who raw autofiltering ((name_s:"The Who" OR original_performer_s:"The Who" OR performer_ss:"The Who" OR memberOfGroup_ss:"The Who"))
  • 16. Query Autofiltering Verb/Preposition context rules Who’s in The Who raw autofiltering ((name_s:"The Who" OR original_performer_s:"The Who" OR performer_ss:"The Who" OR memberOfGroup_ss:"The Who")) with context rule members,member,was in,is in,who's in,who's in the,is in the,was in the => memberOfGroup_ss,groupMembers_ss query is now: (memberOfGroup_ss:"The Who")
  • 17. Query Autofiltering Drugs that treat abdominal pain treatment_type:Drug AND has_indication:"abdominal pain" Drugs that cause abdominal pain treatment_type:Drug AND has_side_effect:"abdominal pain" vs. treatment_type:Drug AND (has_indication:"abdominal pain" OR has_side_effect:"abdominal pain") Verb/Preposition context rules treat,for,indicated => has_indication cause,produce => has_side_effect
  • 18. Query Autofiltering Beatles Songs covered vs Songs Beatles covered covers by other artists of songs written by the Beatles vs covers by Beatles of songs by other songwriters Robert Johnson Songs that Eric Clapton covered works the same as: Eric Clapton covers of Robert Johnson Songs Insomnia Drugs - are just indicated drugs Noun-Noun Phrases Robert Johnson Songs Beatles Songs Robert Johnson Songs Insomnia Drugs covered,covers:performer_ss | version_s:Cover | original_performer_s:_ENTITY_,recording_type_ss:Song=>original_performer_s:_ENTITY_
  • 19. Facets provide Context Visualization and the search “conversation”: Discovery and Focus • Post-query visualization - facet display - aggregated attributes of found things • Pre-query visualization - query suggestion or typeahead - can use facets too (stay tuned). • The Good, The Bad and The Ugly aspects of Facets New and Improved: Statistics, Analytics and APIs - Oh My! • Dashboards and Dynamic Business Intelligence • Heatmap Faceting • Pivot Facets and Ad-Hoc Object Hierarchies - now with stats! • JSON Facet API
  • 20. How can we use facets to improve typeahead? Put more precision and more context into a suggester. => Using metadata - guide the user to more precise queries that we can be really GOOD at! To do this, we can build a specialized suggester collection - then we can use facet contexts to build semantic and behavioral relationships within and between searches. * Shameless Monty Python’s Flying Circus reference And now for something completely different! *
  • 21. Suggester Buildware Query Collectors or Fetchers Gather sets of query suggestions - Interface with multiple implementations possible Suggester Builder • Validates suggestions • Adds context to suggestions using faceting • Submits suggestion and metadata to Solr Index Query Logs Terms Component Curated Lists Pivot Facet Collector Databases - SQL or Not
  • 22. Pivot Facet Query Collector Uses “Field Pattern Templates” to generate semantically rich suggestions Structured data - metadata fields contain object attributes Can combine these attributes into phrases - semantically (or not) Machine doesn’t know semantics. Example Bob Jobs Accountant Cincinnati Ohio makes sense Ohio Accountant Jones Cincinnati Bob doesn’t first_name last_name occupation city state
  • 23. Pivot Facet Query Collector ${musician_type} ${recording_type}s ${genre} ${musician_type}s ${performer} ${recording_type}s Rolling Stones Albums New Wave Songs Classical Pianists If we create Pivot Template Patterns like this: ${original_performer} ${recording_type}s covered by ${performer} (plus context) Beatles Songs covered by Joe Cocker We get suggestions like this: ${name} Stuck Inside of Mobile With The Memphis Blues Again
  • 24. Suggester Builder - validate and contextualize • Validate - make sure that the query works • Contextualize - use facets to acquire “aboutness” stuff Tests the query against the content collection “Stuck Inside of Mobile With The Memphis Blues Again” composition_type_ss: [ "Song" ] composer_ss: [ "Bob Dylan" ] genre_ss: [ "Blues Rock" "Folk Rock" ]
  • 25. Use Cases - User Context sensitive typeahead User Permissions: Security Trimming of Suggestions Faceting on ACL lists of content collection - copy set of ACL values for suggestion result set to suggester collection => Don’t suggest queries that return 0 results for a given user User Behavior: Dynamic On-The-Fly Predictive Analytics Cache context facets returned by Suggester - use as boost queries for subsequent queries in a user session => System learns “what” user is looking for
  • 26. Data Quality - Text - Metadata Data design and curation - solve garbage in - garbage out at the source. More fields with more precise values - combine for expressiveness The Ole Structured vs Unstructured bugga-boo Use Machine Learning / Knowledge Base Classification to add metadata
  • 28. Query Autofiltering can be used as a “normalization” layer for classification. Diagram labels: Document Classification Stages (Manual, ML, Ontology, Hybrid); Metadata Enrichment; (more) Structured Document Collection – The Model!; Query; Query Autofiltering; Solr / Lucene; Result Set. => Can think of the Solr/Lucene index itself as the “Model”
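
The verb/preposition context rules on slides 13 to 17 can be illustrated with a minimal Python sketch. It is not the Query Autofiltering implementation: the rule table layout and the function name build_filter are hypothetical, and it only handles single-word verbs (multi-word triggers such as "was in" would need phrase matching). It assumes the entity and its candidate fields have already been detected.

```python
# Hypothetical rule table: trigger words mapped to the fields they select.
# The words and field names come from slide 13; the layout is an assumption.
CONTEXT_RULES = {
    ("written", "wrote", "composed"): ["composer_ss"],
    ("performed", "played", "sang", "recorded"): ["performer_ss"],
}


def build_filter(entity, candidate_fields, query, rules=CONTEXT_RULES):
    """Return a Solr-style clause for entity, narrowed by any verb in query."""
    words = query.lower().split()
    for verbs, fields in rules.items():
        if any(verb in words for verb in verbs):
            # Keep only the candidate fields that the matched rule allows.
            allowed = [f for f in candidate_fields if f in fields]
            if allowed:
                candidate_fields = allowed
            break
    clauses = [f'{field}:"{entity}"' for field in candidate_fields]
    if len(clauses) == 1:
        return clauses[0]
    return "(" + " OR ".join(clauses) + ")"


fields = ["performer_ss", "composer_ss"]
print(build_filter("Eric Clapton", fields, "songs Eric Clapton wrote"))
print(build_filter("Eric Clapton", fields, "songs Eric Clapton"))
```

With a matching verb the clause collapses to a single field; with no verb the sketch falls back to the OR over every candidate field, which is the raw autofiltering behavior the slides describe.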
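
The "Field Pattern Templates" on slides 22 and 23 can be sketched in the same spirit. The function expand_template and the dict-shaped record are assumptions made for illustration; in the talk the field values come from Solr pivot facets, which this sketch does not query.

```python
import re

# Matches ${field_name} placeholders as written on slide 23.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def expand_template(template, record):
    """Fill ${field} placeholders from one record.

    Returns None when the record lacks a field the template needs,
    so incomplete combinations never become suggestions.
    """
    needed = PLACEHOLDER.findall(template)
    if any(name not in record for name in needed):
        return None
    return PLACEHOLDER.sub(lambda m: record[m.group(1)], template)


# Hypothetical record standing in for one pivot facet path.
record = {"performer": "Rolling Stones", "recording_type": "Album"}
print(expand_template("${performer} ${recording_type}s", record))
print(expand_template("${genre} ${musician_type}s", record))
```

The template fixes the word order, which is how a machine that "doesn't know semantics" still produces phrases that read sensibly.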