4. What I will talk about …
Why does context matter?
Phrase and contextual ambiguities in search
• Recent advances in Query Autofiltering that attack the context
problem by adding “verb/preposition” disambiguation *
Traditional ways of visualizing context in search - forging search “loops”
• Facets
• Typeahead
* https://lucidworks.com/blog/2015/11/19/query-autofiltering-chapter-4-a-novel-approach-to-natural-language-processing/
5. Adding metadata context to Suggestions using Facets
Using Pivot Facets to create semantically rich suggestions
Facets to bring user-centric context to suggestions
• Entitlements: Security trimming of suggestions
• User session context: Dynamic On-The-Fly Predictive Analytics!
6. Why Does Context Matter?
Relevance is contextual - relevant to whom under what circumstances?
Language / User Intent / Social and business factors
Ambiguities in search are often due to a failure or inability to detect context.
So, what can we do about this - or is this talk just some obvious hand-waving
BS that we’ve heard a thousand times? Hopefully, not.
But that said - maybe just a little theory first …
7. Contextual Relationships
Semantic Context - Language, Lexicon
User Context - Intent, Agendas, Permissions, Demographics, Location
Social Context - Popularity, Common Behaviors => Recommendations
Business Context - Rules, Organization, Domain, Security
Context == Relationships
Within and between metadata “objects”
Search is an exchange of one metadata object - the query - for others -
the results.
8. Things are related to other Things
Relationships provide context
• Static or known Relationships - defined by a knowledge graph
such as an Ontology
• Discovered Relationships - computed by data mining
Knowledge Graphs - connected-ness
Usage Logs (query logs, other captured events or signals) -
behavioral contexts
Clustering - unsupervised learning algorithms
Natural Language Processing - semantic contexts - noun phrases -
statements
Machine Learning - supervised learning => Feature extraction
9. Apple
Feature Sets
Tim Cook: iPhone, Macintosh, Computer, Tablet, Steve Jobs, Lisa, iTunes
Times Square: Broadway, Wall Street, Empire State Building, Bronx Zoo
Granny Smith: Pie, Fritters, Season, Sauce, Cider, Picking, Tree, McIntosh
White Album: Records, Beatles, George Martin, Capitol, White Album
10. Resolving Ambiguities
Phrase or syntactic ambiguities - detecting nouns
Autophrasing - unstructured data
Query Autofiltering - structured data
Contextual or semantic ambiguities (subject-verb-object) - detecting intent
Traditional NLP - POS detection, Machine Learning
Query Autofiltering with verb/preposition disambiguation
12. Discovery and Focus
Enough abstractions - give me some examples!
Medical Ontology: Disease, Condition, Symptom, Drug, Treatment
13. Query Autofiltering
“songs Eric Clapton wrote” vs. “songs Eric Clapton performed”
Without verb support, we get:
(performer_ss:"Eric Clapton" OR composer_ss:"Eric Clapton") AND composition_type:Song
for either.
With verb support, we now get:
songs Eric Clapton wrote => composer_ss:"Eric Clapton" AND composition_type:Song
songs Eric Clapton performed => performer_ss:"Eric Clapton" AND composition_type:Song
Verb/Preposition context rules
written,wrote,composed => composer_ss
performed,played,sang,recorded => performer_ss
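The verb rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual Query Autofiltering code: `ENTITY_FIELDS`, `VERB_RULES` and `autofilter` are hypothetical names, the entity lookup is a hard-coded dictionary rather than an index lookup, and phrase detection is a naive substring test.

```python
# Hypothetical sketch of verb/preposition disambiguation.
# Names and data structures are illustrative assumptions.

# Known phrases -> (indexed value, fields in which that value occurs).
# A real system would discover this from the index, not hard-code it.
ENTITY_FIELDS = {
    "eric clapton": ("Eric Clapton", ["performer_ss", "composer_ss"]),
    "songs": ("Song", ["composition_type"]),
}

# Context rules from the slide: trigger word -> field it selects.
VERB_RULES = {
    "written": "composer_ss", "wrote": "composer_ss", "composed": "composer_ss",
    "performed": "performer_ss", "played": "performer_ss",
    "sang": "performer_ss", "recorded": "performer_ss",
}


def autofilter(query: str) -> str:
    text = query.lower()
    # Fields selected by any trigger word present in the query.
    preferred = {VERB_RULES[w] for w in text.split() if w in VERB_RULES}
    clauses = []
    for phrase, (value, fields) in ENTITY_FIELDS.items():
        if phrase not in text:  # naive substring match, for illustration only
            continue
        # Narrow to the verb-selected fields; fall back to all fields if
        # no rule applies to this entity (the "raw" autofiltering case).
        narrowed = [f for f in fields if f in preferred] or fields
        parts = [f'{f}:"{value}"' if " " in value else f"{f}:{value}"
                 for f in narrowed]
        clauses.append(parts[0] if len(parts) == 1
                       else "(" + " OR ".join(parts) + ")")
    return " AND ".join(clauses)


print(autofilter("songs Eric Clapton wrote"))
print(autofilter("songs Eric Clapton"))
```

With a verb present, the ambiguous entity is pinned to one field; without one, the query falls back to an OR across every field the entity appears in.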
14. Query Autofiltering
“Bands that Eric Clapton was in”
No context rules (raw autofiltering):
((name_s:Band OR musician_type_ss:Band) AND (name_s:"Eric Clapton" OR
original_performer_s:"Eric Clapton" OR composer_ss:"Eric Clapton" OR
performer_ss:"Eric Clapton" OR groupMembers_ss:"Eric Clapton"))
Add context rule
members,member,was in,is in,who's in,who's in the,is in the,was in the =>
memberOfGroup_ss,groupMembers_ss
((name_s:Band OR musician_type_ss:Band) AND groupMembers_ss:"Eric Clapton")
Verb/Preposition context rules
15. Query Autofiltering Verb/Preposition context rules
Who’s in The Who
raw autofiltering
((name_s:"The Who" OR original_performer_s:"The Who" OR
performer_ss:"The Who" OR memberOfGroup_ss:"The Who"))
with context rule
members,member,was in,is in,who's in,who's in the,is in the,was in the =>
memberOfGroup_ss,groupMembers_ss
query is now:
(memberOfGroup_ss:"The Who")
17. Query Autofiltering
Drugs that treat abdominal pain
treatment_type:Drug AND has_indication:"abdominal pain"
Drugs that cause abdominal pain
treatment_type:Drug AND has_side_effect:"abdominal pain"
vs.
treatment_type:Drug AND (has_indication:"abdominal pain" OR
has_side_effect:"abdominal pain")
Verb/Preposition context rules
treat,for,indicated => has_indication
cause,produce => has_side_effect
18. Query Autofiltering
Beatles Songs covered vs. Songs Beatles covered
covers by other artists of songs written by the Beatles
vs. covers by the Beatles of songs by other songwriters
Robert Johnson Songs that Eric Clapton covered
works the same as:
Eric Clapton covers of Robert Johnson Songs
Noun-Noun Phrases
Robert Johnson Songs
Beatles Songs
Insomnia Drugs - are just indicated drugs
covered,covers:performer_ss | version_s:Cover |
original_performer_s:_ENTITY_,recording_type_ss:Song=>original_performer_s:_ENTITY_
19. Facets provide Context
Visualization and the search “conversation”: Discovery and Focus
• Post-query visualization - facet display - aggregated attributes of found things
• Pre-query visualization - query suggestion or typeahead - can use facets too
(stay tuned).
• The Good, The Bad and The Ugly aspects of Facets
New and Improved: Statistics, Analytics and APIs - Oh My!
• Dashboards and Dynamic Business Intelligence
• Heatmap Faceting
• Pivot Facets and Ad-Hoc Object Hierarchies - now with stats!
• JSON Facet API
20. How can we use facets to improve typeahead?
Put more precision and more context into a suggester.
=> Using metadata - guide the user to more precise queries
that we can be really GOOD at!
To do this, we can build a specialized suggester collection - then
we can use facet contexts to build semantic and behavioral
relationships within and between searches.
And now for something completely different! *
* Shameless Monty Python’s Flying Circus reference
21. Suggester Buildware
Query Collectors or Fetchers
Gather sets of query suggestions - Interface with multiple
implementations possible
Suggester Builder
• Validates suggestions
• Adds context to suggestions using faceting
• Submits suggestion and metadata to Solr Index
Query Logs
Terms Component
Curated Lists
Pivot Facet Collector
Databases - SQL or Not
22. Pivot Facet Query Collector
Uses “Field Pattern Templates” to generate semantically rich suggestions
Structured data - metadata fields contain object attributes
Can combine these attributes into phrases - semantically (or not)
Machine doesn’t know semantics.
Example
first_name last_name occupation city state
Bob Jones Accountant Cincinnati Ohio
makes sense
Ohio Accountant Jones Cincinnati Bob
doesn’t
23. Pivot Facet Query Collector
If we create Pivot Template Patterns like this:
${musician_type} ${recording_type}s
${genre} ${musician_type}s
${performer} ${recording_type}s
${original_performer} ${recording_type}s covered by ${performer} (plus context)
${name}
We get suggestions like this:
Rolling Stones Albums
New Wave Songs
Classical Pianists
Beatles Songs covered by Joe Cocker
Stuck Inside of Mobile With The Memphis Blues Again
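A minimal sketch of how such a template might be expanded, assuming single-valued metadata records held in memory. In the real collector the co-occurring value combinations would come from Solr pivot facets (`facet.pivot`) rather than from a Python loop; `expand`, `PLACEHOLDER` and the sample records are hypothetical.

```python
import re

# Matches ${field_name} placeholders in a template pattern.
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def expand(template, docs):
    """Fill a template from each record that has every field it names.

    Only value combinations that co-occur in one record are emitted,
    which is the property pivot facets provide: no nonsense pairings.
    """
    fields = PLACEHOLDER.findall(template)
    suggestions = set()
    for doc in docs:
        if all(f in doc for f in fields):
            suggestions.add(
                PLACEHOLDER.sub(lambda m: doc[m.group(1)], template))
    return sorted(suggestions)


# Illustrative sample records (made up for this sketch).
docs = [
    {"performer": "Rolling Stones", "recording_type": "Album"},
    {"performer": "Rolling Stones", "recording_type": "Song"},
    {"performer": "Joe Cocker", "original_performer": "Beatles",
     "recording_type": "Song"},
]

print(expand("${performer} ${recording_type}s", docs))
print(expand("${original_performer} ${recording_type}s covered by ${performer}",
             docs))
```

The naive trailing "s" pluralization comes straight from the template; the sketch does nothing smarter than the pattern itself says.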
24. Suggester Builder - validate and contextualize
• Validate - make sure that the query works
• Contextualize - use facets to acquire “aboutness” stuff
Tests the query against the content collection
“Stuck Inside of Mobile With The Memphis Blues Again”
composition_type_ss: [
"Song"
]
composer_ss: [
"Bob Dylan"
]
genre_ss: [
"Blues Rock"
"Folk Rock"
]
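The validate and contextualize steps could look roughly like the sketch below. It assumes the suggestion has already been run against the content collection with faceting enabled and that the response is in Solr's default JSON layout, where each facet field is a flat list of alternating values and counts. `build_suggestion_doc` and the `suggestion` field name are hypothetical.

```python
def build_suggestion_doc(suggestion, solr_response, context_fields):
    """Return a suggester document, or None if the query finds nothing."""
    # Validate: a suggestion that returns no results is not worth showing.
    if solr_response["response"]["numFound"] == 0:
        return None
    facet_fields = solr_response["facet_counts"]["facet_fields"]
    doc = {"suggestion": suggestion}
    # Contextualize: copy the facet values ("aboutness") onto the suggestion.
    for field in context_fields:
        flat = facet_fields.get(field, [])
        # Flat layout: [value, count, value, count, ...]
        values = [v for v, c in zip(flat[0::2], flat[1::2]) if c > 0]
        if values:
            doc[field] = values
    return doc


# Hand-written response fragment mirroring the facet values on the slide.
response = {
    "response": {"numFound": 1, "docs": []},
    "facet_counts": {"facet_fields": {
        "composition_type_ss": ["Song", 1],
        "composer_ss": ["Bob Dylan", 1],
        "genre_ss": ["Blues Rock", 1, "Folk Rock", 1],
    }},
}
doc = build_suggestion_doc(
    "Stuck Inside of Mobile With The Memphis Blues Again",
    response,
    ["composition_type_ss", "composer_ss", "genre_ss"],
)
print(doc)
```

The resulting document is what would be submitted to the suggester collection.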
25. Use Cases - User Context sensitive typeahead
User Permissions: Security Trimming of Suggestions
Faceting on ACL lists of content collection - copy set of ACL values for
suggestion result set to suggester collection
=> Don’t suggest queries that return 0 results for a given user
User Behavior: Dynamic On-The-Fly Predictive Analytics
Cache context facets returned by Suggester - use as boost queries for
subsequent queries in a user session
=> System learns “what” user is looking for
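Both use cases can be sketched as small functions over suggester documents. This is an assumption-heavy illustration: the `acl_ss` field name, the function names and the boost factor are hypothetical, and the boost clauses are simply strings of the kind one might pass as `bq` parameters to Solr's dismax or edismax query parser.

```python
def trim_for_user(suggestions, user_groups):
    """Security trimming: keep suggestions the user can see results for.

    Assumes each suggestion carries the ACL values copied from its
    result set in a (hypothetical) multi-valued field named acl_ss.
    """
    allowed = set(user_groups)
    return [s for s in suggestions if allowed & set(s.get("acl_ss", []))]


def session_boosts(selected, context_fields, boost=2.0):
    """Turn cached context facets into boost-query clauses for the session.

    Values are not escaped here; real code would need to escape quotes.
    """
    clauses = []
    for s in selected:
        for field in context_fields:
            for value in s.get(field, []):
                clause = f'{field}:"{value}"^{boost}'
                if clause not in clauses:  # de-duplicate, keep order
                    clauses.append(clause)
    return clauses


# Illustrative suggester documents (made up for this sketch).
suggestions = [
    {"suggestion": "Bob Dylan Songs", "acl_ss": ["public"],
     "genre_ss": ["Folk Rock"]},
    {"suggestion": "Unreleased Demos", "acl_ss": ["staff"],
     "genre_ss": ["Blues Rock"]},
]
visible = trim_for_user(suggestions, ["public"])
print([s["suggestion"] for s in visible])
print(session_boosts(visible, ["genre_ss"]))
```

Each clause would be added to subsequent queries in the same session, so results sharing context with earlier selections rank higher.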
26. Data Quality - Text - Metadata
Data design and curation - solve garbage in - garbage out at the
source.
More fields with more precise values - combine for
expressiveness
The Ole Structured vs. Unstructured bugaboo
Use Machine Learning / Knowledge Base Classification to add
metadata
28. (more) Structured Document Collection
Query Autofiltering can be used as a “normalization” layer for classification
[Diagram, query side: Query, Query Autofiltering, Solr / Lucene, Result Set]
[Diagram, indexing side: Document Classification Stages (Manual, ML, Ontology, Hybrid), Metadata Enrichment, (more) Structured Document Collection - The Model!]
=> Can think of the Solr/Lucene index itself as the “Model”