STAY CONNECTED
Twitter @activate_conf
Facebook @activateconf
#Activate19
Elevation query
Solr plugin
Robert Kirchgessner
Search Technology Architect
Wolters Kluwer
Experience
• Search Algorithms Development
• Content Analysis
• Entity Recognition
• Solr plugins / extensions
• About 14 years of software development experience in various commercial projects
• Last 4 years focused on search, particularly Apache Solr and cloud-based Solr deployments, including availability and scalability
• Customers: Wolters Kluwer, TRAFIGURA, Daadkracht...
Nazar Seniuk
Lead Software Engineer
EPAM
Agenda
• Motivation
• Implementation Idea
• Implementation status
• Case study autosuggest
• Summary
Some background
• Developing search applications for the legal market since 2003
• Heterogeneous, structured content with rich metadata (laws, cases, commentaries)
• Use of metadata for ranking is essential for good results
• Up to 30% of queries contain legal / other entities
• Relying on query cooking using entity recognition in the user input
• Combining with full text search and tuning the results becomes a challenge
Example
User input: § 123 BGB
Transformed to queries Q1, Q2, Q3, Q4
Expected output:
• § 123 BGB (law document)
• Legal commentary A to § 123 BGB (promoted content)
• Legal commentary B to § 123 BGB (promoted content)
• Some latest cases based on § 123 BGB (relevant content)
• Full text (or whatever needed)
How to achieve?
Requirements
Content structure
• Handle entities in the user input properly: legal citations, locations, dates, names
– e.g. place the correct document cited in the query on the top
– given a book title place an entry document (table of contents) on the top
• Top (1-5) hits expected to be unambiguous
• Use the top slots efficiently (10-100 hits)
• Keep a balance between numerous document types (legal cases) and relevant or promoted document types
Generally more precise control of what is going on in the top 10
Possible solutions
• Boost factors on queries, terms, documents
• Sort fields
• Ranking functions
• Function queries
• Reranking (in Solr or application)
• Filtering
• Multiple requests
Works, but…
• Some are too complex
• Some are too slow
• Others are not reliable
• Missing a concept of subquery:
– tracking which subquery a document comes from
• Missing LIMIT as in SQL
Example continued
User input: § 123 BGB
Transformed to queries Q1, Q2, Q3, Q4
Expected output:
• § 123 BGB (law document)
• Legal commentary A to § 123 BGB (promoted content)
• Legal commentary B to § 123 BGB (promoted content)
• Some latest cases based on § 123 BGB (relevant content)
• Full text (or whatever needed)
We want the request to look like: Q1 << Q2 << Q3 << Q4
Elevation query
Initial Idea / Specification
Given a list of queries Q1, Q2, …, QN produce a result fulfilling the conditions:
• All the documents of Qn are placed before the documents of Qm for m>n
• Each hit should occur in the leftmost possible subset
• No duplication of hits
• Meaningful scores
• Correct faceting
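The ordering and de-duplication conditions above can be illustrated with a small reference model. This is a minimal Python sketch, not the plugin's code: it assumes each subquery has already been evaluated separately into a best-first list of (doc_id, score) pairs, which is exactly the multi-request approach the plugin is meant to replace. The function name `elevate` and the document ids are hypothetical.

```python
def elevate(subquery_results):
    """Reference model of Q1 << Q2 << ... << QN.

    subquery_results: one list per subquery, each a best-first list of
    (doc_id, score) pairs. Every document is reported once, in the
    leftmost subquery that matches it.
    """
    seen = set()
    merged = []
    for index, hits in enumerate(subquery_results):
        for doc_id, score in hits:
            if doc_id in seen:
                continue  # already placed by a higher-priority subquery
            seen.add(doc_id)
            merged.append((doc_id, index, score))
    return merged


# Hypothetical hits for the "§ 123 BGB" example.
q1 = [("law-123", 9.0)]
q2 = [("commentary-a", 5.0), ("law-123", 4.0), ("commentary-b", 3.5)]
q3 = [("case-7", 2.0), ("commentary-a", 1.0)]

result = elevate([q1, q2, q3])
order = [doc_id for doc_id, _, _ in result]
```

Each result carries the index of the subquery it came from, which is the "tracking" that plain boosting cannot provide.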
Elevation query
Additional requirements / expectations
• One request / one pass search
• Usable via some new syntax / parser support
• Implemented as plugin
Furthermore it should be possible to
• impose a limit on the results of each subquery
• provide a sort parameter for each subquery
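A per-subquery limit and sort could behave as in the following Python sketch. It is an illustration under stated assumptions, not the plugin's implementation: the `Subquery` class, its field names, and the choice to apply the limit after de-duplication are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Subquery:
    hits: list                   # matching documents as dicts (hypothetical shape)
    sort_key: Callable           # per-subquery sort criterion
    reverse: bool = True         # descending by default
    limit: Optional[int] = None  # None means unlimited


def elevate_with_limits(subqueries):
    """Concatenate subqueries in priority order, each sorted and limited on its own."""
    seen, merged = set(), []
    for sq in subqueries:
        taken = 0
        for doc in sorted(sq.hits, key=sq.sort_key, reverse=sq.reverse):
            if sq.limit is not None and taken >= sq.limit:
                break
            if doc["id"] in seen:
                continue  # assumption: duplicates do not count against the limit
            seen.add(doc["id"])
            merged.append(doc["id"])
            taken += 1
    return merged


commentaries = [
    {"id": "c-a", "score": 5.0, "date": "2018-01-01"},
    {"id": "c-b", "score": 3.5, "date": "2016-06-30"},
    {"id": "c-c", "score": 1.0, "date": "2019-02-14"},
]
cases = [
    {"id": "case-1", "score": 2.0, "date": "2017-05-01"},
    {"id": "case-2", "score": 1.0, "date": "2019-03-10"},
    {"id": "case-3", "score": 3.0, "date": "2018-11-20"},
]

# Top 2 commentaries by score, then the 2 most recent cases.
ranked = elevate_with_limits([
    Subquery(commentaries, sort_key=lambda d: d["score"], limit=2),
    Subquery(cases, sort_key=lambda d: d["date"], limit=2),
])
```

Sorting cases by date while commentaries stay sorted by score is the kind of control a single global sort parameter cannot express.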
Implementation
Idea
Where to start
• TopFieldCollector.collect
• TFC manages a priority queue
• The priority queue is parametrized with size and sorting
• DisjunctionMaxQuery:
– "generates the union of documents produced by its subqueries"
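The role of the size-bounded priority queue can be sketched in a few lines of Python. This is a simplified stand-in for the idea, not Lucene's TopFieldCollector: the class name `TopNCollector`, the score-only ordering, and the tie-break in favour of the lower document id are assumptions made for the illustration.

```python
import heapq


class TopNCollector:
    """Keeps the `size` best hits seen so far in a min-heap."""

    def __init__(self, size):
        self.size = size
        self.heap = []       # entries are (score, -doc_id); heap[0] is the weakest kept hit
        self.total_hits = 0

    def collect(self, doc_id, score):
        self.total_hits += 1
        entry = (score, -doc_id)  # negated id: on equal scores the lower id wins
        if len(self.heap) < self.size:
            heapq.heappush(self.heap, entry)
        elif entry > self.heap[0]:
            heapq.heapreplace(self.heap, entry)  # evict the weakest hit

    def top_docs(self):
        return [-neg_id for _, neg_id in sorted(self.heap, reverse=True)]


collector = TopNCollector(size=3)
# Made-up scores for documents 1..8, visited in document-id order.
for doc_id, score in enumerate([23, 31, 10, 50, 23, 13, 28, 71], start=1):
    collector.collect(doc_id, score)

top = collector.top_docs()
```

Every matching document passes through `collect` exactly once, so the queue is the natural place to add subquery awareness.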
[Figure: animated walkthrough of a single priority queue collecting hits for Q1]
Implementation
Idea
Where to go
• Provide more than one queue to collector
• Propagate information from
DisjunctionMaxScorer to the collector
• Some additional bookkeeping
– Scores
– Sort field values
– Subquery index
– (Facets)
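The multi-queue idea can be simulated in Python as follows. This is a sketch under assumptions, not the plugin's collector: the scorer is modelled as handing the collector one score per subquery (None where the subquery does not match), the class name `ElevationCollector` is hypothetical, and a hit evicted from a full queue is simply dropped rather than demoted to a lower-priority queue.

```python
import heapq


class ElevationCollector:
    """One bounded priority queue per subquery, filled in a single pass."""

    def __init__(self, limits):
        self.limits = limits
        self.queues = [[] for _ in limits]  # min-heaps of (score, -doc_id)

    def collect(self, doc_id, subquery_scores):
        for index, score in enumerate(subquery_scores):
            if score is None:
                continue  # this subquery does not match the document
            queue, entry = self.queues[index], (score, -doc_id)
            if len(queue) < self.limits[index]:
                heapq.heappush(queue, entry)
            elif entry > queue[0]:
                heapq.heapreplace(queue, entry)
            return  # leftmost matching subquery only: no duplicates

    def results(self):
        merged = []
        for index, queue in enumerate(self.queues):
            for score, neg_id in sorted(queue, reverse=True):
                merged.append((-neg_id, index, score))
        return merged


# Made-up per-subquery scores (Q1, Q2, Q3) for documents 1..7.
stream = [
    (1, (50, None, None)),
    (2, (None, 43, None)),
    (3, (None, 31, 55)),
    (4, (None, None, 76)),
    (5, (None, None, 74)),
    (6, (88, None, None)),
    (7, (None, 12, 99)),
]

collector = ElevationCollector(limits=[2, 2, 2])
for doc_id, scores in stream:
    collector.collect(doc_id, scores)

merged = collector.results()
doc_order = [doc_id for doc_id, _, _ in merged]
```

Document 3 matches both Q2 and Q3 but is booked under Q2 with its Q2 score, which is the bookkeeping of scores and subquery index listed above.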
[Figure: animated walkthrough of three priority queues (Q1, Q2, Q3) being filled in one pass]
Implementation status
• https://github.com/rokirx/solr-eq
• Working
– Collector logic / multiple queues
– Sort and limit parameter per subquery
– Parser support
• In testing
– Correct scoring
– Faceting
– Multiple sort fields per subquery
• Works with 6.4, 7.6, 8.0, 8.2
Case Study: Autosuggest
User input: tax
• Assumptions about the relevance of completions:
– Highest priority: exact match with the term at the beginning, e.g. tax relief
– Lower priority: exact match but the term is not at the beginning, e.g. income tax
– Lowest priority: prefix match anywhere in the phrase, e.g. estate taxes
• Map these conditions to queries:
– Term at the beginning of a phrase and exact match: ^tax$
– Exact match in the middle of a phrase: tax$
– Prefix match (edge n-gram): tax
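The three priority tiers can be expressed as plain string predicates. The Python sketch below is only an interpretation of the ^tax$ / tax$ / tax notation on whitespace-separated tokens; the function names `tier` and `suggest` are hypothetical, and a real setup would rely on index-time analysis (for example edge n-grams) instead.

```python
def tier(phrase, term):
    """Return 0, 1 or 2 for the matching priority tier, or None for no match."""
    tokens = phrase.lower().split()
    term = term.lower()
    if tokens and tokens[0] == term:
        return 0  # ^tax$ : exact term at the beginning of the phrase
    if term in tokens:
        return 1  # tax$  : exact term elsewhere in the phrase
    if any(token.startswith(term) for token in tokens):
        return 2  # tax   : prefix match anywhere
    return None


def suggest(phrases, term):
    matched = [(tier(p, term), p) for p in phrases]
    matched = [(t, p) for t, p in matched if t is not None]
    matched.sort(key=lambda pair: pair[0])  # stable: input order is kept within a tier
    return [p for _, p in matched]


suggestions = suggest(
    ["estate taxes", "income tax", "tax relief", "property law"], "tax"
)
```

A phrase that satisfies several conditions is assigned to the highest-priority tier only, mirroring the leftmost-subset rule of the elevation query.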
Case Study: Autosuggest
User input: tax
• The resulting query ^tax$ << tax$ << tax guarantees the specified behavior
• Additional benefit: performance can be optimized by cancelling subqueries
– If the exact hit count is not necessary
– And the minimum required number of hits in the preceding queues is collected
– Stop fetching docs for the lower-priority queues by cancelling them in the collector/scorer
– Without missing any relevant documents
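The cancellation rule can be simulated with a short Python sketch. It is a toy model, not the plugin's collector or scorer: subqueries are modelled as predicates on a phrase, the function name `collect_with_cancellation` and the `wanted` parameter are hypothetical, and the evaluation counter exists only to make the saving visible.

```python
def collect_with_cancellation(docs, matchers, wanted):
    """matchers are ordered by priority; returns (queues, active, evaluations)."""
    queues = [[] for _ in matchers]
    active = len(matchers)  # subqueries with index < active are still evaluated
    evaluations = 0
    for doc in docs:
        for index in range(active):
            evaluations += 1
            if matchers[index](doc):
                queues[index].append(doc)
                break  # leftmost match only
        collected = 0
        for index in range(active):
            collected += len(queues[index])
            if collected >= wanted:
                # The top `wanted` hits can only come from queues 0..index,
                # so every subquery to the right is cancelled.
                active = index + 1
                break
    return queues, active, evaluations


def starts_with_exact(doc):
    return doc.split()[0] == "tax"


def exact_anywhere(doc):
    return "tax" in doc.split()


def prefix_anywhere(doc):
    return any(token.startswith("tax") for token in doc.split())


matchers = [starts_with_exact, exact_anywhere, prefix_anywhere]
docs = ["tax relief", "income tax", "tax law", "estate taxes",
        "taxes on income", "tax audit", "sales tax"]

queues, active, evaluations = collect_with_cancellation(docs, matchers, wanted=2)
_, _, baseline = collect_with_cancellation(docs, matchers, wanted=len(docs) + 1)
```

The highest-priority queue keeps collecting after the cancellation, so its own ordering stays correct; only the total hit count becomes inexact.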
Potential benefits
• Reduce the number of search requests
• Reduce the complexity of the architecture
• Additional dimension to control rank
• Pluggable, easy to evaluate
• Improve performance through runtime subquery cancellation
Summary
It is technically possible to implement a subquery concept in Solr/Lucene
• Single request / one pass collection of results
• Individual limits on each subquery
• Individual sort parameters on each subquery
• Optimization if the total hit count is not needed
– cancel lower-priority subqueries during evaluation without affecting the top hits
• Plugin
THANK YOU

More Related Content

What's hot

Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
HostedbyConfluent
 
Telling the LivePerson Technology Story at Couchbase [SF] 2013
Telling the LivePerson Technology Story at Couchbase [SF] 2013Telling the LivePerson Technology Story at Couchbase [SF] 2013
Telling the LivePerson Technology Story at Couchbase [SF] 2013
LivePerson
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
HostedbyConfluent
 
Kafka, Killer of Point-to-Point Integrations, Lucian Lita
Kafka, Killer of Point-to-Point Integrations, Lucian LitaKafka, Killer of Point-to-Point Integrations, Lucian Lita
Kafka, Killer of Point-to-Point Integrations, Lucian Lita
confluent
 
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike SpicerKafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
confluent
 
Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumar
Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin KumarSiphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumar
Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumar
confluent
 
Taboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache SparkTaboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache Sparktsliwowicz
 
Rapid Data Analytics @ Netflix
Rapid Data Analytics @ NetflixRapid Data Analytics @ Netflix
Rapid Data Analytics @ Netflix
Data Con LA
 
Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...
Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...
Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...
confluent
 
Ml sprint16 thesis_intro
Ml sprint16 thesis_introMl sprint16 thesis_intro
Ml sprint16 thesis_intro
ThanhNguyen3805
 
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Databricks
 
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
HostedbyConfluent
 
Dogfooding data at Lyft
Dogfooding data at LyftDogfooding data at Lyft
Dogfooding data at Lyft
markgrover
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETL
Databricks
 
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligenceSpark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Wei Di
 
Correlate Log Data with Business Metrics Like a Jedi
Correlate Log Data with Business Metrics Like a JediCorrelate Log Data with Business Metrics Like a Jedi
Correlate Log Data with Business Metrics Like a Jedi
Trevor Parsons
 
Jamie Grier - Robust Stream Processing with Apache Flink
Jamie Grier - Robust Stream Processing with Apache FlinkJamie Grier - Robust Stream Processing with Apache Flink
Jamie Grier - Robust Stream Processing with Apache Flink
Flink Forward
 
Streaming datasets for personalization
Streaming datasets for personalizationStreaming datasets for personalization
Streaming datasets for personalization
Shriya Arora
 
Stream Processing @ Lyft
Stream Processing @ LyftStream Processing @ Lyft
Stream Processing @ Lyft
Jamie Grier
 
Spline: Data Lineage For Spark Structured Streaming
Spline: Data Lineage For Spark Structured StreamingSpline: Data Lineage For Spark Structured Streaming
Spline: Data Lineage For Spark Structured Streaming
Vaclav Kosar
 

What's hot (20)

Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
Sub-Second SQL Search, Aggregations and Joins with Kafka and Rockset | Dhruba...
 
Telling the LivePerson Technology Story at Couchbase [SF] 2013
Telling the LivePerson Technology Story at Couchbase [SF] 2013Telling the LivePerson Technology Story at Couchbase [SF] 2013
Telling the LivePerson Technology Story at Couchbase [SF] 2013
 
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
Hybrid Kafka, Taking Real-time Analytics to the Business (Cody Irwin, Google ...
 
Kafka, Killer of Point-to-Point Integrations, Lucian Lita
Kafka, Killer of Point-to-Point Integrations, Lucian LitaKafka, Killer of Point-to-Point Integrations, Lucian Lita
Kafka, Killer of Point-to-Point Integrations, Lucian Lita
 
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike SpicerKafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
Kafka and Stream Processing, Taking Analytics Real-time, Mike Spicer
 
Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumar
Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin KumarSiphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumar
Siphon - Near Real Time Databus Using Kafka, Eric Boyd, Nitin Kumar
 
Taboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache SparkTaboola Road To Scale With Apache Spark
Taboola Road To Scale With Apache Spark
 
Rapid Data Analytics @ Netflix
Rapid Data Analytics @ NetflixRapid Data Analytics @ Netflix
Rapid Data Analytics @ Netflix
 
Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...
Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...
Kafka Summit NYC 2017 - Every Message Counts: Kafka as a Foundation for Highl...
 
Ml sprint16 thesis_intro
Ml sprint16 thesis_introMl sprint16 thesis_intro
Ml sprint16 thesis_intro
 
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the HoodRadical Speed for SQL Queries on Databricks: Photon Under the Hood
Radical Speed for SQL Queries on Databricks: Photon Under the Hood
 
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...How a distributed graph analytics platform uses Apache Kafka for data ingesti...
How a distributed graph analytics platform uses Apache Kafka for data ingesti...
 
Dogfooding data at Lyft
Dogfooding data at LyftDogfooding data at Lyft
Dogfooding data at Lyft
 
Empowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETLEmpowering Zillow’s Developers with Self-Service ETL
Empowering Zillow’s Developers with Self-Service ETL
 
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligenceSpark summit 2017- Transforming B2B sales with Spark powered sales intelligence
Spark summit 2017- Transforming B2B sales with Spark powered sales intelligence
 
Correlate Log Data with Business Metrics Like a Jedi
Correlate Log Data with Business Metrics Like a JediCorrelate Log Data with Business Metrics Like a Jedi
Correlate Log Data with Business Metrics Like a Jedi
 
Jamie Grier - Robust Stream Processing with Apache Flink
Jamie Grier - Robust Stream Processing with Apache FlinkJamie Grier - Robust Stream Processing with Apache Flink
Jamie Grier - Robust Stream Processing with Apache Flink
 
Streaming datasets for personalization
Streaming datasets for personalizationStreaming datasets for personalization
Streaming datasets for personalization
 
Stream Processing @ Lyft
Stream Processing @ LyftStream Processing @ Lyft
Stream Processing @ Lyft
 
Spline: Data Lineage For Spark Structured Streaming
Spline: Data Lineage For Spark Structured StreamingSpline: Data Lineage For Spark Structured Streaming
Spline: Data Lineage For Spark Structured Streaming
 

Similar to Elevation Query Extension: Introducing Subselects into Lucene Queries

Schema on read with runtime fields
Schema on read with runtime fieldsSchema on read with runtime fields
Schema on read with runtime fields
Elasticsearch
 
Activate 2019 - Search and relevance at scale for online classifieds
Activate 2019 - Search and relevance at scale for online classifiedsActivate 2019 - Search and relevance at scale for online classifieds
Activate 2019 - Search and relevance at scale for online classifieds
Roger Rafanell Mas
 
The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...
The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...
The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...
C4Media
 
WIPS Global Brochure, New
WIPS Global Brochure, NewWIPS Global Brochure, New
WIPS Global Brochure, Newshikha gupta
 
Behind the Wizard’s Curtain: Scalability and Security at Zuora (Subscribed13)
Behind the Wizard’s Curtain:  Scalability and Security at Zuora (Subscribed13)Behind the Wizard’s Curtain:  Scalability and Security at Zuora (Subscribed13)
Behind the Wizard’s Curtain: Scalability and Security at Zuora (Subscribed13)
Zuora, Inc.
 
Enhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuEnhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min Qiu
Spark Summit
 
MakeServiceContractEasy_NEOAUG_20120611
MakeServiceContractEasy_NEOAUG_20120611MakeServiceContractEasy_NEOAUG_20120611
MakeServiceContractEasy_NEOAUG_20120611Ravindra Tripathi
 
Monolithic to microservices
Monolithic to microservicesMonolithic to microservices
Monolithic to microservices
Ronald Hsu
 
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
confluent
 
Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Lucidworks
 
IBM Blockchain Platform - Architectural Good Practices v1.0
IBM Blockchain Platform - Architectural Good Practices v1.0IBM Blockchain Platform - Architectural Good Practices v1.0
IBM Blockchain Platform - Architectural Good Practices v1.0
Matt Lucas
 
SplunkLive! Presentation - Data Onboarding with Splunk
SplunkLive! Presentation - Data Onboarding with SplunkSplunkLive! Presentation - Data Onboarding with Splunk
SplunkLive! Presentation - Data Onboarding with Splunk
Splunk
 
SplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunk
 
Creating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentCreating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing Assignment
RTTS
 
A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology
confluent
 
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Databricks
 
Efficient Query Processing Infrastructures
Efficient Query Processing InfrastructuresEfficient Query Processing Infrastructures
Efficient Query Processing Infrastructures
Crai Macdonald
 
Spm lecture-3
Spm lecture-3Spm lecture-3
Spm lecture-3
Sulman Ahmed
 
Lect3
Lect3Lect3
Replicating One Billion Records with Minimal API Usage
Replicating One Billion Records with Minimal API UsageReplicating One Billion Records with Minimal API Usage
Replicating One Billion Records with Minimal API Usage
Salesforce Developers
 

Similar to Elevation Query Extension: Introducing Subselects into Lucene Queries (20)

Schema on read with runtime fields
Schema on read with runtime fieldsSchema on read with runtime fields
Schema on read with runtime fields
 
Activate 2019 - Search and relevance at scale for online classifieds
Activate 2019 - Search and relevance at scale for online classifiedsActivate 2019 - Search and relevance at scale for online classifieds
Activate 2019 - Search and relevance at scale for online classifieds
 
The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...
The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...
The Evolution of Testing Methodology at AWS: From Status Quo to Formal Method...
 
WIPS Global Brochure, New
WIPS Global Brochure, NewWIPS Global Brochure, New
WIPS Global Brochure, New
 
Behind the Wizard’s Curtain: Scalability and Security at Zuora (Subscribed13)
Behind the Wizard’s Curtain:  Scalability and Security at Zuora (Subscribed13)Behind the Wizard’s Curtain:  Scalability and Security at Zuora (Subscribed13)
Behind the Wizard’s Curtain: Scalability and Security at Zuora (Subscribed13)
 
Enhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min QiuEnhancements on Spark SQL optimizer by Min Qiu
Enhancements on Spark SQL optimizer by Min Qiu
 
MakeServiceContractEasy_NEOAUG_20120611
MakeServiceContractEasy_NEOAUG_20120611MakeServiceContractEasy_NEOAUG_20120611
MakeServiceContractEasy_NEOAUG_20120611
 
Monolithic to microservices
Monolithic to microservicesMonolithic to microservices
Monolithic to microservices
 
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant Inside Kafka Streams—Monitoring Comcast’s Outside Plant
Inside Kafka Streams—Monitoring Comcast’s Outside Plant
 
Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...Using Deep Learning and Customized Solr Components to Improve search Relevanc...
Using Deep Learning and Customized Solr Components to Improve search Relevanc...
 
IBM Blockchain Platform - Architectural Good Practices v1.0
IBM Blockchain Platform - Architectural Good Practices v1.0IBM Blockchain Platform - Architectural Good Practices v1.0
IBM Blockchain Platform - Architectural Good Practices v1.0
 
SplunkLive! Presentation - Data Onboarding with Splunk
SplunkLive! Presentation - Data Onboarding with SplunkSplunkLive! Presentation - Data Onboarding with Splunk
SplunkLive! Presentation - Data Onboarding with Splunk
 
SplunkLive! Advanced Session
SplunkLive! Advanced SessionSplunkLive! Advanced Session
SplunkLive! Advanced Session
 
Creating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing AssignmentCreating a Project Plan for a Data Warehouse Testing Assignment
Creating a Project Plan for a Data Warehouse Testing Assignment
 
A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology
 
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
Operationalizing Machine Learning—Managing Provenance from Raw Data to Predic...
 
Efficient Query Processing Infrastructures
Efficient Query Processing InfrastructuresEfficient Query Processing Infrastructures
Efficient Query Processing Infrastructures
 
Spm lecture-3
Spm lecture-3Spm lecture-3
Spm lecture-3
 
Lect3
Lect3Lect3
Lect3
 
Replicating One Billion Records with Minimal API Usage
Replicating One Billion Records with Minimal API UsageReplicating One Billion Records with Minimal API Usage
Replicating One Billion Records with Minimal API Usage
 

More from Lucidworks

Search is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce StrategySearch is the Tip of the Spear for Your B2B eCommerce Strategy
Search is the Tip of the Spear for Your B2B eCommerce Strategy
Lucidworks
 
Drive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in SalesforceDrive Agent Effectiveness in Salesforce
Drive Agent Effectiveness in Salesforce
Lucidworks
 
How Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant ProductsHow Crate & Barrel Connects Shoppers with Relevant Products
How Crate & Barrel Connects Shoppers with Relevant Products
Lucidworks
 
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product DiscoveryLucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks & IMRG Webinar – Best-In-Class Retail Product Discovery
Lucidworks
 
Connected Experiences Are Personalized Experiences
Connected Experiences Are Personalized ExperiencesConnected Experiences Are Personalized Experiences
Connected Experiences Are Personalized Experiences
Lucidworks
 
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc...
Lucidworks
 
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
[Webinar] Intelligent Policing. Leveraging Data to more effectively Serve Com...
Lucidworks
 
Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020Preparing for Peak in Ecommerce | eTail Asia 2020
Preparing for Peak in Ecommerce | eTail Asia 2020
Lucidworks
 
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C...
Lucidworks
 
AI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and RosetteAI-Powered Linguistics and Search with Fusion and Rosette
AI-Powered Linguistics and Search with Fusion and Rosette
Lucidworks
 
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual MomentThe Service Industry After COVID-19: The Soul of Service in a Virtual Moment
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment
Lucidworks
 
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - EuropeWebinar: Smart answers for employee and customer support after covid 19 - Europe
Webinar: Smart answers for employee and customer support after covid 19 - Europe
Lucidworks
 
Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19Smart Answers for Employee and Customer Support After COVID-19
Smart Answers for Employee and Customer Support After COVID-19
Lucidworks
 
Applying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 ResearchApplying AI & Search in Europe - featuring 451 Research
Applying AI & Search in Europe - featuring 451 Research
Lucidworks
 
Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1Webinar: Accelerate Data Science with Fusion 5.1
Webinar: Accelerate Data Science with Fusion 5.1
Lucidworks
 
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce StrategyWebinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy
Lucidworks
 
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ...
Lucidworks
 
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceApply Knowledge Graphs and Search for Real-World Decision Intelligence
Apply Knowledge Graphs and Search for Real-World Decision Intelligence
Lucidworks
 
Webinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise SearchWebinar: Building a Business Case for Enterprise Search
Webinar: Building a Business Case for Enterprise Search
Lucidworks
 
Why Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and BeyondWhy Insight Engines Matter in 2020 and Beyond
Why Insight Engines Matter in 2020 and Beyond
Lucidworks
 

Elevation Query Extension: Introducing Subselects into Lucene Queries

  • 1. STAY CONNECTED
      Twitter @activate_conf, Facebook @activateconf, #Activate19
      Log in to wifi, follow Activate on social media, and download the event app where you can submit an evaluation after the session
      WIFI NETWORK: Activate2019
      PASSWORD: Lucidworks
      DOWNLOAD THE ACTIVATE 2019 MOBILE APP
      Search Activate2019 in the App/Play store or visit: http://crowd.cc/activate19
  • 4. Speaker Slide
      Robert Kirchgessner, Search Technology Architect, Wolters Kluwer
      Experience
      • Search Algorithms Development
      • Content Analysis
      • Entity Recognition
      • Solr plugins / extensions
      • Strong software development experience for about 14 years in different commercial projects
      • Last 4 years working on search expertise, particularly with Apache Solr and cloud-based solutions for it, including availability and scalability
      • Customers: Wolters Kluwer, TRAFIGURA, Daadkracht...
      Nazar Seniuk, Lead Software Engineer, EPAM
  • 5. Agenda
      • Motivation
      • Implementation Idea
      • Implementation status
      • Case study autosuggest
      • Summary
  • 6. Some background
      • Developing search applications for the legal market since 2003
      • Inhomogeneous, structured content with rich metadata (laws, cases, commentaries)
      • Use of metadata for ranking is essential for good results
      • Up to 30% of queries contain legal / other entities
      • Relying on query cooking using entity recognition in the user input
      • Combining with full text search and tuning the results becomes a challenge
  • 7. Example
      User input: § 123 BGB
      Transformed to queries Q1, Q2, Q3, Q4
      Expected output:
      • § 123 BGB (law document)
      • Legal commentary A to § 123 BGB (promoted content)
      • Legal commentary B to § 123 BGB (promoted content)
      • Some latest cases based on § 123 BGB (relevant content)
      • Full text (or whatever needed)
      How to achieve?
  • 8. Requirements
      CONTENT STRUCTURE
      • Handle entities in the user input properly: legal citations, locations, dates, names
          – e.g. place the correct document cited in the query on the top
          – given a book title, place an entry document (table of contents) on the top
      • Top (1-5) hits are expected to be unambiguous
      • Use the top slots efficiently (10-100 hits)
      • Keep a balance between numerous document types (legal cases) and relevant or promoted document types
      In general: more precise control over what happens in the top 10
  • 9. Possible solutions
      • Boost factors on queries, terms, documents
      • Sort fields
      • Ranking functions
      • Function queries
      • Reranking (in Solr or application)
      • Filtering
      • Multiple requests
  • 10. Works, but…
      • Some are too complex
      • Some are too slow
      • Others are not reliable
      • Missing a concept of subquery:
          – tracking which subquery a document comes from
      • Missing LIMIT as in SQL
  • 11. Example continued
      User input: § 123 BGB
      Transformed to queries Q1, Q2, Q3, Q4
      Expected output:
      • § 123 BGB (law document)
      • Legal commentary A to § 123 BGB (promoted content)
      • Legal commentary B to § 123 BGB (promoted content)
      • Some latest cases based on § 123 BGB (relevant content)
      • Full text (or whatever needed)
      We want the request to look like: Q1 << Q2 << Q3 << Q4
  • 12. Elevation query: Initial Idea / Specification
      Given a list of queries Q1, Q2, …, QN, produce a result fulfilling the conditions:
      • All the documents of Qn are placed before the documents of Qm for m>n
      • Each hit should occur in the leftmost possible subset
      • No duplication of hits
      • Meaningful scores
      • Correct faceting
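
The ordering and de-duplication conditions of this specification can be illustrated with a small sketch. It is not the plugin's code: the function name `elevate` and the document ids are made up for illustration, and each subquery is assumed to have already produced its own ordered hit list. Scores and faceting are left out.

```python
from typing import List, Tuple


def elevate(subquery_hits: List[List[str]]) -> List[Tuple[str, int]]:
    """Concatenate per-subquery hit lists in priority order.

    Each document is kept only in the leftmost subquery that matched it,
    so the result has no duplicates. Returns (doc_id, subquery_index) pairs.
    """
    seen = set()
    merged = []
    for index, hits in enumerate(subquery_hits):
        for doc_id in hits:
            if doc_id in seen:
                continue  # already placed by a higher-priority subquery
            seen.add(doc_id)
            merged.append((doc_id, index))
    return merged


# Hypothetical hit lists for Q1 << Q2 << Q3
q1 = ["law-123"]
q2 = ["commentary-A", "commentary-B", "law-123"]
q3 = ["case-9", "commentary-A", "case-7"]
result = elevate([q1, q2, q3])
```

Here `law-123` matches both Q1 and Q2 but is reported only once, under Q1, and all Q2 hits precede all Q3 hits.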
  • 13. Elevation query: Additional requirements / expectations
      • One request / one pass search
      • Usable via some new syntax / parser support
      • Implemented as plugin
      Furthermore it should be possible to
      • impose a limit on the results of each subquery
      • provide a sort parameter for each subquery
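
A rough sketch of what a per-subquery limit and sort could mean for the final ranking. All names here (`Subquery`, `run_elevation`, the document fields) are hypothetical. For readability the sketch evaluates the subqueries one after another, whereas the slides aim for a single pass. It also assumes that a document cut off by the limit of one subquery may still be picked up by a later subquery, which the slides do not state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Subquery:
    matches: Callable[[Dict], bool]     # which documents this subquery selects
    sort_key: Callable[[Dict], object]  # per-subquery sort parameter
    reverse: bool = False               # True for descending order
    limit: Optional[int] = None         # per-subquery LIMIT, None = unlimited


def run_elevation(docs: List[Dict], subqueries: List[Subquery]) -> List[str]:
    taken = set()
    ranking = []
    for sq in subqueries:
        # Only documents not already emitted by a higher-priority subquery
        candidates = [d for d in docs if d["id"] not in taken and sq.matches(d)]
        candidates.sort(key=sq.sort_key, reverse=sq.reverse)
        if sq.limit is not None:
            candidates = candidates[: sq.limit]
        for d in candidates:
            taken.add(d["id"])
            ranking.append(d["id"])
    return ranking


docs = [
    {"id": "law", "type": "law", "year": 2002},
    {"id": "comm-A", "type": "commentary", "year": 2018},
    {"id": "comm-B", "type": "commentary", "year": 2019},
    {"id": "comm-C", "type": "commentary", "year": 2015},
    {"id": "case-1", "type": "case", "year": 2017},
    {"id": "case-2", "type": "case", "year": 2019},
]
subqueries = [
    Subquery(lambda d: d["type"] == "law", lambda d: d["year"], limit=1),
    Subquery(lambda d: d["type"] == "commentary", lambda d: d["year"], reverse=True, limit=2),
    Subquery(lambda d: d["type"] == "case", lambda d: d["year"], reverse=True),
    Subquery(lambda d: True, lambda d: d["id"]),  # catch-all, like a full-text query
]
ranking = run_elevation(docs, subqueries)
```

The third commentary is cut by the limit of the second subquery and reappears at the end through the catch-all subquery.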
  • 14. Implementation Idea: Where to start
      • TopFieldCollector.collect
      • TFC manages a priority queue
      • The priority queue is parametrized with size and sorting
      • DisjunctionMaxQuery:
          – "generates the union of documents produced by its subqueries"
      [Figure: document/score pairs for Q1 entering the collector's priority queue]
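
The idea of a collector backed by a fixed-size priority queue can be sketched as follows. This is a simplified stand-in written with Python's `heapq`, not Lucene's `TopFieldCollector`. The class name `TopNCollector`, the tie-break on lower document id, and the document/score values are assumptions made for illustration.

```python
import heapq
from typing import List, Tuple


class TopNCollector:
    """Keeps the `size` best (score, doc) entries seen so far."""

    def __init__(self, size: int):
        self.size = size
        # Min-heap: the weakest retained entry sits at index 0.
        # Entries are (score, -doc_id) so that a lower doc id wins a score tie.
        self._heap: List[Tuple[float, int]] = []

    def collect(self, doc_id: int, score: float) -> None:
        entry = (score, -doc_id)
        if len(self._heap) < self.size:
            heapq.heappush(self._heap, entry)
        elif entry > self._heap[0]:
            heapq.heapreplace(self._heap, entry)  # evict the weakest entry

    def top_docs(self) -> List[Tuple[int, float]]:
        return [(-neg_id, score) for score, neg_id in sorted(self._heap, reverse=True)]


collector = TopNCollector(size=3)
for doc_id, score in [(1, 23), (2, 31), (3, 10), (4, 50), (5, 23), (6, 13), (7, 28), (8, 71)]:
    collector.collect(doc_id, score)
top = collector.top_docs()
```

Documents arrive in document-id order, as they would from a scorer, and only the three best survive in the queue.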
  • 15. Implementation Idea: Where to go
      • Provide more than one queue to collector
      • Propagate information from DisjunctionMaxScorer to the collector
      • Some additional bookkeeping
          – Scores
          – Sort field values
          – Subquery index
          – (Facets)
      [Figure: document/score pairs for Q1, Q2 and Q3 being collected into separate queues]
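
A minimal sketch of the multi-queue idea, under stated assumptions: for every document the scorer hands the collector one score per subquery (`None` where the subquery did not match), and the collector routes the document to the queue of the leftmost matching subquery. The class `ElevationCollector` and the data are hypothetical. A document evicted from its queue is simply dropped here, which may differ from what the plugin does.

```python
import heapq
from typing import List, Optional, Tuple


class ElevationCollector:
    def __init__(self, queue_sizes: List[int]):
        self.queue_sizes = queue_sizes
        # One bounded min-heap per subquery, entries are (score, -doc_id)
        self.queues: List[List[Tuple[float, int]]] = [[] for _ in queue_sizes]

    def collect(self, doc_id: int, scores: List[Optional[float]]) -> None:
        for index, score in enumerate(scores):
            if score is None:
                continue  # this subquery did not match the document
            self._offer(index, (score, -doc_id))
            return  # leftmost match only, so a document is never duplicated

    def _offer(self, index: int, entry: Tuple[float, int]) -> None:
        heap = self.queues[index]
        if len(heap) < self.queue_sizes[index]:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)

    def results(self) -> List[Tuple[int, int, float]]:
        """(doc_id, subquery_index, score), queue by queue in priority order."""
        out = []
        for index, heap in enumerate(self.queues):
            for score, neg_id in sorted(heap, reverse=True):
                out.append((-neg_id, index, score))
        return out


collector = ElevationCollector(queue_sizes=[2, 2, 2])
stream = [
    (1, [50, None, None]),
    (2, [None, 43, None]),
    (3, [None, 31, 55]),
    (4, [51, None, 76]),
    (5, [None, None, 74]),
    (6, [88, None, None]),
    (7, [None, 12, 99]),
    (8, [None, None, 55]),
]
for doc_id, scores in stream:  # single pass over the matching documents
    collector.collect(doc_id, scores)
hits = collector.results()
```

Document 7 scores highest overall (99 for the third subquery) but still lands in the second queue, where it loses against documents 2 and 3. This is the kind of bookkeeping per subquery index that the slide refers to.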
  • 16. Implementation status
      • https://github.com/rokirx/solr-eq
      • Working
          – Collector logic / multiple queues
          – Sort and limit parameter per subquery
          – Parser support
      • In testing
          – Correct scoring
          – Faceting
          – Multiple sort fields per subquery
      • Works with 6.4, 7.6, 8.0, 8.2
  • 17. Case Study: Autosuggest
      User input: tax
      • Assumptions on the relevancy of a completion:
          – Highest priority if the term is at the beginning and is an exact match, e.g. tax relief
          – Lower priority for an exact match where the term is not at the beginning, e.g. income tax
          – Lowest priority for a prefix match anywhere in the phrase, e.g. estate taxes
      • Map these conditions to queries:
          – Term at the beginning of a phrase and exact match: ^tax$
          – Exact match in the middle of a phrase: tax$
          – Prefix match (edge n-gram): tax
  • 18. Case Study: Autosuggest
      User input: tax
      • Resulting query: ^tax$ << tax$ << tax guarantees the specified behavior
      • Additional benefit: optimize the performance by cancelling out subqueries
          – If the exact hit count is not necessary
          – And the minimum required number of hits in the preceding queues is collected
          – Stop fetching the docs from lower-priority queues by cancelling them out of the collector/scorer
          – Without missing out any relevant documents
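
The three priority tiers and the cancellation of lower-priority subqueries can be sketched in plain Python. This only imitates the behavior on a list of phrases. It does not use Solr analysis or the plugin's query syntax, and the helpers `tier` and `suggest` as well as the phrase list are invented for illustration.

```python
from typing import Iterable, List, Optional


def tier(phrase: str, term: str) -> Optional[int]:
    """0: exact word at the start, 1: exact word elsewhere, 2: word prefix."""
    words = phrase.lower().split()
    if not words:
        return None
    if words[0] == term:
        return 0
    if term in words[1:]:
        return 1
    if any(word.startswith(term) for word in words):
        return 2
    return None


def suggest(phrases: Iterable[str], term: str, wanted: int) -> List[str]:
    buckets: List[List[str]] = [[], [], []]
    active = 3  # only tiers with index < active are still collected
    for phrase in phrases:
        t = tier(phrase, term)
        if t is None or t >= active:
            continue  # no match, or the tier has been cancelled
        buckets[t].append(phrase)
        # Once tiers 0..i already fill the requested page, tiers after i
        # cannot reach the top hits any more and are cancelled.
        filled = 0
        for i in range(3):
            filled += len(buckets[i])
            if filled >= wanted:
                active = min(active, i + 1)
                break
    merged = [p for bucket in buckets for p in bucket]
    return merged[:wanted]


phrases = [
    "estate taxes", "income tax", "tax relief", "taxonomy",
    "tax law", "sales tax", "syntax", "taxes due",
]
suggestions = suggest(phrases, "tax", wanted=3)
```

After "tax law" is collected, the two higher tiers already hold three phrases, so the prefix tier is cancelled and "taxes due" is never examined further. The top three suggestions are unaffected, which matches the claim that no relevant documents are missed when the total hit count is not needed.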
  • 19. Potential benefits
      • Reduce the number of search requests
      • Reduce the complexity of the architecture
      • Additional dimension to control rank
      • Pluggable, easy to evaluate
      • Improve performance through runtime subquery cancellation
  • 20. Summary
      It is technically possible to implement a concept of subquery in Solr/Lucene
      • Single request / one pass collection of results
      • Individual limits on each subquery
      • Individual sort parameters on each subquery
      • Optimization if the total number of hits is not needed: cancel lower-priority subqueries during evaluation without affecting top hits
      • Plugin
