Rated Ranking Evaluator Enterprise:
the Next Generation of Free Search
Quality Evaluation Tools



Alessandro Benedetti, Director
Andrea Gazzarini, Co-Founder
15th
September 2021
‣ Born in Tarquinia (an ancient Etruscan city)
‣ R&D Software Engineer
‣ Director
‣ Master in Computer Science
‣ Apache Lucene/Solr PMC member/committer
‣ Elasticsearch expert
‣ Passionate about Semantic, NLP and
Machine Learning technologies
‣ Beach Volleyball player and Snowboarder
Who We Are
Alessandro Benedetti
‣ Born in Viterbo
‣ Hermit Software Engineer
‣ Master in Economics
‣ Passionate about programming
‣ RRE Creator
‣ Apache Lucene/Solr Expert
‣ Elasticsearch Expert
‣ Apache Qpid Committer
‣ Father, Husband
‣ Bass player, aspiring (still frustrated at the moment) Chapman
Stick player
Who We Are
Andrea Gazzarini
‣ Headquartered in London / distributed team
‣ Open Source Enthusiasts
‣ Apache Lucene/Solr experts
‣ Elasticsearch experts
‣ Community Contributors
‣ Active Researchers
‣ Hot Trends: Learning to Rank,
Document Similarity,
Search Quality Evaluation,
Relevancy Tuning
Search Services
www.sease.io
Clients
Agenda
RRE Open Source
RRE Enterprise (RREE)
Query Discovery
Rating Generation
Explore Evaluation Results
Agenda
RRE Open Source
RRE Enterprise (RREE)
Query Discovery
Rating Generation
Explore Evaluation Results
‣ Open Source library for Search Quality Evaluation
‣ Ratings are expected as input (JSON supported)
‣ Many offline metrics available out-of-the-box
(Precision@k, NDCG@k, F-Measure…)
‣ Apache Solr and Elasticsearch support
‣ Development-centric approach
‣ Evaluation on the fly and results in various formats
‣ Community building up!
‣ RRE-User mailing list:
https://groups.google.com/g/rre-user
‣ https://github.com/SeaseLtd/rated-ranking-evaluator
Rated Ranking Evaluator : RRE Open Source
What is it?
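The out-of-the-box metrics mentioned above can be sketched in a few lines; this is an illustrative implementation of Precision@k and NDCG@k, not RRE's actual code:

```python
import math

def precision_at_k(results, relevant, k):
    """Fraction of the top-k results that are rated relevant."""
    top_k = results[:k]
    return sum(1 for doc in top_k if doc in relevant) / k

def ndcg_at_k(results, gains, k):
    """Normalized Discounted Cumulative Gain over the top-k results.

    `gains` maps a document id to its graded relevance rating.
    """
    dcg = sum(gains.get(doc, 0) / math.log2(i + 2)
              for i, doc in enumerate(results[:k]))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

A perfectly ordered result list yields NDCG@k = 1.0, which is what makes the metric comparable across queries with different numbers of relevant documents.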
Rated Ranking Evaluator : The Genesis
2018
Search Consultancy Project
A customer explicitly asked for a
rudimentary search quality evaluation
tool while we were working on their
search infrastructure.
Jun
Search Quality Evaluation
A Developer Perspective
0.9
Search Quality Evaluation
Tools And Techniques
Oct
1.0
to be continued...
mumble mumble
Rated Ranking Evaluator: The Idea
RRE was conceived as a development tool that
executes search quality evaluations as part of a
project's build process.
It's like a read–eval–print loop (REPL) sitting on top of
an Information Retrieval subsystem, one that encourages
an incremental / iterative approach.
The underlying idea
New system Existing system
Here are the requirements
Ok
V1.0 has been released
Cool!
a month later…
We have a change request.
We found a bug
We need to improve our search
system, users are complaining about
junk in search results.
Ok
v0.1
…
v0.9
v1.1
v1.2
v1.3
…
v2.0
v2.0
In terms of retrieval effectiveness,
how can we know the system
performance across various versions?
Rated Ranking Evaluator: Domain Model
The RRE domain model is organized into a composite,
tree-like structure where the relationships between
entities are always one-to-many.
The top level entity is a placeholder representing an
evaluation execution.
Versioned metrics are computed at query level and
then reported, using an aggregation function, at
upper levels.
The benefit of having a composite structure is clear:
we can see a metric value at different levels (e.g. a
single query, all queries belonging to a query group,
all queries belonging to a topic, or the whole corpus)
Domain Model
Diagram: Evaluation (top-level domain entity) → 1..* Corpus (test dataset / collection)
→ 1..* Topic (information need) → 1..* Query Group (query variants) → 1..* Query (queries);
versioned metrics (v1.0, v1.1, v1.2, … v1.n: P@10, NDCG, AP, F-Measure, …) are attached at each level.
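The bottom-up aggregation described above (query-level metrics rolled up to query group, topic and corpus) can be sketched as a recursive mean over the tree; the dict-based structure is a hypothetical simplification, not RRE's actual classes:

```python
def aggregate(node):
    """Return a node's metric value: leaves (queries) carry their own
    metric; inner nodes report the mean of their children's values."""
    if "metric" in node:  # leaf: a single query
        return node["metric"]
    children = [aggregate(child) for child in node["children"]]
    return sum(children) / len(children)

# A query group's value is the mean of its queries, a topic's value
# the mean of its query groups, and so on up to the corpus level.
```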
Rated Ranking Evaluator : How it works
Diagram: in the input layer, Data, Configuration and Ratings feed the evaluation layer,
which uses a search platform and produces Evaluation Data (JSON) in the output layer,
used for generating the RRE Console and other reports.
Explainability - Why is it important in Information Retrieval?
Dev, tune & Build
Check evaluation results
We are thinking about how
to fill a third monitor
Agenda
RRE Open Source
RRE Enterprise (RREE)
Query Discovery
Rating Generation
Explore Evaluation Results
RRE Enterprise : The Genesis
2018
Search Consultancy Project
A customer explicitly asked for a rudimentary search quality evaluation
tool while we were working on their search infrastructure.
2019
Rated Ranking Evaluator
An Open Source Approach
to be continued...
The first sketches depicting an idea
about an enterprise-level version of
RRE.
Development started a few months
later.
Rated Ranking Evaluator : The Genesis
2018
Search Consultancy Project
A customer explicitly asked for a rudimentary search quality evaluation
tool while we were working on their search infrastructure.
2019
to be continued...
2020
2021
JBCP
ID Discovery
Query Discovery
Agenda
RRE Open Source
RRE Enterprise (RREE)
Query Discovery
Rating Generation
Explore Evaluation Results
RRE Open Source Recap: How it works (1/2)
Diagram: in the input layer, Data, Configuration and Ratings feed the evaluation layer,
which targets Apache Solr OR Elasticsearch and produces Evaluation Data (JSON) in the
output layer, used for generating the RRE Console and other reports.
RRE Open Source Recap: How it works (2/2)
Diagram: in the input layer, Data, Configuration and Ratings feed the evaluation layer,
which targets Apache Solr OR Elasticsearch and produces Evaluation Data (JSON) in the
output layer, used for generating the RRE Console and other reports.
RRE Enterprise: Query Discovery?
API
Problem: I have an intermediate Search API that builds complex Apache Solr/Elasticsearch queries.
RRE Open Source: No Query Discovery
Diagram: (1) the ratings target Apache Solr OR Elasticsearch directly;
(2) the evaluation produces Evaluation Data.
Query Discovery
RRE Enterprise: Query Discovery & Evaluation
Diagram: (1) the ratings correlate on requests to the {SEARCH API};
(2) query discovery targets the underlying Apache Solr OR Elasticsearch queries;
(3) the evaluation produces Evaluation Data.
Input Rating: RRE Open Source vs RREE
RRE Open Source: Topic (information need) → Query Group (query variant) → Query ((search engine) query) → Rated Document (+3)
RRE Enterprise: Topic (information need) → Query Group (query variant) → API Request ((Search API) request) → Query ((search engine) query) → Rated Document (+3)
Agenda
RRE Open Source
RRE Enterprise (RREE)
Query Discovery
Rating Generation
Explore Evaluation Results
RRE Enterprise: Rating Generation
A fundamental requirement of
offline search quality
evaluation is to gather
<query, document, rating>
triples that represent the
relevance (rating) of a
document given a user
information need (query).
Before assessing the retrieval
effectiveness of a system, it is
necessary to associate a
relevance rating with each pair
<query, document> involved in
our evaluation.
RRE Enterprise: Explicit Ratings (1/2)
• Explicitly provided by domain experts
• High accuracy
• High effort / time / resources
• RRE Open Source accepts only explicit ratings
Explicit Ratings
RRE Rating Structure: Topic (information need) → Query Group (query variant) → Query → Rated Document (+3)
Question: how can we minimize the effort required
to provide explicit ratings?
RRE Enterprise: Explicit Ratings (2/2)
• Chrome plugin that applies an
evaluation layer on top of an arbitrary
website
• Ratings are generated directly on the
customer's website
• Lowest learning curve for users
• Generated ratings are sent to RREE
through a dedicated endpoint
• An "ID Discovery" component translates
the received data (rated web items) into RRE
ratings (rated Solr/Elasticsearch
documents)
Judgment Collector
Implicit Feedback - Click
{
  "collection": "papers",
  "query": "interleaving",
  "blackBoxQueryRequest": "GET /query?q=interleaving HTTP/1.1\r\nHost: localhost:5063",
  "documentId": "1",
  "click": 1,
  "timestamp": "2021-06-23T16:10:49Z",
  "queryDocumentPosition": 0
}
Click
http://vps-933d20b7.vps.ovh.net:8080/1.0/rre-enterprise-api/input-api/interaction
Search Result Page
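Posting a click interaction like the one above could be sketched as follows; the payload fields and endpoint come from the slide, while the function names and client code are illustrative assumptions, not the RREE SDK:

```python
import json
from urllib import request

def click_payload(collection, query, raw_request, doc_id, position, timestamp):
    """Build the interaction payload shown on the slide for a click event."""
    return {
        "collection": collection,
        "query": query,
        "blackBoxQueryRequest": raw_request,
        "documentId": doc_id,
        "click": 1,
        "timestamp": timestamp,
        "queryDocumentPosition": position,
    }

def send_interaction(endpoint, payload):
    """POST the interaction as JSON (sketch; error handling omitted)."""
    req = request.Request(endpoint,
                          data=json.dumps(payload).encode("utf-8"),
                          headers={"Content-Type": "application/json"})
    return request.urlopen(req)
```

The same shape applies to the add-to-cart, sale and revenue events on the following slides, with the `click` field swapped for the corresponding signal.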
Implicit Feedback - Add To Cart
{
  "collection": "papers",
  "query": "interleaving",
  "blackBoxQueryRequest": "GET /query?q=interleaving HTTP/1.1\r\nHost: localhost:5063",
  "documentId": "1",
  "addToCart": 1,
  "timestamp": "2021-06-23T16:10:49Z",
  "queryDocumentPosition": 0
}
Add To Cart
http://vps-933d20b7.vps.ovh.net:8080/1.0/rre-enterprise-api/input-api/interaction
Search Result Page
Implicit Feedback - Sale
{
  "collection": "papers",
  "query": "interleaving",
  "blackBoxQueryRequest": "GET /query?q=interleaving HTTP/1.1\r\nHost: localhost:5063",
  "documentId": "1",
  "sale": 1,
  "timestamp": "2021-06-23T16:10:49Z",
  "queryDocumentPosition": 0
}
Sale
http://vps-933d20b7.vps.ovh.net:8080/1.0/rre-enterprise-api/input-api/interaction
Search Result Page
Implicit Feedback - Revenue
{
  "collection": "papers",
  "query": "interleaving",
  "blackBoxQueryRequest": "GET /query?q=interleaving HTTP/1.1\r\nHost: localhost:5063",
  "documentId": "1",
  "revenue": 100,
  "timestamp": "2021-06-23T16:10:49Z",
  "queryDocumentPosition": 0
}
Revenue
http://vps-933d20b7.vps.ovh.net:8080/1.0/rre-enterprise-api/input-api/interaction
Search Result Page
Implicit Feedback - Storage
{
  "collection": "papers",
  "query": "interleaving",
  "blackBoxQueryRequest": "GET /query?q=interleaving HTTP/1.1\r\nHost: localhost:5063",
  "documentId": "1",
  "revenue": 100,
  "timestamp": "2021-06-23T16:10:49Z",
  "queryDocumentPosition": 0
}
http://vps-933d20b7.vps.ovh.net:8080/1.0/rre-enterprise-api/input-api/interaction
Interactions
Implicit Feedback - Calculate Online Metrics
Interactions
<query,document>
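One way to turn the collected interactions into per-<query, document> online metrics is a simple click-through-rate aggregation; this is a sketch under the assumption that every logged event counts as an impression, not RRE Enterprise's actual algorithm:

```python
from collections import defaultdict

def click_through_rates(interactions):
    """Aggregate interaction events into a CTR per <query, document> pair.

    Each interaction is a dict with `query`, `documentId` and an
    optional `click` field (1 when the result was clicked).
    """
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for event in interactions:
        key = (event["query"], event["documentId"])
        impressions[key] += 1  # assumption: every event is an impression
        clicks[key] += event.get("click", 0)
    return {key: clicks[key] / impressions[key] for key in impressions}
```

Analogous aggregations apply to the add-to-cart, sale and revenue signals.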
Implicit Feedback - Estimate Relevance
Simple Click Model diagram: a metric score on the global [0, 1] scale
(global min 0, global max 1; observed local min/max within [0...1])
is mapped onto a discrete rating scale from 0 to 4.
Agenda
RRE Open Source
RRE Enterprise (RREE)
Query Discovery
Rating Generation
Explore Evaluation Results
‣ UI built with the React library
‣ Configuration support
‣ Navigation of Evaluation Results
‣ Overview - quick view for Business Stakeholders
‣ Explore (an evaluation) - deep view for Software
Engineers
‣ Compare (two evaluations) - deep comparison for
Software Engineers
Explore Evaluation Results: UI
Evaluation Results - Overview
https://rre-enterprise.netlify.app/overview
Evaluation Results - Overview Expanded
Evaluation Results - Overview Zoom
Evaluation Results - Explore
Evaluation Results - Explore Query Info
Evaluation Results - Explore Query Info 2
Evaluation Results - Compare (1/3)
Evaluation Results - Compare (2/3)
Evaluation Results - Compare (3/3)
Evaluation Results - Compare Query Info
‣ Release with free usage plan
‣ Configuration support
‣ Support for multimedia document properties
‣ Intelligent insights on poorly performing queries,
groups, topics
‣ Improvements to click modelling for implicit
relevance estimation
RRE Enterprise: Future Work
Thank You!
