ES-Rank is an evolutionary strategy approach for learning to rank documents. It uses a (1+1)-evolutionary strategy with Gaussian*exp(Cauchy) random mutation to evolve a ranking model. Experimental results on three datasets show that ES-Rank performs competitively in terms of accuracy metrics like MAP and NDCG compared to 14 other ranking techniques, while having reasonable runtimes. It considers the whole training dataset with each iteration to effectively learn rankings for large collections.
ES-Rank: Evolutionary Strategy Learning to Rank Approach
1. ES-Rank: Evolutionary Strategy Learning to Rank Approach
Osman Ali Sadek Ibrahim, Dario Landa-Silva
ASAP Research Group
The University of Nottingham, UK
April, 2017
2. Outline of the Presentation
Background
  Information Retrieval System Architecture
  Learning to Rank Architecture
Problem Statement
The Proposed ES-Rank
  High-Level Overview
  Detailed Steps
Experimental Studies
  Experimental Study Settings
  Experimental Results
Conclusion
3. Information Retrieval (IR) System Architecture
Figure 1: Typical IR System Architecture
4. Typical Structure of an LTR Dataset
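The figure for this slide did not survive extraction. As a hedged illustration only: the LETOR datasets used later in the deck are distributed as text rows of the form `<label> qid:<id> <feature>:<value> ... # comment`. The sketch below parses one such row; the names `LtrRow` and `parse_letor_line` and the sample values are hypothetical and chosen for this example.

```python
from typing import Dict, NamedTuple


class LtrRow(NamedTuple):
    label: int                  # relevance label, e.g. 0, 1 or 2
    qid: str                    # query identifier
    features: Dict[int, float]  # feature id -> feature value


def parse_letor_line(line: str) -> LtrRow:
    """Parse one row '<label> qid:<id> <fid>:<value> ... # comment'."""
    body = line.split("#", 1)[0].split()  # drop the trailing comment
    label = int(body[0])
    qid = body[1].split(":", 1)[1]
    features = {}
    for token in body[2:]:
        fid, value = token.split(":", 1)
        features[int(fid)] = float(value)
    return LtrRow(label, qid, features)


# Made-up row for illustration (not taken from any dataset).
row = parse_letor_line("2 qid:10 1:0.25 2:0.0 3:0.5 # docid = example-doc")
print(row)
```

Each row is one query-document pair, so rows sharing a `qid` form the list of documents to be ranked for that query.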
In general, there are three categories of LTR methods:
point-wise, pair-wise and list-wise. Hybrid approaches
have also been proposed.
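To make the three categories concrete, the sketch below computes one representative loss of each kind for a single query. These particular losses (squared error, fraction of misordered pairs, a ListNet-style softmax cross-entropy) are assumptions picked for illustration; the slides do not say which losses the compared methods use, and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence


def pointwise_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """Point-wise: each document is judged on its own (mean squared error)."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)


def pairwise_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """Pair-wise: fraction of document pairs placed in the wrong order."""
    pairs = [(i, j) for i, j in combinations(range(len(labels)), 2)
             if labels[i] != labels[j]]
    wrong = sum(1 for i, j in pairs
                if (scores[i] - scores[j]) * (labels[i] - labels[j]) <= 0)
    return wrong / len(pairs) if pairs else 0.0


def _softmax(values: Sequence[float]) -> List[float]:
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """List-wise: cross-entropy between label and score distributions over the whole list."""
    p, q = _softmax(labels), _softmax(scores)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


scores = [2.0, 1.0, 0.0]   # model scores for three documents of one query
labels = [2, 0, 1]         # their relevance labels
print(pointwise_loss(scores, labels), pairwise_loss(scores, labels),
      listwise_loss(scores, labels))
```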
6. Problem Statement
Some of the limitations of existing Learning to Rank techniques:
• Not considering the whole training dataset in each learning iteration (performance vs. computational effort).
• Computational runtime and memory size requirements for some of the big datasets, for example with the rt-rank and Ranklib packages.
• The large variations in the characteristics of big datasets demand their full consideration for more effective learning.
7. Proposed Approach (high-level description)
Flowchart steps:
• Document Collection OR WWW
• Start IR System Using Various Query-Document Features
• Users search the IR system for Information
• Gathering Relevance Feedback by IR System from Users to Create the IR Query-Document Pairs
• Use Query-Document Pairs Training Set for Evolving a Ranking Model using ES-Rank
8. The Proposed Approach (ES-Rank)
Flowchart steps:
• Start
• Initialize Chromosomes for Parent and Offspring
• Previous mutation good and stopping criterion not reached?
  YES: Mutate chromosome on R genes using previous mutation step
  NO: Mutate chromosome on R random genes with Gaussian*exp(Cauchy) random number
• Mutated chromosome better than current chromosome and stop criterion not reached?
  YES: Mutated chromosome becomes the current chromosome
• Stopping criterion reached?
  YES: End
  NO: repeat from the mutation step
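The flowchart can be read as the loop below. This is a minimal sketch under stated assumptions, not the authors' implementation: the all-zero initialization, drawing a separate step per mutated gene, clamping the Cauchy draw, the toy fitness function and every name (`es_rank_sketch`, `mutation_step`, `toy_fitness`) are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def mutation_step(rng: random.Random) -> float:
    """Draw Gaussian(0,1) * exp(Cauchy(0,1)).

    Clamping the Cauchy draw to [-20, 20] is an assumption added here
    to keep exp() from overflowing; the slides do not mention it.
    """
    gaussian = rng.gauss(0.0, 1.0)
    cauchy = math.tan(math.pi * (rng.random() - 0.5))
    return gaussian * math.exp(max(-20.0, min(20.0, cauchy)))


def es_rank_sketch(
    fitness: Callable[[Sequence[float]], float],
    num_genes: int,
    iterations: int = 1300,
    seed: int = 0,
) -> Tuple[List[float], float]:
    rng = random.Random(seed)
    parent = [0.0] * num_genes  # assumption: start from all-zero weights
    parent_fitness = fitness(parent)
    moves: List[Tuple[int, float]] = []  # (gene index, step) of the last mutation
    previous_was_good = False

    for _ in range(iterations):  # stopping criterion: fixed iteration budget
        if not previous_was_good:
            # Pick R random genes and fresh Gaussian*exp(Cauchy) steps.
            r = rng.randint(1, num_genes)
            genes = rng.sample(range(num_genes), r)
            moves = [(g, mutation_step(rng)) for g in genes]
        # Otherwise reuse the previous genes and steps unchanged.
        offspring = list(parent)
        for gene, step in moves:
            offspring[gene] += step
        offspring_fitness = fitness(offspring)
        if offspring_fitness > parent_fitness:
            parent, parent_fitness = offspring, offspring_fitness
            previous_was_good = True
        else:
            previous_was_good = False
    return parent, parent_fitness


def toy_fitness(weights: Sequence[float]) -> float:
    """Stand-in objective: closeness to a made-up target weight vector."""
    target = [1.0, -2.0, 0.5]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


best_weights, best_fitness = es_rank_sketch(toy_fitness, num_genes=3)
print(best_weights, best_fitness)
```

Because the parent is replaced only when the offspring is strictly better, the parent's fitness never decreases from one iteration to the next. In a ranking setting, `fitness` would score the weight vector on the whole training set with a metric such as MAP or NDCG.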
9. The Proposed Approach (ES-Rank)
• Initialization
• Stopping Criterion
• Objective Function
• Random Mutation with adaptive procedure according to the previous iteration
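The deck reports results in MAP and NDCG@10, so an objective function of this kind could be computed as sketched below for one query. The gain `2^label - 1`, the `log2` discount and treating any label above 0 as relevant are common conventions assumed here, not details confirmed by the slides; the function names are hypothetical.

```python
import math
from typing import Sequence


def dcg_at_k(ranked_labels: Sequence[int], k: int) -> float:
    """Discounted cumulative gain of the top-k labels, in ranked order."""
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(ranked_labels[:k]))


def ndcg_at_k(ranked_labels: Sequence[int], k: int = 10) -> float:
    """DCG normalized by the DCG of the ideal (label-sorted) ranking."""
    ideal = dcg_at_k(sorted(ranked_labels, reverse=True), k)
    return dcg_at_k(ranked_labels, k) / ideal if ideal > 0 else 0.0


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision for one query; assumes label > 0 means relevant."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_labels, start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Labels of one query's documents, listed in the order the model ranked them.
ranked = [2, 0, 1]
print(ndcg_at_k(ranked), average_precision(ranked))
```

MAP is then the mean of `average_precision` over all queries, and the same averaging applies to NDCG@10.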
10. Experimental Study Settings
• Each dataset includes the training, validation and test sets.
• ES-Rank and the other compared methods are applied to the training set in
order to evolve a linear ranking function.
• The performance of the linear ranking function is assessed using the test set
to obtain the predictive performance of the learning algorithm.
• The comparison used three well-known LETOR datasets and two evaluation
fitness metrics.
LETOR Datasets used in experiments
Dataset      Queries  Query-URL Pairs  Features  Relevance Labels  No. Folds
MQ2007       1692     69623            46        {0,1,2}           5
MQ2008       784      15211            46        {0,1,2}           5
MSLR-WEB10K  10000    1200192          136       {0,1,2,3,4}       5
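The linear ranking function mentioned in the settings above scores each query-document pair as a weighted sum of its features and sorts documents by that score. A minimal sketch, with hypothetical names and made-up three-feature vectors (the datasets in the table have 46 or 136 features):

```python
from typing import List, Sequence, Tuple


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear ranking function: dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def rank_documents(
    weights: Sequence[float],
    docs: Sequence[Tuple[str, Sequence[float]]],
) -> List[str]:
    """Return document ids for one query, highest score first."""
    ordered = sorted(docs, key=lambda d: score(weights, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ordered]


weights = [0.5, -1.0, 2.0]  # one weight per feature (the evolved chromosome)
docs = [
    ("d1", [1.0, 0.0, 0.0]),
    ("d2", [0.0, 1.0, 1.0]),
    ("d3", [0.0, 0.0, 0.1]),
]
print(rank_documents(weights, docs))
```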
11. Experimental Study Settings
• The experimental study conducted was a comparison between ES-Rank and
14 other Evolutionary and Machine Learning techniques.
• The default parameter settings suggested for the 14 Evolutionary and
Machine Learning techniques were used as described in well-known packages
(Ranklib, rt-Rank, LAGEP, SVM-Rank, Sofia-ml).
• ES-Rank uses a (1+1)-Evolutionary Strategy with a Gaussian*exp(Cauchy) random
number as the mutation step. The mutation has a random probability
(Random Walk) that adapts according to the performance in the previous
iteration. The number of evolving iterations was set to 1300 in these
experiments.
16. Conclusion
There are issues in improving IR using learning to rank approaches,
particularly on large document collections.
The proposed ES-Rank performs well against fourteen other techniques in
terms of NDCG@10 and MAP.
The proposed ES-Rank considers the whole instance (Query-Document Pairs)
in each learning iteration which helps to improve the performance of the
evolved ranking model.
The computational run-time of ES-Rank is reasonable given its performance in
terms of accuracy (third fastest after Linear Regression and MART).