Preference Learning: A Tutorial Introduction

Eyke Hüllermeier
Knowledge Engineering & Bioinformatics Lab
Dept. of Mathematics and Computer Science
Marburg University, Germany

Johannes Fürnkranz
Knowledge Engineering
Dept. of Computer Science
Technical University Darmstadt, Germany

DS 2011, Espoo, Finland, Oct 2011
DS 2011 Tutorial on Preference Learning | Part 1 | J. Fürnkranz & E. Hüllermeier

What is Preference Learning?

§ Preference learning is an emerging subfield of machine learning.
§ Roughly speaking, it deals with the learning of (predictive) preference models from observed (or extracted) preference information.

[Diagram: Preference Learning at the intersection of MACHINE LEARNING (computer science, artificial intelligence) and PREFERENCE MODELING and DECISION ANALYSIS (operations research; social sciences, i.e., voting and choice theory; economics and decision theory).]
Workshops and Related Events

§ NIPS–01: New Methods for Preference Elicitation
§ NIPS–02: Beyond Classification and Regression: Learning Rankings, Preferences, Equality Predicates, and Other Structures
§ KI–03: Preference Learning: Models, Methods, Applications
§ NIPS–04: Learning With Structured Outputs
§ NIPS–05: Workshop on Learning to Rank
§ IJCAI–05: Advances in Preference Handling
§ SIGIR 07–10: Workshop on Learning to Rank for Information Retrieval
§ ECML/PKDD 08–10: Workshop on Preference Learning
§ NIPS–09: Workshop on Advances in Ranking
§ American Institute of Mathematics Workshop in Summer 2010: The Mathematics of Ranking
§ NIPS–11: Workshop on Choice Models and Preference Learning
Preferences in Artificial Intelligence

User preferences play a key role in various fields of application:
- recommender systems,
- adaptive user interfaces,
- adaptive retrieval systems,
- autonomous agents (electronic commerce),
- games, …

Preferences in AI research:
- preference representation (CP-nets, GAI networks, logical representations, fuzzy constraints, …)
- reasoning with preferences (decision theory, constraint satisfaction, non-monotonic reasoning, …)
- preference acquisition (preference elicitation, preference learning, ...)

More generally, "preferences" is a key topic in current AI research.
AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
4. Conclusions
Preference Learning

Preference learning problems can be distinguished along several problem dimensions, including

§ representation of preferences, type of preference model:
- utility function (ordinal, numeric),
- preference relation (partial order, ranking, …),
- logical representation, …
§ description of individuals/users and alternatives/items:
- identifier, feature vector, structured object, …
§ type of training input:
- direct or indirect feedback,
- complete or incomplete relations,
- utilities, …
§ …
Preference Learning

[Diagram: a taxonomy of preferences. Preferences are either absolute (assessing single items) or relative (comparing items). Absolute preferences are binary (e.g., A=1, B=1, C=0, D=0; → classification/ranking) or gradual, the latter either ordinal (e.g., A=+, B=+, C=-, D=0) or numeric (e.g., A=.9, B=.8, C=.1, D=.3; → (ordinal) regression). Relative preferences form either a total order (e.g., A ≻ B ≻ C ≻ D) or a partial order.]
Structure of this Overview

(1) Preference learning as an extension of conventional supervised learning: Learn a mapping that maps instances to preference models (→ structured/complex output prediction).

(2) Other settings (object ranking, instance ranking, CF, …)
Structure of this Overview

(1) Preference learning as an extension of conventional supervised learning: Learn a mapping that maps instances to preference models (→ structured/complex output prediction).

Instances are typically (though not necessarily) characterized in terms of a feature vector. The output space consists of preference models over a fixed set of alternatives (classes, labels, …) represented in terms of an identifier.

→ extensions of multi-class classification
Multilabel Classification [Tsoumakas & Katakis 2007]

Binary preferences on a fixed set of items: liked or disliked.

Training data (features X1–X4, binary relevance of labels A–D):

X1    X2  X3   X4  |  A  B  C  D
0.34   0  10  174  |  0  1  1  0
1.45   0  32  277  |  0  1  0  1
1.22   1  46  421  |  0  0  0  1
0.74   1  25  165  |  0  1  1  1
0.95   1  72  273  |  1  0  1  0
1.04   0  33  158  |  1  1  1  0

For a query instance (0.92, 1, 81, 382):
Prediction:    A=0, B=1, C=0, D=1
Ground truth:  A=1, B=1, C=0, D=1
The LOSS compares the predicted and the true label subsets.
Multilabel Ranking

Binary preferences on a fixed set of items (liked or disliked) as training input; the prediction is a ranking of all items.

Training data (features X1–X4, binary relevance of labels A–D):

X1    X2  X3   X4  |  A  B  C  D
0.34   0  10  174  |  0  1  1  0
1.45   0  32  277  |  0  1  0  1
1.22   1  46  421  |  0  0  0  1
0.74   1  25  165  |  0  1  1  1
0.95   1  72  273  |  1  0  1  0
1.04   0  33  158  |  1  1  1  0

For a query instance (0.92, 1, 81, 382):
Prediction (ranks): A=4, B=1, C=3, D=2, i.e., B ≻ D ≻ C ≻ A
Ground truth:       A=1, B=1, C=0, D=1
The LOSS compares the predicted ranking with the true label subset.
Graded Multilabel Classification [Cheng et al. 2010]

Ordinal preferences on a fixed set of items: liked, disliked, or something in-between.

Training data (features X1–X4, ordinal relevance of labels A–D):

X1    X2  X3   X4  |  A   B   C   D
0.34   0  10  174  |  --  +   ++  0
1.45   0  32  277  |  0   ++  --  +
1.22   1  46  421  |  --  --  0   +
0.74   1  25  165  |  0   +   +   ++
0.95   1  72  273  |  +   0   ++  --
1.04   0  33  158  |  +   +   ++  --

For a query instance (0.92, 1, 81, 382):
Prediction:    A=--, B=+, C=0, D=++
Ground truth:  A=0, B=++, C=--, D=+
The LOSS compares the two graded label assignments.
Graded Multilabel Ranking

Ordinal preferences on a fixed set of items (liked, disliked, or something in-between) as training input; the prediction is a ranking of all items.

Training data (features X1–X4, ordinal relevance of labels A–D):

X1    X2  X3   X4  |  A   B   C   D
0.34   0  10  174  |  --  +   ++  0
1.45   0  32  277  |  0   ++  --  +
1.22   1  46  421  |  --  --  0   +
0.74   1  25  165  |  0   +   +   ++
0.95   1  72  273  |  +   0   ++  --
1.04   0  33  158  |  +   +   ++  --

For a query instance (0.92, 1, 81, 382):
Prediction (ranks): A=4, B=1, C=3, D=2, i.e., B ≻ D ≻ C ≻ A
Ground truth:       A=0, B=++, C=--, D=+
The LOSS compares the predicted ranking with the graded ground truth.
Label Ranking [Hüllermeier et al. 2008]

Instances are associated with pairwise preferences between labels.

Training data (features X1–X4, pairwise label preferences):

X1    X2  X3   X4  |  Preferences
0.34   0  10  174  |  A ≻ B, B ≻ C, C ≻ D
1.45   0  32  277  |  B ≻ C
1.22   1  46  421  |  B ≻ D, A ≻ D, C ≻ D, A ≻ C
0.74   1  25  165  |  C ≻ A, C ≻ D, A ≻ B
0.95   1  72  273  |  B ≻ D, A ≻ D
1.04   0  33  158  |  D ≻ A, A ≻ B, C ≻ B, A ≻ C

For a query instance (0.92, 1, 81, 382), the prediction is a ranking of all labels:
Prediction (ranks):   A=4, B=1, C=3, D=2, i.e., B ≻ D ≻ C ≻ A
Ground truth (ranks): A=2, B=1, C=3, D=4, i.e., B ≻ A ≻ C ≻ D
The LOSS compares the two rankings.
Calibrated Label Ranking [Fürnkranz et al. 2008]

Combining absolute and relative evaluation: the predicted ranking contains a calibration point (a virtual label) that splits the labels into two sets,

a ≻ b ≻ c ≻ d | e ≻ f ≻ g

where the labels ranked above the calibration point are relevant (positive, liked) and those below it are irrelevant (negative, disliked).
Structure of this Overview

(1) Preference learning as an extension of conventional supervised learning: Learn a mapping that maps instances to preference models (→ structured output prediction).

(2) Other settings (no clear distinction between input/output space): object ranking, instance ranking, collaborative filtering.
Object Ranking [Cohen et al. 99]

Training: pairwise preferences between objects (instances).
Prediction: ranking a new set of objects.
Ground truth: a ranking, a top-ranking, or a subset of relevant objects.
Instance Ranking [Fürnkranz et al. 2009]

Absolute preferences on an ordinal scale.

Training data (features X1–X4, ordinal class):

X1    X2  X3   X4  |  class
0.34   0  10  174  |  --
1.45   0  32  277  |  0
0.74   1  25  165  |  ++
…      …   …    …  |  …
0.95   1  72  273  |  +

Prediction: a ranking of a new set of objects.
Ground truth: the ordinal classes of these objects (e.g., +, 0, ++, ++, --, +, 0, +, --, 0, 0, --, --).
Instance Ranking [Fürnkranz et al. 2009]

Query: a set of instances to be ranked (true labels are unknown). The predicted ranking is obtained, e.g., through sorting by estimated score, from most likely good to most likely bad; a ranking error occurs whenever a worse instance is ranked above a better one.

This is an extension of AUC maximization to the polytomous case, in which instances are rated on an ordinal scale such as {bad, medium, good}.
Collaborative Filtering [Goldberg et al. 1992]

Inputs and outputs as identifiers, absolute preferences in terms of ordinal degrees (1: very bad, 2: bad, 3: fair, 4: good, 5: excellent).

Users × products rating matrix (excerpt; blank cells are unobserved, "?" marks ratings to be predicted):

       P1  P2  P3  …  P38  …  P88  P89  P90
U1      1       4  …       …              3
U2      2       2  …       …              1
…
U46     ?   2   ?  …   ?   …   ?    ?    4
…
U98     5          …       …             4
U99     1          …       …             2
Preference Learning Tasks

Representation (input / output) and type of preference information (training / prediction / ground truth):

task                              | input      | output     | training         | prediction       | ground truth
collaborative filtering           | identifier | identifier | absolute ordinal | absolute ordinal | absolute ordinal
multilabel classification         | feature    | identifier | absolute binary  | absolute binary  | absolute binary
multilabel ranking                | feature    | identifier | absolute binary  | ranking          | absolute binary
graded multilabel classification  | feature    | identifier | absolute ordinal | absolute ordinal | absolute ordinal
label ranking                     | feature    | identifier | relative binary  | ranking          | ranking
object ranking                    | feature    | --         | relative binary  | ranking          | ranking or subset
instance ranking                  | feature    | identifier | absolute ordinal | ranking          | absolute ordinal

Two main directions: (1) ranking and variants, (2) generalizations of classification.
Loss Functions

Things to be compared (prediction vs. ground truth):

prediction                       |  ground truth
absolute utility degree          |  absolute utility degree        (standard comparison of scalar predictions)
subset of preferred items        |  subset of preferred items      (non-standard comparisons, from here on)
subset of preferred items        |  ranking of items
fuzzy subset of preferred items  |  fuzzy subset of preferred items
ranking of items                 |  ranking of items
ranking of items                 |  ordered partition of items
References

§ W. Cheng, K. Dembczynski and E. Hüllermeier. Graded Multilabel Classification: The Ordinal Case. ICML-2010, Haifa, Israel, 2010.
§ W. Cheng and E. Hüllermeier. Predicting partial orders: Ranking with abstention. ECML/PKDD-2010, Barcelona, 2010.
§ Y. Chevaleyre, F. Koriche, J. Lang, J. Mengin, B. Zanuttini. Learning ordinal preferences on multiattribute domains: The case of CP-nets. In: J. Fürnkranz and E. Hüllermeier (eds.) Preference Learning, Springer-Verlag, 2010.
§ W.W. Cohen, R.E. Schapire and Y. Singer. Learning to order things. Journal of Artificial Intelligence Research, 10:243–270, 1999.
§ J. Fürnkranz, E. Hüllermeier, E. Mencia, and K. Brinker. Multilabel Classification via Calibrated Label Ranking. Machine Learning 73(2):133-153, 2008.
§ J. Fürnkranz, E. Hüllermeier and S. Vanderlooy. Binary decomposition methods for multipartite ranking. Proc. ECML-2009, Bled, Slovenia, 2009.
§ D. Goldberg, D. Nichols, B.M. Oki and D. Terry. Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12):61–70, 1992.
§ E. Hüllermeier, J. Fürnkranz, W. Cheng and K. Brinker. Label ranking by learning pairwise preferences. Artificial Intelligence, 172:1897–1916, 2008.
§ G. Tsoumakas and I. Katakis. Multi-label classification: An overview. Int. J. Data Warehouse and Mining, 3:1–13, 2007.
§ A.K. Menon and C. Elkan. Predicting labels for dyadic data. Data Mining and Knowledge Discovery, 21(2), 2010.
DS-2011 Tutorial on Preference Learning | Part 2 | E. Hüllermeier & J. Fürnkranz

AGENDA

1. Preference Learning Tasks
2. Loss Functions
a. Evaluation of Rankings
b. Weighted Measures
c. Evaluation of Bipartite Rankings
d. Evaluation of Partial Rankings
3. Preference Learning Techniques
4. Conclusions
Rank Evaluation Measures

In the following, we do not discriminate between different ranking scenarios; we use the term "items" for both objects and labels. All measures are applicable to both scenarios, though they sometimes have different names depending on the context.

Label Ranking
- the measure is applied to the ranking of the labels of each example
- averaged over all examples

Object Ranking
- the measure is applied to the ranking of a set of objects
- we may need to average over different sets of objects which have disjoint preference graphs
- e.g., different sets of query / answer set pairs in information retrieval
Ranking Errors

Given:
- a set of items X = {x1, …, xc} to rank (items can be objects or labels).
  Example: X = {A, B, C, D, E}
- a target ranking r.
  Example: E ≻ B ≻ C ≻ A ≻ D
- a predicted ranking r̂.
  Example: A ≻ B ≻ E ≻ C ≻ D

Compute: a value d(r, r̂) that measures the distance between the two rankings.
Notation

- r and r̂ are functions from X → ℕ returning the (target resp. predicted) rank of an item x.
  Example: r(A) = 4, r̂(A) = 1
- the inverse functions r⁻¹, r̂⁻¹ : ℕ → X return the item at a certain position.
  Example: r̂⁻¹(1) = A, r⁻¹(4) = A
- as a short-hand for r ∘ r̂⁻¹, we also define the function R : ℕ → ℕ; R(i) returns the true rank of the i-th item in the predicted ranking.
  Example: R(1) = r(r̂⁻¹(1)) = 4
Spearman's Footrule

Key idea: measure the sum of absolute differences between ranks:

D_SF(r, r̂) = Σ_{i=1..c} |r(x_i) − r̂(x_i)| = Σ_{i=1..c} |i − R(i)| = Σ_{i=1..c} d(x_i)

Example (r: E ≻ B ≻ C ≻ A ≻ D, r̂: A ≻ B ≻ E ≻ C ≻ D):
d(A) = |1 − 4| = 3, d(B) = 0, d(C) = 1, d(D) = 0, d(E) = 2
Σ d(x_i) = 3 + 0 + 1 + 0 + 2 = 6
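The computation above can be sketched in a few lines (a minimal illustration; rankings are assumed to be given as lists of items, best first, and the function name is made up):

```python
def footrule(target, predicted):
    """Spearman's footrule: sum of absolute rank differences.

    Both rankings are lists of the same items, best item first;
    ranks are the 1-based positions in these lists.
    """
    rank_t = {x: i + 1 for i, x in enumerate(target)}
    rank_p = {x: i + 1 for i, x in enumerate(predicted)}
    return sum(abs(rank_t[x] - rank_p[x]) for x in target)

# Running example: target E > B > C > A > D, predicted A > B > E > C > D
print(footrule(list("EBCAD"), list("ABECD")))  # 6
```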
Spearman Distance

Key idea: measure the sum of squared differences between ranks:

D_S(r, r̂) = Σ_{i=1..c} (r(x_i) − r̂(x_i))² = Σ_{i=1..c} (i − R(i))² = Σ_{i=1..c} d(x_i)²

Example: Σ d(x_i)² = 3² + 0 + 1² + 0 + 2² = 14

Value range:
min D_S(r, r̂) = 0
max D_S(r, r̂) = Σ_{i=1..c} (c + 1 − 2i)² = c·(c² − 1)/3

→ Spearman Rank Correlation Coefficient:

ρ(r, r̂) = 1 − 6·D_S(r, r̂) / (c·(c² − 1)) ∈ [−1, 1]
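The distance and the correlation coefficient can be sketched as follows (illustrative helper names; rankings as best-first lists):

```python
def spearman_distance(target, predicted):
    """Spearman distance: sum of squared rank differences."""
    rank_t = {x: i + 1 for i, x in enumerate(target)}
    rank_p = {x: i + 1 for i, x in enumerate(predicted)}
    return sum((rank_t[x] - rank_p[x]) ** 2 for x in target)

def spearman_rho(target, predicted):
    """Spearman rank correlation: 1 - 6*D_S / (c*(c**2 - 1)), in [-1, 1]."""
    c = len(target)
    return 1 - 6 * spearman_distance(target, predicted) / (c * (c ** 2 - 1))

print(spearman_distance(list("EBCAD"), list("ABECD")))  # 14
print(spearman_rho(list("EBCAD"), list("ABECD")))       # ~0.3
```

Reversing a ranking of 5 items gives the maximal distance 5·24/3 = 40 and a correlation of −1, as the value-range formulas predict.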
Kendall's Distance

Key idea: the number of item pairs that are inverted in the predicted ranking:

D(r, r̂) = |{(i, j) | r(x_i) < r(x_j) ∧ r̂(x_i) > r̂(x_j)}|

Example: D(r, r̂) = 4

Value range:
min D(r, r̂) = 0
max D(r, r̂) = c·(c − 1)/2

→ Kendall's tau:

τ(r, r̂) = 1 − 4·D(r, r̂) / (c·(c − 1)) ∈ [−1, 1]
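A minimal sketch of the pair-counting definition (illustrative names; rankings as best-first lists):

```python
from itertools import combinations

def kendall_distance(target, predicted):
    """Number of item pairs ordered one way by the target ranking
    and the other way by the predicted ranking."""
    rank_t = {x: i for i, x in enumerate(target)}
    rank_p = {x: i for i, x in enumerate(predicted)}
    return sum(
        1
        for x, y in combinations(target, 2)
        if (rank_t[x] - rank_t[y]) * (rank_p[x] - rank_p[y]) < 0
    )

def kendall_tau(target, predicted):
    """Kendall's tau: 1 - 4*D / (c*(c - 1)), in [-1, 1]."""
    c = len(target)
    return 1 - 4 * kendall_distance(target, predicted) / (c * (c - 1))

print(kendall_distance(list("EBCAD"), list("ABECD")))  # 4
```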
AGENDA

1. Preference Learning Tasks
2. Loss Functions
a. Evaluation of Rankings
b. Weighted Measures
c. Evaluation of Bipartite Rankings
d. Evaluation of Partial Rankings
3. Preference Learning Techniques
4. Conclusions
Weighted Ranking Errors

The previous ranking measures give equal weight to all ranking positions, i.e., differences in the first ranking positions have the same effect as differences in the last ranking positions: exchanging two items near the top of the ranking yields the same distance as exchanging two items near the bottom.

In many applications this is not desirable:
- ranking of search results
- ranking of product recommendations
- ranking of labels for classification
- ...

⇒ Higher ranking positions should be given more weight.
Position Error

Key idea: in many applications we are interested in providing a ranking where the target item appears as high as possible in the predicted ranking, e.g., when ranking a set of actions for the next step in a plan. The error is the number of wrong items that are predicted before the target item:

D_PE(r, r̂) = r̂(arg min_{x ∈ X} r(x)) − 1

Example (target item E, predicted at position 3): D_PE(r, r̂) = 2

Note: this is equivalent to Spearman's footrule with all non-target weights set to 0:

D_PE(r, r̂) = Σ_{i=1..c} w_i · d(x_i)   with   w_i = ⟦x_i = arg min_{x ∈ X} r(x)⟧
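With rankings as best-first lists, the position error reduces to a list lookup (a minimal sketch with an illustrative function name):

```python
def position_error(target, predicted):
    """Number of items ranked before the target's top item
    in the predicted ranking."""
    top = target[0]              # the item with true rank 1
    return predicted.index(top)  # 0-based index = predicted rank - 1

# Running example: E is the target item, predicted at position 3.
print(position_error(list("EBCAD"), list("ABECD")))  # 2
```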
Discounted Error

Higher positions in the predicted ranking get a higher weight than lower positions:

D_DR(r, r̂) = Σ_{i=1..c} w_i · d(x_i)   with   w_i = 1 / log(r̂(x_i) + 1)

Example (items taken in predicted order A, B, E, C, D):
D_DR(r, r̂) = 3/log 2 + 0 + 2/log 4 + 1/log 5 + 0
(Normalized) Discounted Cumulative Gain

A "positive" version of the discounted error: Discounted Cumulative Gain (DCG)

DCG(r, r̂) = Σ_{i=1..c} (c − R(i)) / log(i + 1)

Maximum possible value: obtained when the predicted ranking is correct, i.e., ∀i: i = R(i).

Ideal Discounted Cumulative Gain (IDCG):

IDCG = Σ_{i=1..c} (c − i) / log(i + 1)

Normalized DCG (NDCG):

NDCG(r, r̂) = DCG(r, r̂) / IDCG

Example:
NDCG(r, r̂) = (1/log 2 + 3/log 3 + 4/log 4 + 2/log 5 + 0/log 6) / (4/log 2 + 3/log 3 + 2/log 4 + 1/log 5 + 0/log 6)
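A direct sketch of these formulas (illustrative function name; the log base cancels in the DCG/IDCG ratio, so the natural log is used here):

```python
from math import log

def ndcg(target, predicted):
    """NDCG with gain c - R(i) and discount 1/log(i + 1),
    for rankings given as best-first lists."""
    c = len(target)
    rank_t = {x: i + 1 for i, x in enumerate(target)}  # true rank R
    dcg = sum((c - rank_t[x]) / log(i + 2) for i, x in enumerate(predicted))
    idcg = sum((c - i) / log(i + 1) for i in range(1, c + 1))  # perfect ranking
    return dcg / idcg

print(ndcg(list("EBCAD"), list("ABECD")))  # ~0.786
```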
AGENDA

1. Preference Learning Tasks
2. Loss Functions
a. Evaluation of Rankings
b. Weighted Measures
c. Evaluation of Bipartite Rankings
d. Evaluation of Partial Rankings
3. Preference Learning Techniques
4. Conclusions
Bipartite Rankings

The target ranking is not totally ordered but a bipartite graph. The two partitions may be viewed as preference levels L = {0, 1}: all c1 items of level 1 are preferred over all c0 items of level 0.

We now have fewer preferences:
- for a total order: c·(c − 1)/2
- for a bipartite graph: c1·(c − c1)
Evaluating Partial Target Rankings

Many measures can be directly adapted from total target rankings to partial target rankings.

Recall Kendall's distance, the number of item pairs that are inverted in the predicted ranking:

D(r, r̂) = |{(i, j) | r(x_i) < r(x_j) ∧ r̂(x_i) > r̂(x_j)}|

- it can be directly used
- in case of normalization, we have to consider that fewer item pairs satisfy r(x_i) < r(x_j)

Area under the ROC curve (AUC):
- the AUC is the fraction of pairs (p, n) for which the predicted score s(p) > s(n); the Mann-Whitney statistic is the absolute number
- this is 1 minus the normalized Kendall's distance for a bipartite preference graph with L = {p, n}

Example: D(r, r̂) = 2, AUC(r, r̂) = 4/6
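The pair-counting view of the AUC can be sketched directly (a minimal illustration with made-up names; ties are ignored here). The toy data reproduces the example above: relevant items {E, B}, and scores inducing the predicted ranking A ≻ B ≻ E ≻ C ≻ D, so 2 of the 6 positive/negative pairs are inverted:

```python
def auc(positives, negatives, scores):
    """AUC as the fraction of (positive, negative) pairs with s(p) > s(n)."""
    pairs = [(p, n) for p in positives for n in negatives]
    return sum(1 for p, n in pairs if scores[p] > scores[n]) / len(pairs)

scores = {"A": 5, "B": 4, "E": 3, "C": 2, "D": 1}
print(auc(["E", "B"], ["A", "C", "D"], scores))  # 4/6
```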
Evaluating Multipartite Rankings

Multipartite rankings: like bipartite rankings, but the target ranking r consists of multiple relevance levels L = {1, …, l}, where l ≤ c. A total ranking is the special case where each level has exactly one item.

Number of preferences (c_i is the number of items in level i):

Σ_{i<j} c_i·c_j ≤ (c²/2)·(1 − 1/l)

C-Index [Gönen & Heller, 2005]: a straightforward generalization of the AUC, the fraction of item pairs (x_i, x_j) from different relevance levels that the predicted ranking orders correctly.

Example: D(r, r̂) = 3, C-Index(r, r̂) = 5/8
Evaluating Multipartite Rankings

The C-index can be rewritten as a weighted sum of pairwise AUCs:

C-Index(r, r̂) = (1 / Σ_{i<j} c_i·c_j) · Σ_{i<j} c_i·c_j · AUC(r_{i,j}, r̂_{i,j})

where r_{i,j} and r̂_{i,j} are the rankings r and r̂ restricted to levels i and j.

The Jonckheere-Terpstra statistic is an unweighted sum of pairwise AUCs:

m-AUC = (2 / (l·(l − 1))) · Σ_{i<j} AUC(r_{i,j}, r̂_{i,j})

This is equivalent to the well-known multi-class extension of the AUC [Hand & Till, MLJ 2001].

Note: C-Index and m-AUC can be optimized by optimizing pairwise AUCs.
Normalized Discounted Cumulative Gain [Järvelin & Kekäläinen, 2002]

The original formulation of (normalized) discounted cumulative gain refers to this multipartite setting: the sum of the true relevance levels of the items, each item weighted (discounted) by its rank in the predicted ranking:

DCG(r, r̂) = Σ_{i=1..c} l_i / log(i + 1)

where l_i is the relevance level of the item at position i of the predicted ranking.

Examples:
- retrieval of relevant or irrelevant pages: 2 relevance levels
- movie recommendation: 5 relevance levels
AGENDA

1. Preference Learning Tasks
2. Loss Functions
a. Evaluation of Rankings
b. Weighted Measures
c. Evaluation of Bipartite Rankings
d. Evaluation of Partial Rankings
3. Preference Learning Techniques
4. Conclusions
Evaluating Partial Structures in the Predicted Ranking

For fixed types of partial structures, we have conventional measures:
- bipartite graphs → binary classification: accuracy, recall, precision, F1, etc.; these can also be used when the items are labels, e.g., accuracy on the set of labels for multilabel classification
- multipartite graphs → ordinal classification: multiclass classification measures (accuracy, error, etc.) and regression measures (sum of squared errors, etc.)

For general partial structures:
- some measures can be directly used on the reduced set of target preferences: Kendall's distance, the Gamma coefficient
- we can also use set measures on the set of binary preferences, since both the predicted and the target ranking consist of a set of binary preferences; e.g., the Jaccard coefficient: size of intersection over size of union of the binary preferences in the two sets
Gamma Coefficient

Key idea: the normalized difference between the number of incorrectly ranked pairs (Kendall's distance) and the number of correctly ranked pairs.

Gamma Coefficient [Goodman & Kruskal, 1979]:

d = D(r, r̂)
d̄ = |{(i, j) | r(x_i) < r(x_j) ∧ r̂(x_i) < r̂(x_j)}|

γ(r, r̂) = (d̄ − d) / (d̄ + d) ∈ [−1, 1]

Example: γ(r, r̂) = (2 − 1) / (2 + 1) = 1/3

γ is identical to Kendall's tau if both rankings are total, i.e., if d + d̄ = c·(c − 1)/2.
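For a partial target given as a set of pairwise preferences, the coefficient can be sketched as follows (illustrative names and toy data, chosen to reproduce the (2 − 1)/(2 + 1) example: two concordant pairs, one discordant, against the predicted ranking A ≻ B ≻ E ≻ C ≻ D):

```python
def gamma(target_pairs, rank_p):
    """Goodman-Kruskal gamma over a (possibly partial) set of target preferences.

    target_pairs: pairs (x, y) meaning x should be ranked before y;
    rank_p: predicted rank of each item (lower = better).
    """
    concordant = sum(1 for x, y in target_pairs if rank_p[x] < rank_p[y])
    discordant = sum(1 for x, y in target_pairs if rank_p[x] > rank_p[y])
    return (concordant - discordant) / (concordant + discordant)

print(gamma([("E", "C"), ("B", "C"), ("B", "A")],
            {"A": 1, "B": 2, "E": 3, "C": 4, "D": 5}))  # 1/3
```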
References

Cheng W., Rademaker M., De Baets B., Hüllermeier E.: Predicting Partial Orders: Ranking with Abstention. Proceedings ECML/PKDD-10(1): 215-230 (2010)
Fürnkranz J., Hüllermeier E., Vanderlooy S.: Binary Decomposition Methods for Multipartite Ranking. Proceedings ECML/PKDD-09(1): 359-374 (2009)
Gönen M., Heller G.: Concordance probability and discriminatory power in proportional hazards regression. Biometrika 92(4):965–970 (2005)
Goodman, L., Kruskal, W.: Measures of Association for Cross Classifications. Springer-Verlag, New York (1979)
Hand D.J., Till R.J.: A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Machine Learning 45(2):171-186 (2001)
Järvelin K., Kekäläinen J.: Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems 20(4): 422–446 (2002)
Jonckheere, A. R.: A distribution-free k-sample test against ordered alternatives. Biometrika: 133–145 (1954)
Kendall, M.: A New Measure of Rank Correlation. Biometrika 30(1-2): 81–89 (1938)
Mann H. B., Whitney D. R.: On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18:50–60 (1947)
Spearman C.: The proof and measurement of association between two things. American Journal of Psychology, 15:72–101 (1904)
DS 2011 Tutorial on Preference Learning | Part 3 | J. Fürnkranz & E. Hüllermeier

AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
a. Learning Utility Functions
b. Learning Preference Relations
c. Structured Output Prediction
d. Model-Based Preference Learning
e. Local Preference Aggregation
4. Conclusions
Two Ways of Representing Preferences

§ Utility-based approach: evaluating single alternatives.
§ Relational approach: comparing pairs of alternatives, via relations expressing weak preference, strict preference, indifference, or incomparability.
Utility Functions

§ A utility function assigns a utility degree (typically a real number or an ordinal degree) to each alternative.
§ Learning such a function essentially comes down to solving an (ordinal) regression problem.
§ Often there are additional conditions, e.g., due to bounded utility ranges or monotonicity properties (→ learning monotone models).
§ A utility function induces a ranking (total order), but not the other way around! It cannot represent more general relations, e.g., a partial order!
§ The feedback can be direct (exemplary utility degrees given; absolute feedback) or indirect (inequalities induced by an order relation; relative feedback).
Predicting Utilities on Ordinal Scales

(Graded) multilabel classification and collaborative filtering predict utilities on ordinal scales. A key idea is exploiting dependencies (correlations) between items (labels, products, …) → see work in the MLC and RecSys communities.
Learning Utility Functions from Indirect Feedback

§ A (latent) utility function can also be used to solve ranking problems, such as instance, object or label ranking → ranking by (estimated) utility degrees (scores).

Instance ranking: absolute preferences are given, so in principle this is an ordinal regression problem. However, the goal is to maximize ranking instead of classification performance.

Object ranking: find a utility function that agrees as much as possible with the preference information, in the sense that, for most example preferences x ≻ y, the estimated utility of x is higher than that of y.
Ranking versus Classification

A ranker (scoring positive vs. negative instances) can be turned into a classifier via thresholding. A good classifier, however, is not necessarily a good ranker: a classifier may make only 2 classification errors while making 10 ranking errors.

→ learning AUC-optimizing scoring classifiers!
RankSVM and Related Methods (Bipartite Case)

§ The idea is to minimize a convex upper bound on the empirical ranking error over a class of (kernelized) ranking functions: the objective combines a convex upper bound on the ranking error, checked for all positive/negative pairs, with a regularizer.
RankSVM and Related Methods (Bipartite Case)

§ The bipartite RankSVM algorithm [Herbrich et al. 2000, Joachims 2002] instantiates this scheme with the hinge loss as the convex upper bound, a norm regularizer, and a reproducing kernel Hilbert space (RKHS) with kernel K as the function class.

→ learning comes down to solving a QP problem
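The pairwise idea can be sketched with a linear scorer trained by subgradient descent on the pairwise hinge loss. This is an illustrative stand-in for the kernelized QP formulation, not the RankSVM implementation itself; all names and toy data are made up:

```python
import random

def train_pairwise_hinge(positives, negatives, dim, epochs=200, lr=0.01, lam=0.001):
    """Subgradient descent on  sum over (p, n) of max(0, 1 - (w.p - w.n))
    plus an L2 regularizer, for a linear scoring function w.x."""
    w = [0.0] * dim
    pairs = [(p, n) for p in positives for n in negatives]
    for _ in range(epochs):
        random.shuffle(pairs)
        for p, n in pairs:
            margin = sum(wi * (pi - ni) for wi, pi, ni in zip(w, p, n))
            for i in range(dim):
                grad = lam * w[i]
                if margin < 1:          # hinge is active: push w along p - n
                    grad -= p[i] - n[i]
                w[i] -= lr * grad
    return w

random.seed(0)  # reproducibility of the toy run
pos = [(2.0, 0.3), (1.8, -0.1), (2.2, 0.0)]
neg = [(0.5, 0.2), (0.2, -0.3), (0.7, 0.1)]
w = train_pairwise_hinge(pos, neg, dim=2)

def score(x):
    return sum(wi * xi for wi, xi in zip(w, x))

print(all(score(p) > score(n) for p in pos for n in neg))  # True
```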
RankSVM and Related Methods (Bipartite Case)

§ The bipartite RankBoost algorithm [Freund et al. 2003] minimizes a convex upper bound over a class of linear combinations of base functions.

→ learning by means of boosting techniques
Learning Utility Functions for Label Ranking
Label Ranking: Reduction to Binary Classification [Har-Peled et al. 2002]

Each pairwise comparison is turned into a binary classification example in a high-dimensional space: a preference between two labels yields a positive example in the new instance space, and the model is an (m × k)-dimensional weight vector.
AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
a. Learning Utility Functions
b. Learning Preference Relations
c. Structured Output Prediction
d. Model-Based Preference Learning
e. Local Preference Aggregation
4. Conclusions
Learning Binary Preference Relations

§ Learning binary preferences (in the form of predicates P(x,y)) is often simpler, especially if the training information is given in this form, too.
§ However, it implies an additional step, namely extracting a ranking from a (predicted) preference relation, i.e., inference from a 0/1 preference matrix to a ranking.
§ This step is not always trivial, since a predicted preference relation may exhibit inconsistencies and may not suggest a unique ranking in an unequivocal way.
Object Ranking: Learning to Order Things [Cohen et al. 99]

§ In a first step, a binary preference function PREF is constructed; PREF(x,y) ∈ [0,1] is a measure of the certainty that x should be ranked before y, and PREF(x,y) = 1 − PREF(y,x).
§ This function is expressed as a linear combination of base preference functions.
§ The weights can be learned, e.g., by means of the weighted majority algorithm [Littlestone & Warmuth 94].
§ In a second step, a total order is derived which is as much as possible in agreement with the binary preference relation.
Object Ranking: Learning to Order Things [Cohen et al. 99]

§ The weighted feedback arc set problem: find a permutation π such that the total weight of the violated preferences, i.e., the sum of PREF(x,y) over all pairs that π orders the other way round, becomes minimal.

Example: for a particular ordering of four objects with pairwise preference weights, the violated preferences cost 0.1 + 0.6 + 0.8 + 0.5 + 0.3 + 0.4 = 2.7.
Object Ranking: Learning to Order Things [Cohen et al. 99]

§ Since this is an NP-hard problem, it is solved heuristically. The algorithm successively chooses nodes having maximal "net flow" within the remaining subgraph:

Input: a set of objects V and a preference function PREF
Output: an ordering of V
for each v in V do: let score(v) = Σ_u PREF(v,u) − Σ_u PREF(u,v)
while V is not empty do:
    let t = arg max_{v in V} score(v)
    append t to the ordering; remove t from V
    for each v in V do: score(v) = score(v) + PREF(t,v) − PREF(v,t)
endwhile

§ It can be shown to provide a 2-approximation to the optimal solution.
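The greedy ordering above can be sketched as follows (a minimal illustration; the preference values are made up, with PREF stored as a dict over ordered pairs):

```python
def greedy_order(objects, pref):
    """Greedy ordering (Cohen et al. style, 2-approximation): repeatedly pick
    the object with maximal net flow  sum_u PREF(v,u) - sum_u PREF(u,v)
    among the objects not yet placed."""
    remaining = list(objects)
    ordering = []
    while remaining:
        best = max(
            remaining,
            key=lambda v: sum(pref[(v, u)] - pref[(u, v)]
                              for u in remaining if u != v),
        )
        ordering.append(best)
        remaining.remove(best)
    return ordering

# Toy preference function: pref[(x, y)] is the certainty that x should be
# ranked before y, with pref[(x, y)] = 1 - pref[(y, x)].
pref = {("A", "B"): 0.3, ("B", "A"): 0.7,
        ("A", "C"): 0.8, ("C", "A"): 0.2,
        ("B", "C"): 0.9, ("C", "B"): 0.1}
print(greedy_order(["A", "B", "C"], pref))  # ['B', 'A', 'C']
```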
Label Ranking: Learning by Pairwise Comparison (LPC) [Hüllermeier et al. 2008]
Label Ranking: Learning by Pairwise Comparison (LPC) [Hüllermeier et al. 2008]

For each pair of labels, a separate binary model is trained. Original training data (features X1–X4, pairwise label preferences; the class column shows the derived target for the pair (A, B), where class 1 means A ≻ B):

X1    X2  X3   X4  |  preferences                  | class
0.34   0  10  174  |  A ≻ B, B ≻ C, C ≻ D          |  1
1.45   0  32  277  |  B ≻ C                        |
1.22   1  46  421  |  B ≻ D, B ≻ A, C ≻ D, A ≻ C   |  0
0.74   1  25  165  |  C ≻ A, C ≻ D, A ≻ B          |  1
0.95   1  72  273  |  B ≻ D, A ≻ D                 |
1.04   0  33  158  |  D ≻ A, A ≻ B, C ≻ B, A ≻ C   |  1

Training data for the label pair A and B (only instances with an A/B preference):

X1    X2  X3   X4  |  class
0.34   0  10  174  |  1
1.22   1  46  421  |  0
0.74   1  25  165  |  1
1.04   0  33  158  |  1
Label Ranking: Learning by Pairwise Comparison (LPC) [Hüllermeier et al. 2008]

At prediction time, a query instance is submitted to all models, and the predictions are combined into a binary preference relation:

     A    B    C    D
A    -   0.3  0.8  0.4
B   0.7   -   0.7  0.9
C   0.2  0.3   -   0.3
D   0.6  0.1  0.7   -
Label Ranking: Learning by Pairwise Comparison (LPC) [Hüllermeier et al. 2008]

At prediction time, a query instance is submitted to all models, and the predictions are combined into a binary preference relation (row sums = weighted votes):

     A    B    C    D   | Σ
A    -   0.3  0.8  0.4  | 1.5
B   0.7   -   0.7  0.9  | 2.3
C   0.2  0.3   -   0.3  | 0.8
D   0.6  0.1  0.7   -   | 1.4

From this relation, a ranking is derived by means of a ranking procedure. In the simplest case, this is done by sorting the labels according to their sum of weighted votes:

B ≻ A ≻ D ≻ C
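The weighted-voting step can be sketched directly from the relation above (illustrative function name; the pairwise relation is stored as a dict over ordered label pairs):

```python
def rank_by_weighted_votes(pref):
    """Sort labels by the row sums of the pairwise preference relation."""
    labels = sorted({a for a, _ in pref})
    votes = {a: sum(pref[(a, b)] for b in labels if b != a) for a in labels}
    return sorted(labels, key=votes.get, reverse=True)

# Pairwise predictions from the example: pref[(x, y)] is the certainty that x > y.
pref = {("A", "B"): 0.3, ("A", "C"): 0.8, ("A", "D"): 0.4,
        ("B", "A"): 0.7, ("B", "C"): 0.7, ("B", "D"): 0.9,
        ("C", "A"): 0.2, ("C", "B"): 0.3, ("C", "D"): 0.3,
        ("D", "A"): 0.6, ("D", "B"): 0.1, ("D", "C"): 0.7}
print(rank_by_weighted_votes(pref))  # ['B', 'A', 'D', 'C']
```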
AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
a. Learning Utility Functions
b. Learning Preference Relations
c. Structured Output Prediction
d. Model-Based Preference Learning
e. Local Preference Aggregation
4. Conclusions
70. Structured Output Prediction [Bakir et al. 2007]

§ Rankings, multilabel classifications, etc. can be seen as specific types of structured (as opposed to scalar) outputs.
§ Discriminative structured prediction algorithms infer a joint scoring function on input-output pairs and, for a given input, predict the output that maximises this scoring function.
§ Joint feature map Φ(x, y) and scoring function f(x, y) = ⟨w, Φ(x, y)⟩.
§ The learning problem consists of estimating the weight vector w, e.g., using structural risk minimization.
§ Prediction requires solving a decoding problem: ŷ = argmax_{y ∈ Y} f(x, y).
71. Structured Output Prediction [Bakir et al. 2007]

§ Preferences are expressed through inequalities on inner products: for each training example (x_n, y_n) and each alternative output y ≠ y_n,
  ⟨w, Φ(x_n, y_n)⟩ − ⟨w, Φ(x_n, y)⟩ ≥ Δ(y_n, y),
  where Δ denotes the loss function.
§ The potentially huge number of constraints cannot be handled explicitly and calls for specific techniques (such as cutting plane optimization).
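To make the joint feature map and the decoding step concrete, here is a toy structured-perceptron sketch over a tiny output space of label rankings (an illustration only, not the cutting-plane SVM method discussed above; the feature map and all names are made up for the example):

```python
import itertools

LABELS = ('A', 'B', 'C')

def joint_features(x, y):
    """Joint feature map Phi(x, y): each label's input features are
    weighted by the label's position in the ranking y (a toy choice)."""
    feats = {}
    for pos, label in enumerate(y):
        weight = len(y) - pos          # top position gets the largest weight
        for i, v in enumerate(x):
            feats[(label, i)] = feats.get((label, i), 0.0) + weight * v
    return feats

def score(w, x, y):
    """Scoring function f(x, y) = <w, Phi(x, y)>."""
    return sum(w.get(k, 0.0) * v for k, v in joint_features(x, y).items())

def decode(w, x):
    """Decoding problem: argmax over all candidate outputs. Enumeration is
    feasible only because the toy output space (3! rankings) is tiny."""
    return max(itertools.permutations(LABELS), key=lambda y: score(w, x, y))

def perceptron_step(w, x, y_true):
    """One structured-perceptron update: move w toward Phi(x, y_true) and
    away from Phi(x, y_hat) whenever decoding predicts the wrong output."""
    y_hat = decode(w, x)
    if y_hat != y_true:
        for k, v in joint_features(x, y_true).items():
            w[k] = w.get(k, 0.0) + v
        for k, v in joint_features(x, y_hat).items():
            w[k] = w.get(k, 0.0) - v
    return w
```

In the SVM formulation sketched on the slide, the same decoding step reappears inside the cutting-plane loop, where it is used to find the currently most violated constraint.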
72. AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
   a. Learning Utility Functions
   b. Learning Preference Relations
   c. Structured Output Prediction
   d. Model-Based Preference Learning
   e. Local Preference Aggregation
4. Conclusions
73. Model-Based Methods for Ranking

§ Model-based approaches to ranking proceed from specific assumptions about the possible rankings (representation bias) or make use of probabilistic models for rankings (parametrized probability distributions on the set of rankings).
§ In the following, we shall see examples of both types:
- Restriction to lexicographic preferences
- Conditional preference networks (CP-nets)
- Label ranking using the Plackett-Luce model (see our talk tomorrow)
74. Learning Lexicographic Preference Models [Yaman et al. 2008]

§ Suppose that objects are represented as feature vectors of length m, and that each attribute has k values.
§ For n = k^m objects, there are n! permutations (rankings).
§ A lexicographic order is uniquely determined by
- a total order of the attributes
- a total order of each attribute domain
§ Example: Four binary attributes (m = 4, k = 2)
- there are 16! ≈ 2 · 10^13 rankings
- but only 2^4 · 4! = 384 of them can be expressed in terms of a lexicographic order
§ [Yaman et al. 2008] present a learning algorithm that explicitly maintains the version space, i.e., the attribute orders compatible with all pairwise preferences seen so far (assuming binary attributes with 1 preferred to 0). Predictions are derived based on the „votes" of the consistent models.
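A lexicographic comparator, and the count of 2^4 · 4! = 384 lexicographic orders from the example, can be sketched as follows (a minimal illustration; the data representation is an assumption):

```python
from itertools import permutations, product

def lex_prefers(x, y, attr_order, value_order):
    """True iff object x is lexicographically preferred to object y.

    attr_order: attribute indices from most to least important.
    value_order[j]: values of attribute j ordered from best to worst.
    """
    for j in attr_order:
        if x[j] != y[j]:  # first important attribute on which they differ decides
            return value_order[j].index(x[j]) < value_order[j].index(y[j])
    return False  # x and y agree on every attribute

# m = 4 binary attributes: 4! attribute orders times 2^4 value orders
models = [(perm, dict(enumerate(vals)))
          for perm in permutations(range(4))
          for vals in product([(1, 0), (0, 1)], repeat=4)]
print(len(models))  # 384
```

Enumerating the model class explicitly like this is what makes the version-space bookkeeping of [Yaman et al. 2008] feasible for small m.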
75. Learning Conditional Preference (CP) Networks [Chevaleyre et al. 2010]

Compact representation of a partial order relation, exploiting conditional independence of preferences on attribute values. Attributes: main dish, drink, restaurant (the preferences on drink and restaurant are conditioned on the main dish):

main dish:   meat > veggie > fish
drink:       meat:   red wine > white wine
             veggie: red wine > white wine
             fish:   white wine > red wine
restaurant:  meat:   Italian > Chinese
             veggie: Chinese > Italian
             fish:   Chinese > Italian

Induces a partial order relation, e.g.,

(meat, red wine, Italian) > (meat, white wine, Chinese)
(fish, white wine, Chinese) > (fish, red wine, Chinese)
(meat, white wine, Italian) ? (meat, red wine, Chinese)
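Full dominance testing in CP-nets is computationally hard, but the elementary building block, the "ceteris paribus" comparison of two outcomes differing in a single attribute, is simple. A minimal sketch of the CP-net above (the dict-based representation is an assumption):

```python
# Each attribute maps to (parent attributes, conditional preference table);
# the table maps the parents' values to a best-to-worst order of own values.
CPNET = {
    'main dish':  ((), {(): ('meat', 'veggie', 'fish')}),
    'drink':      (('main dish',),
                   {('meat',):   ('red wine', 'white wine'),
                    ('veggie',): ('red wine', 'white wine'),
                    ('fish',):   ('white wine', 'red wine')}),
    'restaurant': (('main dish',),
                   {('meat',):   ('Italian', 'Chinese'),
                    ('veggie',): ('Chinese', 'Italian'),
                    ('fish',):   ('Chinese', 'Italian')}),
}

def flip_prefers(o1, o2, cpnet):
    """Ceteris paribus comparison: for outcomes (dicts) differing in exactly
    one attribute, return True iff o1 is preferred to o2."""
    diff = [a for a in cpnet if o1[a] != o2[a]]
    if len(diff) != 1:
        raise ValueError('outcomes must differ in exactly one attribute')
    attr = diff[0]
    parents, table = cpnet[attr]
    # the parents of attr necessarily agree in o1 and o2
    order = table[tuple(o1[p] for p in parents)]
    return order.index(o1[attr]) < order.index(o2[attr])
```

For example, with fish as main dish, the model prefers white wine to red wine, so (fish, white wine, Chinese) beats (fish, red wine, Chinese) in a single flip.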
76. Learning Conditional Preference (CP) Networks [Chevaleyre et al. 2010]

The CP-net is to be learned from training data (possibly noisy) consisting of pairwise comparisons between outcomes:

(meat, red wine, Italian) > (veggie, red wine, Italian)
(fish, white wine, Chinese) > (veggie, red wine, Chinese)
(veggie, white wine, Chinese) > (veggie, red wine, Italian)
…
77. AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
   a. Learning Utility Functions
   b. Learning Preference Relations
   c. Structured Output Prediction
   d. Model-Based Preference Learning
   e. Local Preference Aggregation (see our talk tomorrow)
4. Conclusions
78. Summary of Main Algorithmic Principles

§ Reduction of ranking to (binary) classification (e.g., constraint classification, LPC)
§ Direct optimization of (regularized) smooth approximations of ranking losses (RankSVM, RankBoost, …)
§ Structured output prediction, learning a joint scoring („matching") function
§ Learning parametrized probabilistic ranking models (e.g., Plackett-Luce)
§ Restricted model classes, fitting (parametrized) deterministic models (e.g., lexicographic orders)
§ Local preference aggregation (lazy learning)
79. References

§ G. Bakir, T. Hofmann, B. Schölkopf, A. Smola, B. Taskar and S. Vishwanathan. Predicting structured data. MIT Press, 2007.
§ W. Cheng, K. Dembczynski and E. Hüllermeier. Graded multilabel classification: The ordinal case. Proc. ICML-2010, Haifa, Israel, 2010.
§ W. Cheng, K. Dembczynski and E. Hüllermeier. Label ranking using the Plackett-Luce model. Proc. ICML-2010, Haifa, Israel, 2010.
§ W. Cheng and E. Hüllermeier. Predicting partial orders: Ranking with abstention. Proc. ECML/PKDD-2010, Barcelona, 2010.
§ W. Cheng, C. Hühn and E. Hüllermeier. Decision tree and instance-based learning for label ranking. Proc. ICML-2009.
§ Y. Chevaleyre, F. Koriche, J. Lang, J. Mengin and B. Zanuttini. Learning ordinal preferences on multiattribute domains: The case of CP-nets. In: J. Fürnkranz and E. Hüllermeier (eds.), Preference Learning, Springer-Verlag, 2010.
§ W.W. Cohen, R.E. Schapire and Y. Singer. Learning to order things. Journal of Artificial Intelligence Research, 10:243–270, 1999.
§ Y. Freund, R. Iyer, R.E. Schapire and Y. Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4:933–969, 2003.
§ J. Fürnkranz, E. Hüllermeier, E. Mencia and K. Brinker. Multilabel classification via calibrated label ranking. Machine Learning, 73(2):133–153, 2008.
§ J. Fürnkranz, E. Hüllermeier and S. Vanderlooy. Binary decomposition methods for multipartite ranking. Proc. ECML-2009, Bled, Slovenia, 2009.
§ D. Goldberg, D. Nichols, B.M. Oki and D. Terry. Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12):61–70, 1992.
§ S. Har-Peled, D. Roth and D. Zimak. Constraint classification: A new approach to multiclass classification. Proc. ALT-2002.
§ R. Herbrich, T. Graepel and K. Obermayer. Large margin rank boundaries for ordinal regression. In: Advances in Large Margin Classifiers, 2000.
§ E. Hüllermeier, J. Fürnkranz, W. Cheng and K. Brinker. Label ranking by learning pairwise preferences. Artificial Intelligence, 172:1897–1916, 2008.
§ T. Joachims. Optimizing search engines using clickthrough data. Proc. KDD-2002.
§ N. Littlestone and M.K. Warmuth. The weighted majority algorithm. Information and Computation, 108(2):212–261, 1994.
§ G. Tsoumakas and I. Katakis. Multilabel classification: An overview. Int. J. Data Warehousing and Mining, 3:1–13, 2007.
§ F. Yaman, T. Walsh, M. Littman and M. desJardins. Democratic approximation of lexicographic preference models. Proc. ICML-2008.
80. DS 2011 Tutorial on Preference Learning | Part 5 | J. Fürnkranz & E. Hüllermeier

AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
4. Conclusions
81. Conclusions

§ Preference learning is an emerging subfield of machine learning, with many applications and theoretical challenges.
§ Prediction of preference models instead of scalar outputs (as in classification and regression), hitherto with a focus on rankings.
§ Many existing machine learning problems can be cast in the framework of preference learning (→ preference learning „in a broad sense").
§ „Qualitative" alternative to conventional numerical approaches:
- pairwise comparison instead of numerical evaluation,
- order relations instead of individual assessment.
§ Still many open problems (unified framework, predictions more general than rankings, incorporating numerical information, etc.)
§ Interdisciplinary field, connections to many other areas.
82. Connections to Other Fields

Preference learning is connected to, among others:
- Recommender Systems
- Multilabel Classification
- Learning Monotone Models
- Structured Output Prediction
- Ranking in Information Retrieval
- Ordinal Classification
- Operations Research
- Multiple Criteria Decision Making
- Social Choice
- Economics & Decision Theory
83. Edited Book on Preference Learning

J. Fürnkranz & E. Hüllermeier (eds.), Preference Learning, Springer-Verlag, 2010.

Contents:
- Preference Learning: An Introduction
- A Preference Optimization Based Unifying Framework for Supervised Learning Problems

Part I – Label Ranking
- Label Ranking Algorithms: A Survey
- Preference Learning and Ranking by Pairwise Comparison
- Decision Tree Modeling for Ranking Data
- Co-regularized Least-Squares for Label Ranking

Part II – Instance Ranking
- A Survey on ROC-Based Ordinal Regression
- Ranking Cases with Classification Rules

Part III – Object Ranking
- A Survey and Empirical Comparison of Object Ranking Methods
- Dimension Reduction for Object Ranking
- Learning of Rule Ensembles for Multiple Attribute Ranking Problems

Part IV – Preferences in Multiattribute Domains
- Learning Lexicographic Preference Models
- Learning Ordinal Preferences on Multiattribute Domains: The Case of CP-nets
- Choice-Based Conjoint Analysis: Classification vs. Discrete Choice Models
- Learning Aggregation Operators for Preference Modeling

Part V – Preferences in Information Retrieval
- Evaluating Search Engine Relevance with Click-Based Metrics
- Learning SVM Ranking Function from User Feedback Using Document Metadata and Active Learning in the Biomedical Domain

Part VI – Preferences in Recommender Systems
- Learning Preference Models in Recommender Systems
- Collaborative Preference Learning
- Discerning Relevant Model Features in a Content-Based Collaborative Recommender System
84. Edited Book on Preference Learning

The book includes several introductions and survey articles (contents as on the previous slide).
85. Preference Learning Website

§ Working groups
§ Software
§ Data Sets
§ Workshops
§ Tutorials
§ Books
§ ...

http://www.preference-learning.org/