Preference Learning: A Tutorial Introduction

Eyke Hüllermeier
Knowledge Engineering & Bioinformatics Lab
Dept. of Mathematics and Computer Science
Marburg University, Germany

Johannes Fürnkranz
Knowledge Engineering
Dept. of Computer Science
Technical University Darmstadt, Germany

DS 2011, Espoo, Finland, Oct 2011
What is Preference Learning?

§ Preference learning is an emerging subfield of machine learning
§ Roughly speaking, it deals with the learning of (predictive) preference models from observed (or extracted) preference information
[Figure: Preference Learning shown at the intersection of MACHINE LEARNING and PREFERENCE MODELING and DECISION ANALYSIS, with contributing fields: computer science, artificial intelligence, operations research, social sciences (voting and choice theory), economics and decision theory]
Workshops and Related Events

§ NIPS-01: New Methods for Preference Elicitation
§ NIPS-02: Beyond Classification and Regression: Learning Rankings, Preferences, Equality Predicates, and Other Structures
§ KI-03: Preference Learning: Models, Methods, Applications
§ NIPS-04: Learning With Structured Outputs
§ NIPS-05: Workshop on Learning to Rank
§ IJCAI-05: Advances in Preference Handling
§ SIGIR 07-10: Workshop on Learning to Rank for Information Retrieval
§ ECML/PKDD 08-10: Workshop on Preference Learning
§ NIPS-09: Workshop on Advances in Ranking
§ American Institute of Mathematics Workshop in Summer 2010: The Mathematics of Ranking
§ NIPS-11: Workshop on Choice Models and Preference Learning
Preferences in Artificial Intelligence

User preferences play a key role in various fields of application:
- recommender systems,
- adaptive user interfaces,
- adaptive retrieval systems,
- autonomous agents (electronic commerce),
- games, …

Preferences in AI research:
- preference representation (CP-nets, GAI networks, logical representations, fuzzy constraints, …)
- reasoning with preferences (decision theory, constraint satisfaction, non-monotonic reasoning, …)
- preference acquisition (preference elicitation, preference learning, ...)

More generally, "preferences" is a key topic in current AI research.
AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
4. Conclusions
Preference Learning

Preference learning problems can be distinguished along several problem dimensions, including

§ representation of preferences, type of preference model:
  - utility function (ordinal, numeric),
  - preference relation (partial order, ranking, …),
  - logical representation, …
§ description of individuals/users and alternatives/items:
  - identifier, feature vector, structured object, …
§ type of training input:
  - direct or indirect feedback,
  - complete or incomplete relations,
  - utilities, …
§ …
Preference Learning

[Figure: taxonomy of preference information. Preferences are either absolute (assessing single items) or relative (comparing items). Absolute preferences are binary (e.g., A:1, B:1, C:0, D:0) or gradual, the latter either ordinal (e.g., A:+, B:+, C:-, D:0) or numeric (e.g., A:.9, B:.8, C:.1, D:.3) → (ordinal) regression. Relative preferences form a total order (e.g., A ≻ B ≻ C ≻ D) or a partial order → classification/ranking.]
Structure of this Overview

(1) Preference learning as an extension of conventional supervised learning: learn a mapping that maps instances to preference models (→ structured/complex output prediction).

(2) Other settings (object ranking, instance ranking, CF, …)
Structure of this Overview

(1) Preference learning as an extension of conventional supervised learning: learn a mapping that maps instances to preference models (→ structured/complex output prediction).

Instances are typically (though not necessarily) characterized in terms of a feature vector.

The output space consists of preference models over a fixed set of alternatives (classes, labels, …) represented in terms of an identifier
→ extensions of multi-class classification
Multilabel Classification [Tsoumakas & Katakis 2007]

Binary preferences on a fixed set of items: liked or disliked.

Training:
  X1     X2   X3   X4    A   B   C   D
  0.34   0    10   174   0   1   1   0
  1.45   0    32   277   0   1   0   1
  1.22   1    46   421   0   0   0   1
  0.74   1    25   165   0   1   1   1
  0.95   1    72   273   1   0   1   0
  1.04   0    33   158   1   1   1   0

Prediction:    0.92  1  81  382  →  0  1  0  1
Ground truth:  0.92  1  81  382  →  1  1  0  1

LOSS: the predicted label set is compared with the ground truth.
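The slides leave LOSS abstract; one standard instantiation in this setting is the Hamming loss, the fraction of items on which prediction and ground truth disagree. A minimal sketch using the rows above:

```python
# Hamming loss between predicted and true label vectors (example from the slide).
pred  = [0, 1, 0, 1]   # predicted relevance of A, B, C, D
truth = [1, 1, 0, 1]   # ground truth
loss = sum(p != t for p, t in zip(pred, truth)) / len(truth)
print(loss)  # 0.25 -- only label A is predicted incorrectly
```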
  
Multilabel Ranking

Training: binary preferences on a fixed set of items (liked or disliked), as in multilabel classification:

  X1     X2   X3   X4    A   B   C   D
  0.34   0    10   174   0   1   1   0
  1.45   0    32   277   0   1   0   1
  1.22   1    46   421   0   0   0   1
  0.74   1    25   165   0   1   1   1
  0.95   1    72   273   1   0   1   0
  1.04   0    33   158   1   1   1   0

Prediction: a ranking of all items, e.g., 0.92 1 81 382 → B ≻ D ≻ C ≻ A (ranks A=4, B=1, C=3, D=2)
Ground truth: 0.92 1 81 382 → 1 1 0 1

LOSS: the predicted ranking is compared with the true label set.
  
Graded Multilabel Classification [Cheng et al. 2010]

Ordinal preferences on a fixed set of items: liked, disliked, or something in-between.

Training:
  X1     X2   X3   X4    A    B    C    D
  0.34   0    10   174   --   +    ++   0
  1.45   0    32   277   0    ++   --   +
  1.22   1    46   421   --   --   0    +
  0.74   1    25   165   0    +    +    ++
  0.95   1    72   273   +    0    ++   --
  1.04   0    33   158   +    +    ++   --

Prediction:    0.92  1  81  382  →  --  +   0   ++
Ground truth:  0.92  1  81  382  →  0   ++  --  +

LOSS: the two ordinal label vectors are compared.
  
Graded Multilabel Ranking

Training: ordinal preferences on a fixed set of items (liked, disliked, or something in-between):

  X1     X2   X3   X4    A    B    C    D
  0.34   0    10   174   --   +    ++   0
  1.45   0    32   277   0    ++   --   +
  1.22   1    46   421   --   --   0    +
  0.74   1    25   165   0    +    +    ++
  0.95   1    72   273   +    0    ++   --
  1.04   0    33   158   +    +    ++   --

Prediction: a ranking of all items, e.g., 0.92 1 81 382 → B ≻ D ≻ C ≻ A (ranks A=4, B=1, C=3, D=2)
Ground truth: 0.92 1 81 382 → 0 ++ -- +

LOSS: the predicted ranking is compared with the graded ground truth.
Label Ranking [Hüllermeier et al. 2008]

Instances are associated with pairwise preferences between labels.

Training:
  X1     X2   X3   X4    Preferences
  0.34   0    10   174   A ≻ B, B ≻ C, C ≻ D
  1.45   0    32   277   B ≻ C
  1.22   1    46   421   B ≻ D, A ≻ D, C ≻ D, A ≻ C
  0.74   1    25   165   C ≻ A, C ≻ D, A ≻ B
  0.95   1    72   273   B ≻ D, A ≻ D
  1.04   0    33   158   D ≻ A, A ≻ B, C ≻ B, A ≻ C

Prediction: a ranking of all labels, e.g., 0.92 1 81 382 → B ≻ D ≻ C ≻ A (ranks A=4, B=1, C=3, D=2)
Ground truth: 0.92 1 81 382 → ranks A=2, B=1, C=3, D=4 (B ≻ A ≻ C ≻ D)

LOSS: the predicted ranking is compared with the true ranking.
  
Calibrated Label Ranking [Fürnkranz et al. 2008]

Combining absolute and relative evaluation: the predicted ranking of the labels (e.g., a ≻ b ≻ c ≻ d ≻ e ≻ f ≻ g) is split by a calibration point into a relevant (positive, liked) part and an irrelevant (negative, disliked) part.
Structure of this Overview

(1) Preference learning as an extension of conventional supervised learning: learn a mapping that maps instances to preference models (→ structured output prediction).

(2) Other settings (no clear distinction between input/output space): object ranking, instance ranking, collaborative filtering
Object Ranking [Cohen et al. 99]

Training: pairwise preferences between objects (instances).
Prediction: ranking a new set of objects.
Ground truth: a ranking, a top-ranking, or a subset of relevant objects.
Instance Ranking [Fürnkranz et al. 2009]

Absolute preferences on an ordinal scale.

Training:
  X1     X2   X3   X4    class
  0.34   0    10   174   --
  1.45   0    32   277   0
  0.74   1    25   165   ++
  …      …    …    …     …
  0.95   1    72   273   +

Prediction (ranking a new set of objects), shown here with the ordinal classes of the ranked objects as ground truth:
  +  0  ++  ++  --  +  0  +  --  0  0  --  --
Instance Ranking [Fürnkranz et al. 2009]

Given a query set of instances to be ranked (true labels are unknown), a ranking is predicted, e.g., through sorting by estimated score (most likely good … most likely bad); deviations from the true order constitute the ranking error.

Extension of AUC maximization to the polytomous case, in which instances are rated on an ordinal scale such as {bad, medium, good}.
Collaborative Filtering [Goldberg et al. 1992]

Inputs and outputs are given as identifiers; absolute preferences in terms of ordinal degrees (1: very bad, 2: bad, 3: fair, 4: good, 5: excellent). USERS × PRODUCTS rating matrix (sparse; ? marks ratings to be predicted):

         P1   P2   P3   …   P38   …   P88   P89   P90
  U1      1         4   …         …                3
  U2      2         2   …         …                1
  …
  U46     ?    2    ?   …    ?    …    ?     ?     4
  …
  U98     5             …         …                4
  U99          1        …         …                2
Dyadic Prediction [Menon & Elkan 2010]

The same sparse rating matrix as in collaborative filtering, but with additional side-information: observed features and latent features of users and items (in the original figure, every user row and every product column carries a feature vector in addition to its identifier).
Preference Learning Tasks

                                 representation            type of preference information
  task                           input       output        training           prediction         ground truth
  collaborative filtering        identifier  identifier    absolute ordinal   absolute ordinal   absolute ordinal
  multilabel classification      feature     identifier    absolute binary    absolute binary    absolute binary
  multilabel ranking             feature     identifier    absolute binary    ranking            absolute binary
  graded multilabel classif.     feature     identifier    absolute ordinal   absolute ordinal   absolute ordinal
  label ranking                  feature     identifier    relative binary    ranking            ranking
  object ranking                 feature     --            relative binary    ranking            ranking or subset
  instance ranking               feature     identifier    absolute ordinal   ranking            absolute ordinal

Two main directions: (1) ranking and variants, (2) generalizations of classification.
Loss Functions

Things to be compared (prediction vs. ground truth):

  absolute utility degree           vs.  absolute utility degree
  subset of preferred items         vs.  subset of preferred items
  subset of preferred items         vs.  ranking of items
  fuzzy subset of preferred items   vs.  fuzzy subset of preferred items
  ranking of items                  vs.  ranking of items
  ranking of items                  vs.  ordered partition of items

The first case is a standard comparison of scalar predictions; all the other cases call for non-standard comparisons.
References

§ W. Cheng, K. Dembczynski and E. Hüllermeier. Graded Multilabel Classification: The Ordinal Case. ICML-2010, Haifa, Israel, 2010.
§ W. Cheng and E. Hüllermeier. Predicting partial orders: Ranking with abstention. ECML/PKDD-2010, Barcelona, 2010.
§ Y. Chevaleyre, F. Koriche, J. Lang, J. Mengin, B. Zanuttini. Learning ordinal preferences on multiattribute domains: The case of CP-nets. In: J. Fürnkranz and E. Hüllermeier (eds.) Preference Learning, Springer-Verlag, 2010.
§ W.W. Cohen, R.E. Schapire and Y. Singer. Learning to order things. Journal of Artificial Intelligence Research, 10:243-270, 1999.
§ J. Fürnkranz, E. Hüllermeier, E. Mencia, and K. Brinker. Multilabel Classification via Calibrated Label Ranking. Machine Learning 73(2):133-153, 2008.
§ J. Fürnkranz, E. Hüllermeier and S. Vanderlooy. Binary decomposition methods for multipartite ranking. Proc. ECML-2009, Bled, Slovenia, 2009.
§ D. Goldberg, D. Nichols, B.M. Oki and D. Terry. Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12):61-70, 1992.
§ E. Hüllermeier, J. Fürnkranz, W. Cheng and K. Brinker. Label ranking by learning pairwise preferences. Artificial Intelligence, 172:1897-1916, 2008.
§ G. Tsoumakas and I. Katakis. Multi-label classification: An overview. Int. J. Data Warehousing and Mining, 3:1-13, 2007.
§ A.K. Menon and C. Elkan. Predicting labels for dyadic data. Data Mining and Knowledge Discovery, 21(2), 2010.
AGENDA
1. Preference Learning Tasks
2. Loss Functions
a. Evaluation of Rankings
b. Weighted Measures
c. Evaluation of Bipartite Rankings
d. Evaluation of Partial Rankings
3. Preference Learning Techniques
4. Conclusions
Rank Evaluation Measures

§ In the following, we do not discriminate between different ranking scenarios
  - we use the term items for both objects and labels
§ All measures are applicable to both scenarios
  - sometimes they have different names according to context
§ Label ranking
  - the measure is applied to the ranking of the labels of each example
  - averaged over all examples
§ Object ranking
  - the measure is applied to the ranking of a set of objects
  - we may need to average over different sets of objects which have disjoint preference graphs
  - e.g., different sets of query/answer set pairs in information retrieval
Ranking Errors

§ Given:
  - a set of items X = {x1, …, xc} to rank (items can be objects or labels)
    Example: X = {A, B, C, D, E}
  - a target ranking r
    Example: E ≻ B ≻ C ≻ A ≻ D
  - a predicted ranking r̂
    Example: A ≻ B ≻ E ≻ C ≻ D
§ Compute:
  - a value d(r, r̂) that measures the distance between the two rankings
Notation

§ r and r̂ are functions from X → ℕ, returning the rank of an item x
  - Example: r(A) = 4, r̂(A) = 1
§ the inverse functions r⁻¹ : ℕ → X return the item at a certain position
  - Example: r⁻¹(4) = A, r̂⁻¹(1) = A
§ as a short-hand for r ∘ r̂⁻¹, we also define the function R : ℕ → ℕ
  - R(i) returns the true rank of the i-th item in the predicted ranking
  - Example: R(1) = r(r̂⁻¹(1)) = r(A) = 4
Spearman's Footrule

§ Key idea: measure the sum of absolute differences between ranks

$$ D_{SF}(r,\hat r) = \sum_{i=1}^{c} |r(x_i) - \hat r(x_i)| = \sum_{i=1}^{c} |i - R(i)| = \sum_{i=1}^{c} d_{x_i}(r,\hat r) $$

§ Example (r: E ≻ B ≻ C ≻ A ≻ D, r̂: A ≻ B ≻ E ≻ C ≻ D):
  d_A = |1 - 4| = 3, d_B = 0, d_C = 1, d_D = 0, d_E = 2, so D_SF = 3 + 0 + 1 + 0 + 2 = 6
Spearman Distance

§ Key idea: measure the sum of squared differences between ranks

$$ D_S(r,\hat r) = \sum_{i=1}^{c} \big(r(x_i) - \hat r(x_i)\big)^2 = \sum_{i=1}^{c} \big(i - R(i)\big)^2 = \sum_{i=1}^{c} d_{x_i}(r,\hat r)^2 $$

§ Example: d_A = 3, d_B = 0, d_C = 1, d_D = 0, d_E = 2, so D_S = 3² + 0 + 1² + 0 + 2² = 14
§ Value range: min D_S = 0 and max D_S = Σᵢ (c+1-2i)² = c(c²-1)/3
  → Spearman Rank Correlation Coefficient:

$$ 1 - \frac{6 \cdot D_S(r,\hat r)}{c(c^2-1)} \in [-1,1] $$
Kendall's Distance

§ Key idea: number of item pairs that are inverted in the predicted ranking

$$ D(r,\hat r) = \big|\{(i,j) \mid r(x_i) < r(x_j) \wedge \hat r(x_i) > \hat r(x_j)\}\big| $$

§ Example: D(r, r̂) = 4
§ Value range: min D = 0, max D = c(c-1)/2
  → Kendall's tau:

$$ 1 - \frac{4 \cdot D(r,\hat r)}{c(c-1)} \in [-1,1] $$
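And a sketch for Kendall's distance and tau on the same example:

```python
# Kendall distance (number of inverted pairs) and Kendall's tau.
from itertools import combinations

r    = ["E", "B", "C", "A", "D"]
rhat = ["A", "B", "E", "C", "D"]
rank, rank_hat = ({x: i + 1 for i, x in enumerate(s)} for s in (r, rhat))

c = len(r)
d_k = sum((rank[x] - rank[y]) * (rank_hat[x] - rank_hat[y]) < 0
          for x, y in combinations(r, 2))   # discordant pairs
tau = 1 - 4 * d_k / (c * (c - 1))
print(d_k, tau)  # 4, 0.2
```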
2b. Weighted Measures
Weighted Ranking Errors

§ The previous ranking measures give equal weight to all ranking positions
  - i.e., differences in the first ranking positions have the same effect as differences in the last ranking positions
§ In many applications this is not desirable
  - ranking of search results
  - ranking of product recommendations
  - ranking of labels for classification
  - ...

[Figure: two predicted rankings, one with a swap near the top and one with a swap near the bottom, have the same unweighted distance D] ⇒ higher ranking positions should be given more weight
Position Error

§ Key idea: in many applications we are interested in providing a ranking where the target item appears as high as possible in the predicted ranking
  - e.g., ranking a set of actions for the next step in a plan
§ The error is the number of wrong items that are predicted before the target item:

$$ D_{PE}(r,\hat r) = \hat r\big(\arg\min_{x \in X} r(x)\big) - 1 $$

§ Example: D_PE(r, r̂) = 2
§ Note: equivalent to Spearman's footrule with all non-target weights set to 0:

$$ D_{PE}(r,\hat r) = \sum_{i=1}^{c} w_i \cdot d_{x_i}(r,\hat r) \quad \text{with} \quad w_i = [\![\, x_i = \arg\min_{x \in X} r(x) \,]\!] $$
Discounted Error

§ Higher positions in the predicted ranking get a higher weight than lower positions:

$$ D_{DR}(r,\hat r) = \sum_{i=1}^{c} w_i \cdot d_{x_i}(r,\hat r) \quad \text{with} \quad w_i = \frac{1}{\log(\hat r(x_i)+1)} $$

§ Example (for the prediction shown on the slide):

$$ D_{DR}(r,\hat r) = \frac{3}{\log 2} + 0 + \frac{1}{\log 4} + 0 + \frac{2}{\log 6} $$
(Normalized) Discounted Cumulative Gain

§ A "positive" version of the discounted error: Discounted Cumulative Gain (DCG)

$$ DCG(r,\hat r) = \sum_{i=1}^{c} \frac{c - R(i)}{\log(i+1)} $$

§ Maximum possible value: attained if the predicted ranking is correct, i.e., ∀i: i = R(i)
  → Ideal Discounted Cumulative Gain (IDCG)

$$ IDCG = \sum_{i=1}^{c} \frac{c - i}{\log(i+1)} $$

§ Normalized DCG (NDCG):

$$ NDCG(r,\hat r) = \frac{DCG(r,\hat r)}{IDCG} $$

§ Example:

$$ NDCG(r,\hat r) = \frac{\frac{1}{\log 2} + \frac{3}{\log 3} + \frac{4}{\log 4} + \frac{2}{\log 5} + \frac{0}{\log 6}}{\frac{4}{\log 2} + \frac{3}{\log 3} + \frac{2}{\log 4} + \frac{1}{\log 5} + \frac{0}{\log 6}} $$
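A sketch of the NDCG computation for this example (plain Python; the base of the logarithm cancels in the DCG/IDCG ratio, so natural logs are fine):

```python
# (N)DCG for the slide's example: the gain of the item at predicted position i
# is c - R(i); the ideal ranking yields the IDCG normalizer.
import math

r    = ["E", "B", "C", "A", "D"]           # target: true ranks E=1 ... D=5
rhat = ["A", "B", "E", "C", "D"]           # predicted order
c = len(r)
R = [r.index(x) + 1 for x in rhat]         # true rank of i-th predicted item

dcg  = sum((c - R[i]) / math.log(i + 2) for i in range(c))
idcg = sum((c - (i + 1)) / math.log(i + 2) for i in range(c))
print(dcg / idcg)  # NDCG ~ 0.79
```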
2c. Evaluation of Bipartite Rankings
Bipartite Rankings

§ The target ranking is not totally ordered but a bipartite graph
§ The two partitions may be viewed as preference levels L = {0, 1}
  - all c₁ items of level 1 are preferred over all c₀ items of level 0
§ We now have fewer preferences
  - for a total order: c(c-1)/2
  - for a bipartite graph: c₁ · (c - c₁)
Evaluating Partial Target Rankings

§ Many measures can be directly adapted from total target rankings to partial target rankings
§ Recall Kendall's distance, the number of item pairs that are inverted in the predicted ranking:

$$ D(r,\hat r) = \big|\{(i,j) \mid r(x_i) < r(x_j) \wedge \hat r(x_i) > \hat r(x_j)\}\big| $$

  - it can be directly used
  - in case of normalization, we have to consider that fewer pairs satisfy r(x_i) < r(x_j)
§ Area under the ROC curve (AUC)
  - the AUC is the fraction of pairs (p, n) for which the predicted score s(p) > s(n)
  - the Mann-Whitney statistic is the absolute number
  - this is 1 minus the normalized Kendall distance for a bipartite preference graph with L = {p, n}
§ Example: D(r, r̂) = 2, AUC(r, r̂) = 4/6
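A pair-counting sketch of the AUC; assuming the two preference levels of the running example are {E, B} (positive) and {C, A, D} (negative), which reproduces the slide's numbers (2 inversions, AUC = 4/6):

```python
# AUC as the fraction of (positive, negative) pairs that the predicted
# ranking orders correctly.
pos, neg = ["E", "B"], ["C", "A", "D"]     # assumed bipartite target
rhat = ["A", "B", "E", "C", "D"]           # predicted order, best first
pos_of = {x: i for i, x in enumerate(rhat)}

correct = sum(pos_of[p] < pos_of[n] for p in pos for n in neg)
print(correct / (len(pos) * len(neg)))  # 4/6 ~ 0.667
```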
Evaluating Multipartite Rankings

§ Multipartite rankings: like bipartite rankings, but the target ranking r consists of multiple relevance levels L = {1, …, l}, where l < c
  - a total ranking is the special case where each level has exactly one item
§ Number of preferences, with c_i the number of items in level i:

$$ \sum_{i<j} c_i \cdot c_j \;\le\; \frac{c^2}{2}\Big(1 - \frac{1}{l}\Big) $$

§ C-Index [Gönen & Heller, 2005]
  - straightforward generalization of the AUC
  - fraction of item pairs from different relevance levels that are ordered correctly by the predicted ranking
§ Example: D(r, r̂) = 3, C-Index(r, r̂) = 5/8
Evaluating Multipartite Rankings

C-Index
§ The C-index can be rewritten as a weighted sum of pairwise AUCs:

$$ \text{C-Index}(r,\hat r) = \frac{1}{\sum_{i<j} c_i c_j} \sum_{i<j} c_i c_j \cdot AUC(r_{i,j}, \hat r_{i,j}) $$

where r_{i,j} and r̂_{i,j} are the rankings r and r̂ restricted to levels i and j.

Jonckheere-Terpstra statistic
§ An unweighted sum of pairwise AUCs:

$$ \text{m-AUC} = \frac{2}{l(l-1)} \sum_{i<j} AUC(r_{i,j}, \hat r_{i,j}) $$

§ equivalent to the well-known multi-class extension of the AUC [Hand & Till, MLJ 2001]

Note: C-Index and m-AUC can be optimized by optimization of pairwise AUCs.
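A sketch of m-AUC as the unweighted mean of pairwise AUCs, in the spirit of Hand & Till; the three-level assignment below is illustrative, not taken from the slide:

```python
# m-AUC: average the AUC over all pairs of relevance levels.
from itertools import combinations

levels = {"E": 3, "B": 3, "C": 2, "A": 2, "D": 1}   # hypothetical levels
rhat = ["A", "B", "E", "C", "D"]                    # predicted order
pos_of = {x: i for i, x in enumerate(rhat)}

def pairwise_auc(hi, lo):
    pairs = [(h, l) for h in hi for l in lo]
    return sum(pos_of[h] < pos_of[l] for h, l in pairs) / len(pairs)

lvls = sorted(set(levels.values()), reverse=True)
groups = {l: [x for x in levels if levels[x] == l] for l in lvls}
aucs = [pairwise_auc(groups[i], groups[j]) for i, j in combinations(lvls, 2)]
print(sum(aucs) / len(aucs))  # mean of the pairwise AUCs
```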
Normalized Discounted Cumulative Gain [Järvelin & Kekäläinen, 2002]

§ The original formulation of (normalized) discounted cumulative gain refers to this multipartite setting:
  - the sum of the true relevance levels l_i of the items,
  - each item weighted by its rank in the predicted ranking:

$$ DCG(r,\hat r) = \sum_{i=1}^{c} \frac{l_i}{\log(i+1)} $$

§ Examples:
  - retrieval of relevant or irrelevant pages: 2 relevance levels
  - movie recommendation: 5 relevance levels
2d. Evaluation of Partial Rankings
Evaluating Partial Structures in the Predicted Ranking

§ For fixed types of partial structures, we have conventional measures
  - bipartite graphs → binary classification
    accuracy, recall, precision, F1, etc.; these can also be used when the items are labels (e.g., accuracy on the set of labels for multilabel classification)
  - multipartite graphs → ordinal classification
    multiclass classification measures (accuracy, error, etc.) and regression measures (sum of squared errors, etc.)
§ For general partial structures
  - some measures can be directly used on the reduced set of target preferences
    Kendall's distance, Gamma coefficient
  - we can also use set measures on the set of binary preferences
    both the source and the target ranking consist of a set of binary preferences
    e.g., Jaccard coefficient: size of intersection over size of union of the binary preferences in both sets
Gamma Coefficient

§ Key idea: normalized difference between
  - the number d of correctly ranked pairs
  - the number d̄ of incorrectly ranked pairs (Kendall's distance):

$$ \bar d = \big|\{(i,j) \mid r(x_i) < r(x_j) \wedge \hat r(x_i) > \hat r(x_j)\}\big| $$

§ Gamma Coefficient [Goodman & Kruskal, 1979]:

$$ \gamma(r,\hat r) = \frac{d - \bar d}{d + \bar d} \in [-1,1] $$

§ Example: γ(r, r̂) = (2 - 1)/(2 + 1) = 1/3
§ Identical to Kendall's tau if both rankings are total, i.e., if d + d̄ = c(c-1)/2
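A sketch of the gamma coefficient on sets of binary preferences; the two partial orders below are hypothetical, chosen so that the slide's value 1/3 comes out:

```python
# Gamma coefficient: only pairs ordered in BOTH rankings count.
target = {("A", "B"), ("B", "C"), ("A", "C")}   # target preferences
pred   = {("A", "B"), ("C", "B"), ("A", "C")}   # predicted preferences

concordant = len(target & pred)                          # d
discordant = len({(y, x) for x, y in target} & pred)     # d_bar
gamma = (concordant - discordant) / (concordant + discordant)
print(gamma)  # (2 - 1) / (2 + 1) = 1/3
```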
References

§ Cheng W., Rademaker M., De Baets B., Hüllermeier E.: Predicting Partial Orders: Ranking with Abstention. Proceedings ECML/PKDD-10(1): 215-230 (2010)
§ Fürnkranz J., Hüllermeier E., Vanderlooy S.: Binary Decomposition Methods for Multipartite Ranking. Proceedings ECML/PKDD-09(1): 359-374 (2009)
§ Gönen M., Heller G.: Concordance probability and discriminatory power in proportional hazards regression. Biometrika 92(4):965-970 (2005)
§ Goodman L., Kruskal W.: Measures of Association for Cross Classifications. Springer-Verlag, New York (1979)
§ Hand D.J., Till R.J.: A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Machine Learning 45(2):171-186 (2001)
§ Järvelin K., Kekäläinen J.: Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems 20(4):422-446 (2002)
§ Jonckheere A.R.: A distribution-free k-sample test against ordered alternatives. Biometrika: 133-145 (1954)
§ Kendall M.: A New Measure of Rank Correlation. Biometrika 30(1-2):81-89 (1938)
§ Mann H.B., Whitney D.R.: On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18:50-60 (1947)
§ Spearman C.: The proof and measurement of association between two things. American Journal of Psychology, 15:72-101 (1904)
AGENDA

1. Preference Learning Tasks
2. Performance Assessment and Loss Functions
3. Preference Learning Techniques
   a. Learning Utility Functions
   b. Learning Preference Relations
   c. Structured Output Prediction
   d. Model-Based Preference Learning
   e. Local Preference Aggregation
4. Conclusions
Two Ways of Representing Preferences

§ Utility-based approach: evaluating single alternatives
§ Relational approach: comparing pairs of alternatives, distinguishing weak preference, strict preference, indifference, and incomparability
Utility Functions

§ A utility function assigns a utility degree (typically a real number or an ordinal degree) to each alternative.
§ Learning such a function essentially comes down to solving an (ordinal) regression problem.
§ Often there are additional conditions, e.g., due to bounded utility ranges or monotonicity properties (→ learning monotone models).
§ A utility function induces a ranking (total order), but not the other way around! It cannot represent more general relations, e.g., a partial order!
§ The feedback can be direct (exemplary utility degrees given, i.e., absolute feedback) or indirect (inequalities induced by an order relation, i.e., relative feedback).
Predicting Utilities on Ordinal Scales

(Graded) multilabel classification and collaborative filtering both amount to predicting ordinal utility degrees; a key issue is exploiting dependencies (correlations) between items (labels, products, …)
→ see work in the MLC and RecSys communities
Learning Utility Functions from Indirect Feedback

§ A (latent) utility function can also be used to solve ranking problems, such as instance, object or label ranking
  → ranking by (estimated) utility degrees (scores)

Instance ranking: absolute preferences are given, so in principle an ordinal regression problem. However, the goal is to maximize ranking instead of classification performance.

Object ranking: find a utility function f that agrees as much as possible with the preference information, in the sense that, for most examples, f(x) > f(x′) whenever x ≻ x′.
Ranking versus Classification

§ A ranker can be turned into a classifier via thresholding: objects scoring above the threshold are classified as positive, the others as negative.
§ A good classifier is not necessarily a good ranker: in the slide's example, a scoring classifier makes only 2 classification errors but 10 ranking errors.
→ learning AUC-optimizing scoring classifiers!
RankSVM and Related Methods (Bipartite Case)

§ The idea is to minimize a convex upper bound on the empirical ranking error over a class of (kernelized) ranking functions. Schematically, with ℓ a convex upper bound on the 0/1 ranking loss, checked for all positive/negative pairs, and Ω a regularizer:

$$ \min_{f \in \mathcal{F}} \; \sum_{(x^+,\,x^-)} \ell\big(f(x^+) - f(x^-)\big) + \lambda \, \Omega(f) $$
RankSVM and Related Methods (Bipartite Case)

§ The bipartite RankSVM algorithm [Herbrich et al. 2000, Joachims 2002] instantiates this scheme with the hinge loss and, as regularizer, the norm of f in a reproducing kernel Hilbert space (RKHS) with kernel K
→ learning comes down to solving a QP problem
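A minimal sketch of the underlying idea: a linear scoring function trained with the pairwise hinge loss by stochastic subgradient descent (plain NumPy). A real RankSVM solves the corresponding QP, typically in the dual; this sketch only illustrates the objective:

```python
# Pairwise hinge loss ("RankSVM-style") training of a linear scorer.
import numpy as np

def rank_hinge_sgd(X_pos, X_neg, lam=0.01, epochs=1000, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X_pos.shape[1])
    for _ in range(epochs):
        p = X_pos[rng.integers(len(X_pos))]   # random positive example
        n = X_neg[rng.integers(len(X_neg))]   # random negative example
        margin = w @ (p - n)
        grad = lam * w - ((p - n) if margin < 1 else 0.0)
        w -= lr * grad
    return w  # rank new objects by descending w @ x
```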
  
RankSVM and Related Methods (Bipartite Case)

§ The bipartite RankBoost algorithm [Freund et al. 2003] instead searches the class of linear combinations of base functions
→ learning by means of boosting techniques
Learning Utility Functions for Label Ranking
Label Ranking: Reduction to Binary Classification [Har-Peled et al. 2002]

Each pairwise comparison between labels is turned into a binary classification example in a high-dimensional space: the preference becomes a positive example in the new instance space, and the learned model is an (m × k)-dimensional weight vector (m features, k labels).
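A sketch of this embedding, under the stated assumptions (m features, k labels; the function name and layout are illustrative):

```python
# Constraint-classification embedding: the preference "label i beats label j"
# on instance x becomes one binary example in a (k*m)-dimensional space.
import numpy as np

def embed(x, i, j, k):
    m = len(x)
    z = np.zeros(k * m)
    z[i*m:(i+1)*m] = x      # +x in the block of the preferred label
    z[j*m:(j+1)*m] = -x     # -x in the block of the dominated label
    return z                # positive example; -z serves as a negative one

# A linear classifier w on these examples encodes per-label utilities
# f_i(x) = w[i*m:(i+1)*m] @ x; sorting labels by f_i(x) yields the ranking.
```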
  
3b. Learning Preference Relations
Learning Binary Preference Relations

§ Learning binary preferences (in the form of predicates P(x, y)) is often simpler, especially if the training information is given in this form, too.
§ However, it implies an additional step, namely extracting a ranking from a (predicted) preference relation (inference).
§ This step is not always trivial, since a predicted preference relation may exhibit inconsistencies and may not suggest a unique ranking in an unequivocal way.

[Figure: a predicted 0/1 preference relation from which a ranking must be inferred]
Object Ranking: Learning to Order Things [Cohen et al. 99]

§ In a first step, a binary preference function PREF is constructed; PREF(x, y) ∈ [0, 1] is a measure of the certainty that x should be ranked before y, and PREF(x, y) = 1 - PREF(y, x).
§ This function is expressed as a linear combination of base preference functions.
§ The weights can be learned, e.g., by means of the weighted majority algorithm [Littlestone & Warmuth 94].
§ In a second step, a total order is derived, which is as much as possible in agreement with the binary preference relation.
Object Ranking: Learning to Order Things [Cohen et al. 99]

§ The weighted feedback arc set problem: find a permutation π such that the total weight of the pairwise preferences violated by π (the "backward" edges of the preference graph) becomes minimal.

[Figure: a weighted preference digraph; for the depicted ordering, cost = 0.1 + 0.6 + 0.8 + 0.5 + 0.3 + 0.4 = 2.7]
Object Ranking: Learning to Order Things [Cohen et al. 99]

§ Since this is an NP-hard problem, it is solved heuristically:

Input:  a set X and a preference function PREF(u, v)
Output: an ordering ρ of X
let V = X
for each v ∈ V do π(v) = Σ_{u∈V} PREF(v, u) - Σ_{u∈V} PREF(u, v)
while V ≠ ∅ do
    let t = arg max_{v∈V} π(v)
    append t to ρ; V = V \ {t}
    for each v ∈ V do π(v) = π(v) + PREF(t, v) - PREF(v, t)
endwhile

§ The algorithm successively chooses nodes having maximal "net-flow" within the remaining subgraph.
§ It can be shown to provide a 2-approximation to the optimal solution.
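A runnable version of the greedy net-flow heuristic (the dict-based PREF representation below is an assumption for illustration):

```python
# Greedy "net-flow" ordering in the spirit of Cohen et al. (1999).
# PREF is a dict mapping each ordered pair (u, v) to a weight in [0, 1].
def greedy_order(items, PREF):
    remaining = set(items)
    pot = {v: sum(PREF[v, u] - PREF[u, v] for u in remaining if u != v)
           for v in remaining}
    ranking = []
    while remaining:
        t = max(remaining, key=lambda v: pot[v])   # maximal net-flow
        ranking.append(t)
        remaining.remove(t)
        for v in remaining:                        # update potentials
            pot[v] += PREF[t, v] - PREF[v, t]
    return ranking  # best-first order

items = ["a", "b", "c"]
PREF = {("a", "b"): .9, ("b", "a"): .1, ("a", "c"): .8, ("c", "a"): .2,
        ("b", "c"): .6, ("c", "b"): .4}
print(greedy_order(items, PREF))  # ['a', 'b', 'c']
```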
  
Label Ranking: Learning by Pairwise Comparison (LPC) [Hüllermeier et al. 2008]

A separate binary model is trained for each pair of labels, using the instances for which a preference between the two labels is known.
Training data (for the label pair A and B):

  X1     X2   X3   X4    preferences                         A-vs-B class
  0.34   0    10   174   A ≻ B, B ≻ C, C ≻ D                 1
  1.45   0    32   277   B ≻ C                               -
  1.22   1    46   421   B ≻ D, B ≻ A, C ≻ D, A ≻ C          0
  0.74   1    25   165   C ≻ A, C ≻ D, A ≻ B                 1
  0.95   1    72   273   B ≻ D, A ≻ D                        -
  1.04   0    33   158   D ≻ A, A ≻ B, C ≻ B, A ≻ C          1

This yields the binary training set for the (A, B)-model:

  X1     X2   X3   X4    class
  0.34   0    10   174   1
  1.22   1    46   421   0
  0.74   1    25   165   1
  1.04   0    33   158   1
At prediction time, a query instance is submitted to all models, and the predictions are combined into a binary preference relation:

        A     B     C     D
  A     -    0.3   0.8   0.4
  B    0.7    -    0.7   0.9
  C    0.2   0.3    -    0.3
  D    0.6   0.1   0.7    -
Summing each row of this relation yields weighted votes: A = 1.5, B = 2.3, C = 0.8, D = 1.4. From the relation, a ranking is derived by means of a ranking procedure; in the simplest case, this is done by sorting the labels according to their sum of weighted votes: B ≻ A ≻ D ≻ C.
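A minimal sketch of this voting step, using the relation above:

```python
# LPC voting: R[i][j] is the predicted certainty that label i beats label j
# (values from the 4x4 relation above; the diagonal is omitted).
R = {"A": {"B": 0.3, "C": 0.8, "D": 0.4},
     "B": {"A": 0.7, "C": 0.7, "D": 0.9},
     "C": {"A": 0.2, "B": 0.3, "D": 0.3},
     "D": {"A": 0.6, "B": 0.1, "C": 0.7}}

votes = {lab: sum(row.values()) for lab, row in R.items()}
ranking = sorted(votes, key=votes.get, reverse=True)
print(ranking)  # ['B', 'A', 'D', 'C'], i.e., B > A > D > C
```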
  
3c. Structured Output Prediction
Structured Output Prediction [Bakir et al. 2007]

§ Rankings, multilabel classifications, etc. can be seen as specific types of structured (as opposed to scalar) outputs.
§ Discriminative structured prediction algorithms infer a joint scoring function on input-output pairs and, for a given input, predict the output that maximises this scoring function.
§ Joint feature map Φ(x, y) and scoring function f(x, y) = ⟨w, Φ(x, y)⟩.
§ The learning problem consists of estimating the weight vector, e.g., using structural risk minimization.
§ Prediction requires solving a decoding problem: ŷ = arg max_y f(x, y).
Structured Output Prediction [Bakir et al. 2007]

§ Preferences are expressed through inequalities on inner products: for a training pair (x, y) and competing outputs y′, the scoring function should satisfy ⟨w, Φ(x, y)⟩ ≥ ⟨w, Φ(x, y′)⟩ + Δ(y, y′), where Δ is a loss function.
§ The potentially huge number of constraints cannot be handled explicitly and calls for specific techniques (such as cutting plane optimization).
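A toy sketch of decoding by exhaustive enumeration for a small label set; the joint feature map `phi` below is hypothetical (any joint map could be plugged in), and enumeration only works for very few labels:

```python
# Decoding with a joint scoring function f(x, y) = <w, phi(x, y)>.
from itertools import permutations
import numpy as np

def phi(x, y, labels):
    # y is a tuple of labels, best first; weight x by 1/position per label
    F = np.zeros((len(labels), len(x)))
    for pos, lab in enumerate(y):
        F[labels.index(lab)] = x / (pos + 1)
    return F.ravel()

def decode(x, w, labels):
    return max(permutations(labels), key=lambda y: w @ phi(x, y, labels))

labels = ["A", "B", "C"]
x = np.array([0.5, 1.0])
w = np.random.default_rng(0).normal(size=len(labels) * len(x))
print(decode(x, w, labels))  # the highest-scoring ranking of the labels
```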
  
3d. Model-Based Preference Learning
Model-Based Methods for Ranking

§ Model-based approaches to ranking proceed from specific assumptions about the possible rankings (representation bias) or make use of probabilistic models for rankings (parametrized probability distributions on the set of rankings).
§ In the following, we shall see examples of both types:
  - restriction to lexicographic preferences
  - conditional preference networks (CP-nets)
  - label ranking using the Plackett-Luce model (see our talk tomorrow)
Learning Lexicographic Preference Models [Yaman et al. 2008]

§ Suppose that objects are represented as feature vectors of length m, and that each attribute has k values.
§ For n = k^m objects, there are n! permutations (rankings).
§ A lexicographic order is uniquely determined by
  - a total order of the attributes
  - a total order of each attribute domain
§ Example: four binary attributes (m = 4, k = 2)
  - there are 16! ≈ 2 · 10^13 rankings
  - but only 2^4 · 4! = 384 of them can be expressed in terms of a lexicographic order
§ [Yaman et al. 2008] present a learning algorithm that explicitly maintains the version space, i.e., the attribute orders compatible with all pairwise preferences seen so far (assuming binary attributes with 1 preferred to 0). Predictions are derived based on the "votes" of the consistent models.
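A minimal sketch of a lexicographic preference model over binary attributes, assuming (as on the slide) that value 1 is preferred to 0:

```python
# A lexicographic model is just an importance order over attribute indices.
def lex_prefers(a, b, attr_order):
    """True if object a is lexicographically preferred to object b."""
    for i in attr_order:            # most important attribute first
        if a[i] != b[i]:
            return a[i] > b[i]      # value 1 preferred to value 0
    return False                    # identical objects: no strict preference

# Example: attribute 2 dominates attribute 0, which dominates the rest
print(lex_prefers((1, 0, 1, 0), (1, 1, 0, 1), attr_order=[2, 0, 1, 3]))  # True
```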
Learning Conditional Preference (CP) Networks [Chevaleyre et al. 2010]

Compact representation of a partial order relation, exploiting conditional independence of preferences on attribute values. Example network over main dish → {drink, restaurant}:

  main dish:   meat > veggie > fish
  drink:       meat:   red wine > white wine
               veggie: red wine > white wine
               fish:   white wine > red wine
  restaurant:  meat:   Italian > Chinese
               veggie: Chinese > Italian
               fish:   Chinese > Italian

Induces a partial order relation, e.g.,

  (meat, red wine, Italian) > (meat, white wine, Chinese)
  (fish, white wine, Chinese) > (fish, red wine, Chinese)
  (meat, white wine, Italian) ? (meat, red wine, Chinese)   (incomparable)
Learning Conditional Preference (CP) Networks [Chevaleyre et al. 2010]

The network is learned from training data (possibly noisy), given as pairwise comparisons of complete objects, e.g.,

  (meat, red wine, Italian) > (veggie, red wine, Italian)
  (fish, white wine, Chinese) > (veggie, red wine, Chinese)
  (veggie, white wine, Chinese) > (veggie, red wine, Italian)
  …
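A possible encoding of the CP-net above (the dict layout and function name are illustrative, not from the paper):

```python
# Per attribute, a conditional preference table mapping the parent's value
# (here always the main dish, or None for the root) to an ordered value list.
cpnet = {
    "main dish": {None: ["meat", "veggie", "fish"]},
    "drink": {"meat":   ["red wine", "white wine"],
              "veggie": ["red wine", "white wine"],
              "fish":   ["white wine", "red wine"]},
    "restaurant": {"meat":   ["Italian", "Chinese"],
                   "veggie": ["Chinese", "Italian"],
                   "fish":   ["Chinese", "Italian"]},
}

def swap_improves(obj, attr, new_value):
    """One-step dominance: does changing `attr` to `new_value` improve obj?"""
    parent = None if attr == "main dish" else obj["main dish"]
    order = cpnet[attr][parent]
    return order.index(new_value) < order.index(obj[attr])

print(swap_improves({"main dish": "fish", "drink": "red wine",
                     "restaurant": "Chinese"}, "drink", "white wine"))  # True
```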
  
3e. Local Preference Aggregation (see our talk tomorrow)
Summary of Main Algorithmic Principles

§ Reduction of ranking to (binary) classification (e.g., constraint classification, LPC)
§ Direct optimization of (regularized) smooth approximations of ranking losses (RankSVM, RankBoost, …)
§ Structured output prediction, learning a joint scoring ("matching") function
§ Learning parametrized probabilistic ranking models (e.g., Plackett-Luce)
§ Restricted model classes, fitting (parametrized) deterministic models (e.g., lexicographic orders)
§ Local preference aggregation (lazy learning)
References

§ G. Bakir, T. Hofmann, B. Schölkopf, A. Smola, B. Taskar and S. Vishwanathan. Predicting structured data. MIT Press, 2007.
§ W. Cheng, K. Dembczynski and E. Hüllermeier. Graded Multilabel Classification: The Ordinal Case. ICML-2010, Haifa, Israel, 2010.
§ W. Cheng, K. Dembczynski and E. Hüllermeier. Label ranking using the Plackett-Luce model. ICML-2010, Haifa, Israel, 2010.
§ W. Cheng and E. Hüllermeier. Predicting partial orders: Ranking with abstention. ECML/PKDD-2010, Barcelona, 2010.
§ W. Cheng, C. Hühn and E. Hüllermeier. Decision tree and instance-based learning for label ranking. ICML-2009.
§ Y. Chevaleyre, F. Koriche, J. Lang, J. Mengin, B. Zanuttini. Learning ordinal preferences on multiattribute domains: The case of CP-nets. In: J. Fürnkranz and E. Hüllermeier (eds.) Preference Learning, Springer-Verlag, 2010.
§ W.W. Cohen, R.E. Schapire and Y. Singer. Learning to order things. Journal of Artificial Intelligence Research, 10:243-270, 1999.
§ Y. Freund, R. Iyer, R.E. Schapire and Y. Singer. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research, 4:933-969, 2003.
§ J. Fürnkranz, E. Hüllermeier, E. Mencia, and K. Brinker. Multilabel Classification via Calibrated Label Ranking. Machine Learning 73(2):133-153, 2008.
§ J. Fürnkranz, E. Hüllermeier and S. Vanderlooy. Binary decomposition methods for multipartite ranking. Proc. ECML-2009, Bled, Slovenia, 2009.
§ D. Goldberg, D. Nichols, B.M. Oki and D. Terry. Using collaborative filtering to weave an information tapestry. Communications of the ACM, 35(12):61-70, 1992.
§ S. Har-Peled, D. Roth and D. Zimak. Constraint classification: A new approach to multiclass classification. Proc. ALT-2002.
§ R. Herbrich, T. Graepel and K. Obermayer. Large margin rank boundaries for ordinal regression. Advances in Large Margin Classifiers, 2000.
§ E. Hüllermeier, J. Fürnkranz, W. Cheng and K. Brinker. Label ranking by learning pairwise preferences. Artificial Intelligence, 172:1897-1916, 2008.
§ T. Joachims. Optimizing search engines using clickthrough data. Proc. KDD 2002.
§ N. Littlestone and M.K. Warmuth. The weighted majority algorithm. Information and Computation, 108(2):212-261, 1994.
§ G. Tsoumakas and I. Katakis. Multilabel classification: An overview. Int. J. Data Warehousing and Mining, 3:1-13, 2007.
§ F. Yaman, T. Walsh, M. Littman and M. desJardins. Democratic Approximation of Lexicographic Preference Models. ICML-2008.
Conclusions
§ Preference learning is an emerging subfield of machine learning, with many applications and theoretical challenges.
§ Prediction of preference models instead of scalar outputs (as in classification and regression), hitherto with a focus on rankings.
§ Many existing machine learning problems can be cast in the framework of preference learning (→ preference learning „in a broad sense“).
§ „Qualitative“ alternative to conventional numerical approaches (a small sketch follows this slide):
- pairwise comparison instead of numerical evaluation,
- order relations instead of individual assessment.
§ Still many open problems (unified framework, predictions more general than rankings, incorporating numerical information, etc.)
§ Interdisciplinary field, connections to many other areas.
2
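To make the qualitative view concrete, here is a minimal Python sketch (an illustration under our own assumptions, not code from the tutorial; the function rank_by_votes and the triple-based input format are invented for this example). It aggregates weighted pairwise preference statements into a ranking by summing votes and sorting, in the spirit of the weighted-voting step of learning by pairwise comparison (LPC) described in Part 3.

from collections import defaultdict

def rank_by_votes(pairwise_prefs):
    # pairwise_prefs: iterable of (a, b, w), read as "a is preferred to b"
    # with weight/confidence w in [0, 1]. Returns the items ordered from
    # most to least preferred by their sum of weighted votes.
    votes = defaultdict(float)
    for a, b, w in pairwise_prefs:
        votes[a] += w        # a receives w votes from this comparison ...
        votes[b] += 1.0 - w  # ... and b the complementary share
    return sorted(votes, key=votes.get, reverse=True)

# The pairwise relation from the LPC example in Part 3; vote sums are
# B: 2.3, A: 1.5, D: 1.4, C: 0.8, reproducing the ranking B > A > D > C.
prefs = [("B", "A", 0.7), ("A", "C", 0.8), ("A", "D", 0.4),
         ("B", "C", 0.7), ("B", "D", 0.9), ("D", "C", 0.7)]
print(rank_by_votes(prefs))  # ['B', 'A', 'D', 'C']

Sorting by vote sums is only the simplest ranking procedure; as noted in Part 3, a predicted preference relation may be inconsistent, in which case extracting a ranking (e.g., by approximately minimizing the weighted feedback arc set) is harder.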
DS 2011 Tutorial on Preference Learning | Part 5 | J. Fürnkranz & E. Hüllermeier
Connections to Other Fields
3
[Diagram: Preference Learning at the center, connected to Recommender Systems, Multilabel Classification, Learning Monotone Models, Structured Output Prediction, Ranking in Information Retrieval, Ordinal Classification, Operations Research, Multiple Criteria Decision Making, Social Choice, and Economics & Decision Theory.]
DS 2011 Tutorial on Preference Learning | Part 5 | J. Fürnkranz & E. Hüllermeier
Edited Book on Preference Learning
4
Preference Learning: An Introduction
A Preference Optimization based Unifying Framework for Supervised Learning Problems
Part I – Label Ranking
Label Ranking Algorithms: A Survey
Preference Learning and Ranking by Pairwise Comparison
Decision Tree Modeling for Ranking Data
Co-regularized Least-Squares for Label Ranking
Part II – Instance Ranking
A Survey on ROC-Based Ordinal Regression
Ranking Cases with Classification Rules
Part III – Object Ranking
A Survey and Empirical Comparison of Object Ranking Methods
Dimension Reduction for Object Ranking
Learning of Rule Ensembles for Multiple Attribute Ranking Problems
Part IV – Preferences in Multiattribute Domains
Learning Lexicographic Preference Models
Learning Ordinal Preferences on Multiattribute Domains: the Case of CP-nets
Choice-Based Conjoint Analysis: Classification vs. Discrete Choice Models
Learning Aggregation Operators for Preference Modeling
Part V – Preferences in Information Retrieval
Evaluating Search Engine Relevance with Click-Based Metrics
Learning SVM Ranking Function from User Feedback Using Document Metadata and Active Learning in the Biomedical Domain
Part VI – Preferences in Recommender Systems
Learning Preference Models in Recommender Systems
Collaborative Preference Learning
Discerning Relevant Model Features in a Content-Based Collaborative Recommender System
J. Fürnkranz & E. Hüllermeier (eds.)
Preference Learning
Springer-Verlag 2010
DS 2011 Tutorial on Preference Learning | Part 5 | J. Fürnkranz & E. Hüllermeier
  
Edited Book on Preference Learning
5
J. Fürnkranz & E. Hüllermeier (eds.) Preference Learning, Springer-Verlag 2010
  
includes several introductions and survey articles
DS 2011 Tutorial on Preference Learning | Part 5 | J. Fürnkranz & E. Hüllermeier
Preference Learning Website
§ Working groups
§ Software
§ Data Sets
§ Workshops
§ Tutorials
§ Books
§ ...
6
http://www.preference-learning.org/
