Fcv poster parikh

Relative Attributes

Devi Parikh (TTIC) and Kristen Grauman (UT Austin)

1. Main Idea

Motivation:
- Categorical (binary) attributes are restrictive and can be unnatural.
  [Figure: example images that are hard to label in binary terms: "Natural ? Not Natural", "Smiling ? Not Smiling".]

Proposed idea: Relative Attributes
- Describe images or categories relatively, e.g. "dogs are furrier than giraffes", "find less congested downtown Chicago scene than [image]".
- Learn a ranking function for each attribute.

Enables new applications:
- Novel zero-shot learning from attribute comparisons
- Precise automatically generated textual descriptions of images
- Richer communication between humans and machines

2. Learning Relative Attributes

For each attribute $a_m$, supervision is
  $O_m: \{(i \succ j), \ldots\}$ (ordered image pairs)
  $S_m: \{(i \sim j), \ldots\}$ (similar image pairs)

Learn a scoring function $r_m(x_i) = w_m^T x_i$ that best satisfies constraints:
  $\forall (i, j) \in O_m: w_m^T x_i > w_m^T x_j$
  $\forall (i, j) \in S_m: w_m^T x_i = w_m^T x_j$

Max-margin learning to rank formulation (adapted objective from [Joachims, 2002]):
  $\min \left( \frac{1}{2} \|w_m^T\|_2^2 + C \left( \sum \xi_{ij}^2 + \sum \gamma_{ij}^2 \right) \right)$
  s.t. $w_m^T (x_i - x_j) \ge 1 - \xi_{ij}, \ \forall (i, j) \in O_m$;
       $|w_m^T (x_i - x_j)| \le \gamma_{ij}, \ \forall (i, j) \in S_m$;
       $\xi_{ij} \ge 0$; $\gamma_{ij} \ge 0$

3. Ranking Function vs. Binary Classifier Score

How do learned ranking functions ($w_m$) differ from classifier outputs ($w_b$)?

% correctly ordered pairs:
                    Classifier   Ranker
  Outdoor scenes    80%          89%
  Celebrity faces   67%          82%

4. Relative Zero-shot Learning

- Training: images from S seen categories and descriptions of U unseen categories
- Testing: categorize image into N (= S + U) categories
- Need not use all attributes
- Need not relate to all seen categories
- Infer image category using max-likelihood

[Figure: learnt relative attributes (Young, Smiling) and the resulting relative attributes space, with seen categories (S, C, H, M, Z) and unseen categories placed in it.]

5. Describing Images Relatively

Auto-generate textual description of: [image]

[Figure: learnt relative attributes and relative attributes space for the attribute Density, running from "Not dense" to "Dense", with images labelled by category (C, H, F, M, I).]

- Relative description: "more dense than [image], less dense than [image]", or in terms of categories, "more dense than Highways, less dense than Forests"
- Whereas conventional binary description: "not dense"

6. Datasets

- Outdoor Scene Recognition (OSR): 2688 images, 8 categories: coast (C), forest (F), highway (H), inside-city (I), mountain (M), open-country (O), street (S) and tall-building (T); gist features
- Public Figure Face (PubFig): 800 images, 8 categories: Alex Rodriguez (A), Clive Owen (C), Hugh Laurie (H), Jared Leto (J), Miley Cyrus (M), Scarlett Johansson (S), Viggo Mortensen (V) and Zac Efron (Z); gist and color features

OSR attributes (binary labels in category order T I S H C O M F):
  Attribute           Binary             Relative
  natural             0 0 0 0 1 1 1 1    T≺I∼S≺H≺C∼O∼M∼F
  open                0 0 0 1 1 1 1 0    T∼F≺I∼S≺M≺H∼C∼O
  perspective         1 1 1 1 0 0 0 0    O≺C≺M∼F≺H≺I≺S≺T
  large-objects       1 1 1 0 0 0 0 0    F≺O∼M≺I∼S≺H∼C≺T
  diagonal-plane      1 1 1 1 0 0 0 0    F≺O∼M≺C≺I∼S≺H≺T
  close-depth         1 1 1 1 0 0 0 1    C≺M≺O≺T∼I∼S∼H∼F

PubFig attributes (binary labels in category order A C H J M S V Z):
  Attribute           Binary             Relative
  Masculine-looking   1 1 1 1 0 0 1 1    S≺M≺Z≺V≺J≺A≺H≺C
  White               0 1 1 1 1 1 1 1    A≺C≺H≺Z≺J≺S≺M≺V
  Young               0 0 0 0 1 1 0 1    V≺H≺C≺J≺A≺S≺Z≺M
  Smiling             1 1 1 0 1 1 0 1    J≺V≺H≺A∼C≺S∼Z≺M
  Chubby              1 0 0 0 0 0 0 0    V≺J≺H≺C≺Z≺M≺S≺A
  Visible-forehead    1 1 1 0 1 1 1 0    J≺Z≺M≺S≺A∼C∼H∼V
  Bushy-eyebrows      0 1 0 1 0 0 0 0    M≺S≺Z≺V≺H≺A≺C≺J
  Narrow-eyes         0 1 1 0 0 0 1 1    M≺J≺S≺A≺H≺C≺V≺Z
  Pointy-nose         0 0 1 0 0 0 0 1    A≺C≺J∼M∼V≺S≺Z≺H
  Big-lips            1 0 0 0 1 1 0 0    H≺J≺V≺Z≺C≺M≺A≺S
  Round-face          1 0 0 0 1 1 0 0    H≺V≺J≺C≺Z≺A≺S≺M

7. Image Description Results

Human subject experiment: which image is
  more chubby than [image], less chubby than [image]?
  more smiling than [image], less smiling than [image]?
  more VisFHead than [image], less VisFHead than [image]?

[Plot: % correct image in top choices vs. # top choices, for Binary and Relative descriptions.]

Example descriptions (one entry per image):
- Binary: not natural, not open, perspective
  Relative: more natural than tallbuilding; less natural than forest; more open than tallbuilding; less open than coast; more perspective than tallbuilding
- Binary: not natural, not open, perspective
  Relative: more natural than insidecity; less natural than highway; more open than street; less open than coast; more perspective than highway; less perspective than insidecity
- Binary: natural, open, perspective
  Relative: more natural than tallbuilding; less natural than mountain; more open than mountain; less perspective than opencountry
- Binary: White, not Smiling, VisibleForehead
  Relative: more White than AlexRodriguez; more Smiling than JaredLeto; less Smiling than ZacEfron; more VisibleForehead than JaredLeto; less VisibleForehead than MileyCyrus
- Binary: White, not Smiling, not VisibleForehead
  Relative: more White than AlexRodriguez; less White than MileyCyrus; less Smiling than HughLaurie; more VisibleForehead than ZacEfron; less VisibleForehead than MileyCyrus
- Binary: not Young, BushyEyebrows, RoundFace
  Relative: more Young than CliveOwen; less Young than ScarlettJohansson; more BushyEyebrows than ZacEfron; less BushyEyebrows than AlexRodriguez; more RoundFace than CliveOwen; less RoundFace than ZacEfron

8. Zero-shot Learning Results

Baselines:
- Direct Attribute Prediction (DAP) [Lampert et al. 2009] (binary): $\hat{c}(x) = \arg\max_{c} \prod_{m} P(a_m \mid x)$
- Classifier instead of ranker (SRA)

[Plots: accuracy on OSR and PubFig for DAP, SRA and Proposed, one panel pair per experiment.]
- Number of unseen categories (x-axis: # unseen categories). Annotations: "classical recognition problem"; "binary ~ relative supervision".
- Amt. of labeled data to learn attributes (x-axis: # labeled pairs). Binary baseline supervision can give unique ordering on all classes.
- Amount of description (x-axis: # att to describe unseen). An attribute is more discriminative when used relatively.
- Quality of description (x-axis: looseness of constraints).
Plot annotation: "1/8 dataset".
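As a rough illustration of the max-margin learning-to-rank formulation (section 2) and of the "% correctly ordered pairs" measure (section 3), the sketch below fits one ranking function by plain gradient descent. It is not the authors' solver: it assumes the slack variables are substituted out of the objective (squared hinge on ordered pairs, squared gap on similar pairs), and the names `learn_ranker`, `fraction_correctly_ordered`, the step size and the toy features are all hypothetical.

```python
import numpy as np

def learn_ranker(X, ordered, similar, C=1.0, lr=0.01, n_iter=2000):
    """Fit w for one attribute by gradient descent (assumed solver).

    Minimises 0.5*||w||^2 + C*(sum xi^2 + sum gamma^2) with the slacks
    substituted out: xi = max(0, 1 - w.(x_i - x_j)) on ordered pairs,
    gamma = |w.(x_i - x_j)| on similar pairs.
    """
    X = np.asarray(X, dtype=float)
    dim = X.shape[1]
    # Difference vectors x_i - x_j for every supervised pair.
    D = np.array([X[i] - X[j] for i, j in ordered]).reshape(-1, dim)
    S = np.array([X[i] - X[j] for i, j in similar]).reshape(-1, dim)
    w = np.zeros(dim)
    for _ in range(n_iter):
        xi = np.maximum(0.0, 1.0 - D @ w)  # margin violations on O_m
        gamma = S @ w                       # signed gaps on S_m
        grad = w - 2.0 * C * (D.T @ xi) + 2.0 * C * (S.T @ gamma)
        w -= lr * grad
    return w

def fraction_correctly_ordered(w, X, ordered):
    """Share of ordered pairs (i, j) for which the ranker scores i above j."""
    X = np.asarray(X, dtype=float)
    return float(np.mean([X[i] @ w > X[j] @ w for i, j in ordered]))

# Toy features (made up): the first coordinate carries the attribute
# strength, the second is a nuisance dimension.
X = np.array([[0.0, 0.5], [1.0, 0.2], [2.0, 0.9], [3.0, 0.1], [3.0, 0.8]])
ordered = [(1, 0), (2, 1), (3, 2), (4, 2)]  # image i has more than image j
similar = [(3, 4)]                          # images 3 and 4 are about equal

w = learn_ranker(X, ordered, similar)
scores = X @ w
acc = fraction_correctly_ordered(w, X, ordered)
```

Substituting the slacks keeps the sketch short; with squared slack penalties the constrained and unconstrained forms share the same minimiser, so `w` here approximates the solution of the stated objective on the toy pairs.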

Relative attributes jointly carve out space for an unseen category.
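One way to picture this, together with the max-likelihood inference step of section 4, is sketched below. The poster text does not spell out the generative model, so everything specific here is an assumption made for illustration: one Gaussian per category in the space of ranker scores, an unseen category's mean placed midway between the two seen categories that bound it on each attribute, and its covariance taken as the average of the seen covariances. The category names "A", "B", "U" and all function names are hypothetical.

```python
import numpy as np

def fit_seen(scores_by_cat):
    """Mean and covariance of ranker scores for each seen category."""
    return {c: (s.mean(axis=0), np.cov(s, rowvar=False))
            for c, s in scores_by_cat.items()}

def place_unseen(seen, bounds):
    """Place an unseen category from a relative description.

    bounds[m] = (lower, upper) means lower < unseen < upper on attribute m.
    Assumption: midpoint mean, averaged covariance.
    """
    mu = np.array([(seen[lo][0][m] + seen[hi][0][m]) / 2.0
                   for m, (lo, hi) in enumerate(bounds)])
    cov = np.mean([c for _, c in seen.values()], axis=0)
    return mu, cov

def log_gauss(x, mu, cov):
    """Log-density of a multivariate Gaussian at x."""
    d = x - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ np.linalg.solve(cov, d) + logdet
                   + len(x) * np.log(2.0 * np.pi))

def classify(x, models):
    """Max-likelihood category for a point in relative attribute space."""
    return max(models, key=lambda c: log_gauss(x, *models[c]))

# Toy ranker scores on two attributes for two seen categories (made up).
rng = np.random.default_rng(0)
seen = fit_seen({
    "A": rng.normal([0.0, 0.0], 0.5, size=(50, 2)),
    "B": rng.normal([4.0, 4.0], 0.5, size=(50, 2)),
})

# Unseen category "U" described as A < U < B on both attributes.
models = dict(seen)
models["U"] = place_unseen(seen, [("A", "B"), ("A", "B")])

label = classify(np.array([2.1, 1.9]), models)
```

No image of "U" is used at any point: its region comes entirely from the comparisons against seen categories, and adding more attributes to the description narrows that region further.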
