Scenes From Video Workshop Talk

What’s so good about pieces, Lego and understanding?
Anton van den Hengel
Australian Centre for Visual Technologies (ACVT)
The University of Adelaide
South Australia

It has been a theme …

"the perception of solid objects is a process which can be based on the
properties of three-dimensional transformations and the laws of nature"
Larry Roberts (1965)

Structure and semantics interact

Structure and geometry interact

Developmental changes in response to drought
The escape response of Clipper under drought is reflected in an earlier time of absolute maximum growth
[Figure: absolute growth rate [mm2 d-1] vs time after sowing [d], well watered and drought; annotations at 46 d after sowing and 39 d after sowing. Boris Parent, ACPFG]

Morphological changes in response to drought
The reduced number of tillers under drought is reflected in the area/height ratio
[Figure: relative ratio of shoot area / height vs time after sowing [d], well watered and drought; barley cv Clipper. Boris Parent, ACPFG]

Deep reasoning
• Try to explain as much as possible
• Fine-grained and detailed
• Deep semantics
• And the implied constraints

Shape is only an intermediate step

Silhouettes
• We’re only interested in shape (at least for now)

Deconstruction
• Render every possible building block in every possible position, and recover the silhouette of each
• Then reconstruct object silhouettes from these templates
• Requires enough camera information to achieve this

Template shapes
• nTemplates = nShapes x nPositions x nRotations
• So there are lots of them
• But they are sparsely used
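To get a feel for how quickly the template count grows, here is a minimal sketch of the product nTemplates = nShapes x nPositions x nRotations. The specific counts are illustrative assumptions, not figures from the talk.

```python
def n_templates(n_shapes: int, n_positions: int, n_rotations: int) -> int:
    """Size of the template set: one template per shape, position and rotation."""
    return n_shapes * n_positions * n_rotations

# Hypothetical numbers: 10 block shapes, a 50 x 50 x 20 grid of positions,
# and 4 rotations per block.
total = n_templates(n_shapes=10, n_positions=50 * 50 * 20, n_rotations=4)
print(total)
```

Even these modest assumed values give millions of templates, while any single object uses only a handful of them.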

Sparse recovery
• alpha: a vector of binary template coefficients
• Pi: a matrix with one template silhouette per column
• y: the silhouette of the shape to be recovered
• NP-hard and fragile
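The sketch below illustrates why exact binary sparse recovery is combinatorial and fragile. It assumes an additive composition y = Pi alpha and searches subsets by brute force; the toy templates and the function names are hypothetical, and this is not the solver used in the talk.

```python
from itertools import combinations

def compose(templates, chosen):
    """Additive composition y = Pi * alpha for a binary alpha (the tuple `chosen`)."""
    n_pixels = len(templates[0])
    return [sum(templates[j][p] for j in chosen) for p in range(n_pixels)]

def sparsest_exact_fit(templates, y):
    """Try subsets in order of increasing size; return the first that reproduces y."""
    for k in range(len(templates) + 1):
        for chosen in combinations(range(len(templates)), k):
            if compose(templates, chosen) == y:
                return set(chosen)
    return None  # no subset reproduces y exactly

# Toy data (hypothetical): each template is a flattened 6-pixel silhouette.
templates = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
]
y = [1, 1, 1, 1, 1, 1]
print(sparsest_exact_fit(templates, y))

# Flipping a single pixel leaves no exact solution at all.
y_noisy = [1, 1, 1, 1, 1, 0]
print(sparsest_exact_fit(templates, y_noisy))
```

The search visits up to 2^N subsets for N templates, and one changed pixel is enough to make the exact problem infeasible.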

Sparse recovery – L_1 norm
• But there may still be millions of templates, and they’re enormous (|Pixels| x |Images|)
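In the notation used here (alpha, Pi, y), the textbook way to write the combinatorial problem and its L_1 relaxation is shown below. This is the standard form, assumed for illustration; the talk's exact objective and constraints may differ, for example by tolerating noise.

```latex
% Combinatorial problem (NP-hard): the fewest templates that reproduce the silhouette
\min_{\alpha \in \{0,1\}^{N}} \|\alpha\|_0 \quad \text{subject to} \quad y = \Pi \alpha

% Convex relaxation: replace the count with the L_1 norm and let alpha be real-valued
\min_{\alpha \in \mathbb{R}^{N}} \|\alpha\|_1 \quad \text{subject to} \quad y = \Pi \alpha
```

Here N is the number of templates, so Pi has N columns and one row per pixel per image.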

Sparse recovery – Random projections
• Random projection by D x S matrix Phi, D << S
• Phi is sparsely sampled from N(0,1)
• But there are still too many templates
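A minimal sketch of the random projection step, assuming "sparsely sampled" means that each entry of Phi is drawn from N(0,1) with some small probability and is zero otherwise. The sizes, the density and the helper names are hypothetical. Because projection is linear, the same alpha still satisfies the smaller system Phi y = (Phi Pi) alpha.

```python
import random

def sparse_gaussian_matrix(d, s, density, rng):
    """D x S matrix: each entry is N(0,1) with probability `density`, else 0."""
    return [[rng.gauss(0.0, 1.0) if rng.random() < density else 0.0
             for _ in range(s)] for _ in range(d)]

def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def matmul(a, b):
    """Multiply a (D x S) by b (S x N), both stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]

rng = random.Random(0)
S, N, D = 200, 5, 20  # pixels, templates, projected dimension (D << S)

# Hypothetical binary template matrix Pi (S x N) and a 2-sparse binary alpha.
Pi = [[1 if rng.random() < 0.1 else 0 for _ in range(N)] for _ in range(S)]
alpha = [1, 0, 0, 1, 0]
y = matvec(Pi, alpha)

Phi = sparse_gaussian_matrix(D, S, density=0.1, rng=rng)
y_small = matvec(Phi, y)     # D measurements instead of S pixels
Pi_small = matmul(Phi, Pi)   # D x N instead of S x N

# Linearity: the projected system is satisfied by the same alpha.
residual = max(abs(a - b) for a, b in zip(y_small, matvec(Pi_small, alpha)))
print(len(y_small), len(Pi_small), len(Pi_small[0]), residual < 1e-9)
```

The projection shrinks the number of rows from S to D but leaves the number of columns (templates) unchanged.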

Sparse recovery - Cropping
• Eliminate templates with a footprint that extends significantly beyond that of the object
• Reduces the number of templates by at least an order of magnitude
• Down to between tens and tens of thousands of templates
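The cropping rule can be sketched as a simple filter on template footprints. The threshold that defines "significantly beyond" is an assumption here (10% of the template's pixels), as are the function name and the toy data.

```python
def crop_templates(templates, object_pixels, max_outside_fraction=0.1):
    """Return indices of templates whose footprint lies mostly inside the object.

    templates: list of sets of pixel indices (template footprints).
    object_pixels: set of pixel indices in the object silhouette.
    """
    kept = []
    for idx, footprint in enumerate(templates):
        if not footprint:
            continue  # an empty footprint cannot explain any pixels
        outside = len(footprint - object_pixels)
        if outside / len(footprint) <= max_outside_fraction:
            kept.append(idx)
    return kept

object_pixels = set(range(0, 20))
templates = [
    set(range(0, 10)),    # entirely inside the object
    set(range(15, 25)),   # half outside
    set(range(10, 21)),   # 1 of 11 pixels outside
    set(range(30, 40)),   # entirely outside
]
print(crop_templates(templates, object_pixels))
```

A tolerance above zero keeps templates that overhang the silhouette slightly, which matters when the silhouette itself is noisy.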

Binarising the solution
• Solutions are not binary
• Randomly generate binary hypotheses from non-binary alpha
• Evaluate using an accurate composition model
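One way to sketch the binarisation step: treat each clipped coefficient as the probability of switching that template on, sample hypotheses, and score each with a union (logical OR) composition model. Both the sampling scheme and the scoring function are assumptions made for illustration; the talk does not specify them.

```python
import random

def union_silhouette(templates, chosen):
    """Composition by union: a pixel is foreground if any chosen template covers it."""
    covered = set()
    for j in chosen:
        covered |= templates[j]
    return covered

def binarise(alpha, templates, target, n_hypotheses=200, seed=0):
    """Sample binary hypotheses with P(alpha_j = 1) = clipped alpha_j; keep the best."""
    rng = random.Random(seed)
    probs = [min(1.0, max(0.0, a)) for a in alpha]
    best, best_err = None, None
    for _ in range(n_hypotheses):
        chosen = [j for j, p in enumerate(probs) if rng.random() < p]
        # Error = number of pixels where hypothesis and target silhouettes differ.
        err = len(union_silhouette(templates, chosen) ^ target)
        if best_err is None or err < best_err:
            best, best_err = chosen, err
    return best, best_err

# Toy data (hypothetical): footprints as pixel-index sets.
templates = [{0, 1, 2}, {2, 3, 6}, {3, 4, 5}, {7, 8}]
target = {0, 1, 2, 3, 4, 5}
alpha = [0.9, 0.3, 0.8, 0.05]  # non-binary solution from the relaxed problem
best, best_err = binarise(alpha, templates, target)
print(best, best_err)
```

Scoring with a union model rather than a sum means overlapping templates are not penalised twice for the same pixel.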

Results
[Figure: fraction of true leaves recovered vs number of templates; series: Max, Search, Viable]

Results
[Figure: fraction of pixels explained vs noise level (fraction of pixels changed); series: Max, Search]

Composition problems
• Not a true model of silhouette formation
• So doesn’t deal well with template overlap
• Working on this by subtracting overlaps, graph-based approaches
• Somewhat overcome by…

Inequality
• Isn’t physically accurate for foreground pixels, so split
• Background (0) pixels
• And foreground pixels

Practicality again
• Only interested in the number of pixels outside the object silhouette, not their location
• So not … but …
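The point about counting outside pixels rather than locating them can be sketched as follows: instead of one constraint per background pixel, each template contributes a single count, and a selection is scored by one dot product. The names and data are hypothetical, and the exact expressions used in the talk are not reproduced here.

```python
def outside_counts(templates, object_pixels):
    """For each template, the number of its pixels that fall outside the silhouette."""
    return [len(footprint - object_pixels) for footprint in templates]

def total_outside(counts, alpha):
    """One scalar, counts . alpha, in place of one value per background pixel."""
    return sum(c * a for c, a in zip(counts, alpha))

object_pixels = set(range(10))
templates = [{0, 1, 2}, {8, 9, 10}, {12, 13}]
counts = outside_counts(templates, object_pixels)
print(counts, total_outside(counts, [1, 1, 0]))
```

Collapsing the background rows this way reduces them to a single row, whatever the image size.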

Practicality again
• Want to ensure that …
• Need to project to a lower dimension
• But Phi_I must have only positive elements

A better model of composition
• Left with …

Constraints - Intersection
• Form J where every row represents a constraint
• If templates i and k intersect then insert a row in J with only elements i and k set to 1
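A minimal sketch of building J from pairwise intersections. Templates are represented here as sets of occupied grid cells, which is an assumption, as is the reading that the constraint is applied as J alpha <= 1 (at most one template from each intersecting pair); the right-hand side is not stated above.

```python
from itertools import combinations

def intersection_constraints(templates):
    """One row per intersecting pair (i, k): 1 in columns i and k, 0 elsewhere."""
    n = len(templates)
    rows = []
    for i, k in combinations(range(n), 2):
        if templates[i] & templates[k]:  # the two blocks share at least one cell
            row = [0] * n
            row[i] = row[k] = 1
            rows.append(row)
    return rows

def satisfies(J, alpha):
    """Assumed constraint form: every row of J alpha is at most 1."""
    return all(sum(j * a for j, a in zip(row, alpha)) <= 1 for row in J)

# Toy blocks (hypothetical) as sets of occupied (x, y, z) cells.
templates = [
    {(0, 0, 0), (1, 0, 0)},
    {(1, 0, 0), (2, 0, 0)},
    {(5, 5, 5)},
]
J = intersection_constraints(templates)
print(J, satisfies(J, [1, 0, 1]), satisfies(J, [1, 1, 0]))
```

Each row touches only two columns, so J stays very sparse even when there are many constraints.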

Constraints - Support
• Form K where every row represents a constraint
• If template i needs support t, set K_ii = t
• If template j provides s support to template i, then K_ij = -s
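A sketch of assembling K and checking a selection against it. It assumes the constraint is applied as K alpha <= 0, so that a selected template i needs its selected supporters to supply at least t in total; that reading, the function names and the toy values are assumptions rather than details from the talk.

```python
def support_constraints(n, needs, provides):
    """Build the n x n matrix K.

    needs: dict i -> t, the support that template i requires (sets K[i][i] = t).
    provides: dict (i, j) -> s, support that template j gives template i
              (sets K[i][j] = -s).
    """
    K = [[0.0] * n for _ in range(n)]
    for i, t in needs.items():
        K[i][i] = t
    for (i, j), s in provides.items():
        K[i][j] = -s
    return K

def is_supported(K, alpha):
    """Assumed constraint form: every entry of K alpha is at most 0."""
    return all(sum(k * a for k, a in zip(row, alpha)) <= 0 for row in K)

# Toy stack (hypothetical): block 0 sits on the ground, block 1 rests on block 0,
# block 2 rests half on block 0 and half on block 1.
K = support_constraints(
    n=3,
    needs={1: 1.0, 2: 1.0},
    provides={(1, 0): 1.0, (2, 0): 0.5, (2, 1): 0.5},
)
print(is_supported(K, [1, 1, 1]), is_supported(K, [0, 1, 0]), is_supported(K, [1, 0, 1]))
```

Rows for templates that need no support are all zero and are therefore always satisfied.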

Measurement benefit tails off
[Figure: accuracy vs noise for varying numbers of measurements. Y axis: accuracy (fraction of true blocks recovered); x axis: noise level (added to camera extrinsics); legend: 1, 49, 441, 1225, 2401, 3969, 5929, 8281, 11025]

Limitations
• One template per value per parameter
• Fixable?
