Wearable Computer Vision
Giovanni Maria Farinella
www.dmi.unict.it/farinella
gfarinella@dmi.unict.it
Bush’s Memex, 1945
“Certainly progress in photography is not going to stop. […]
Let us project this trend ahead to a logical, if not inevitable,
outcome. The camera hound of the future wears on his
forehead a lump a little larger than a walnut.”
https://www.youtube.com/watch?v=c539cK58ees
Wearable Computer Vision: The Goal
Clip from the movie Terminator 2: Judgment Day: https://youtu.be/9MeaaCwBW28
Ref: https://www.redsharknews.com/vr_and_ar/item/3539-terminator-2-vision-the-augmented-reality-standard-for-25-years
What is my research group doing?
Three Fundamental Tasks of a First Person Vision System
WHERE?
(localization)
WHAT?
(understanding)
WHAT’S NEXT?
(anticipation)
Shopping Cart Localization – Demo
https://iplab.dmi.unict.it/EgocentricShoppingCartLocalization/#demo
Emiliano Spera, Antonino Furnari, Sebastiano Battiato, Giovanni Maria Farinella (2019). EgoCart: a Benchmark Dataset for Large-Scale Indoor Image-Based Localization in Retail Stores. IEEE Transactions on Circuits and Systems for Video Technology
Dataset Creation – ‘Classic’ Computer Vision!
Structure from Motion (SfM)
Images → 3D Model → camera poses (P, Q)
Attach the estimated 6DOF pose to each image
Arbitrary Coordinate System (pose/scale) → rotated poses → scaled/aligned poses
PCA
E. Spera, A. Furnari, S. Battiato, G. M. Farinella, Egocentric Shopping Cart Localization, International Conference on Pattern Recognition (ICPR), 2018
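A minimal sketch of the alignment step, assuming PCA is used to rotate the SfM camera positions onto their principal axes and that a single known metric distance between two cameras fixes the scale. The function name, its parameters, and the scaling rule are illustrative assumptions, not the procedure from the cited papers.

```python
import numpy as np


def align_positions_with_pca(positions, known_distance, idx_a, idx_b):
    """Rotate camera positions onto their principal axes, then rescale.

    positions: (N, 3) camera centres in the arbitrary SfM coordinate system.
    known_distance: metric distance between cameras idx_a and idx_b
        (hypothetical reference measurement used only to fix the scale).
    """
    positions = np.asarray(positions, dtype=float)
    centered = positions - positions.mean(axis=0)

    # Principal axes are the right singular vectors of the centred data,
    # ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    if np.linalg.det(vt) < 0:
        vt[-1] *= -1.0  # keep a proper rotation (no mirroring)
    rotated = centered @ vt.T

    # Fix the unknown SfM scale with the single reference distance.
    current = np.linalg.norm(rotated[idx_a] - rotated[idx_b])
    return rotated * (known_distance / current)
```

For cameras moving on a store floor, the last coordinate of the output carries the least variance, so the first two coordinates can serve as 2D positions on the floor plan.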
VEDI - Vision Exploitation for Data Interpretation
Patent Pending – See all Videos Here: https://iplab.dmi.unict.it/VEDI_project/
Where am I?
Visitor
Site Manager
Computer Vision and Machine Learning
• What are the interesting sites for users with Profile A?
• Users who see X also observe Y
• Do we have to reorganize the museum spaces?
What objects have been seen by the visitors?
How Long?
No need for surveys!
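As a rough illustration of the "How Long?" question, the sketch below turns per-frame recognition output into viewing time per object. It assumes one label per video frame (or None when no object of interest is observed) and a known frame rate; the function name and input format are hypothetical and do not describe the VEDI system itself.

```python
from collections import Counter


def dwell_times(frame_labels, fps):
    """Return seconds of observation per object from per-frame labels.

    frame_labels: iterable with one object label per frame, None for
        frames in which no object of interest is observed (assumed format).
    fps: frame rate of the wearable camera video.
    """
    counts = Counter(label for label in frame_labels if label is not None)
    return {label: n / fps for label, n in counts.items()}
```

For example, 60 frames labelled "statue" at 30 fps correspond to 2 seconds of observation.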
Clustered Paths - Profile A
Clustered Paths - Profile B
RI-VEDI Salient Moments
See Details!
Understanding Visitor Behaviour through Key Performance Indicators and Visual Analytics
Providing Services
Localization
Visual Attention
Object Recognition
Augmented Reality
Personal Recommendation
Storage/Memories/Summary
Behaviour Analysis
First Person Vision
The Role of Data
Where and What
The VALUE from (Big) Visual Data
https://youtu.be/Cu-pCrLHeZw
Where, What and What’s Next?
?
past future
Washing Hands
The Role of Data
https://www.youtube.com/watch?v=Dj6Y3H0ubDw&feature=youtu.be
[Figure: word cloud of kitchen-related verbs and nouns, e.g. open, turn, cut, put, wash, take, pan, tap, pot, board, food, knife]
Scaling Egocentric Vision
Data Collection
Native Environment, Natural Interactions
Live Narrations
Dense Action Segments
Active Object Bounding Boxes
Benchmark and Challenges
11M Frames
32 kitchens
Single-person environments
4 cities
May – Nov 2017 – 55 hours
10 nationalities
3 days - all kitchen activities
Annotations Statistics
Annotations – Object Bounding Boxes
Open Challenges
Object Detection Challenge (34.18)
Action Recognition Challenge (63.59)
Action Anticipation Challenge (35.13)
RULSTM
Antonino Furnari, Giovanni Maria Farinella, What Would You Expect? Anticipating Egocentric Actions with Rolling-Unrolling LSTMs and Modality
Attention. International Conference on Computer Vision (ICCV) 2019 - ORAL. Code available at: http://iplab.dmi.unict.it/rulstm/
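To give a feel for the "Modality Attention" idea in the title, here is a minimal late-fusion sketch: each modality produces class scores, and a softmax over per-modality attention logits decides how much each modality contributes. In this sketch the logits are passed in by hand as a hypothetical input rather than learned, and nothing here reproduces the authors' code, which is available at the link above.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1D array."""
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()


def fuse_with_attention(modality_scores, attention_logits):
    """Fuse per-modality class scores with attention weights.

    modality_scores: (M, C) array, one row of class scores per modality.
    attention_logits: (M,) array of unnormalised modality scores
        (hypothetical input; a real system would predict these).
    Returns the fused (C,) scores and the (M,) attention weights.
    """
    scores = np.asarray(modality_scores, dtype=float)
    weights = softmax(attention_logits)
    fused = weights @ scores  # weighted sum over modalities
    return fused, weights
```

With equal logits the modalities are averaged; a large logit for one modality makes the fused prediction follow that modality.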
Demo Video: Egocentric Action Anticipation
Thank you for your attention
Giovanni Maria Farinella
www.dmi.unict.it/farinella
gfarinella@dmi.unict.it
