Image Processing Group
Universitat Politècnica de Catalunya (UPC)
Xavier Giro-i-Nieto, “Multimedia Annotation”. Dublin City University (04/04/2016)
Multimedia Annotation
Lecturer: Xavier Giro-i-Nieto
Version 2016/1
@DocXavi
Densely linked slides
Introduction
Xavier Giro-i-Nieto
• Web: https://imatge.upc.edu/web/people/xavier-giro
Associate Professor at Universitat Politecnica de Catalunya (UPC)
Acknowledgements
Alan Smeaton, Professor at Dublin City University
Cathal Gurrin, Professor at Dublin City University
Horst Eidenberger, Professor at Vienna University of Technology
Outline
1. Motivation
2. Architecture
3. Metadata
4. Manual vs Automatic Annotation
Motivation
In previous lectures, you have learned how text retrieval works.
Motivation
This lecture extends retrieval to any type of multimedia document.
Motivation
Exponential increase of generated multimedia content...
Motivation
...keeping a record of the memorable personal moments...
Pope Francis @ Philippines, 2015 (Source: AP Photo/Bullit Marquez)
Motivation
...keeping a record of the memorable personal moments...
Pope Francis @ Ecuador, 2015 (Source: AP)
Motivation
…(or not).
Pope Francis @ USA, 2015
Motivation
This data growth is driven by ubiquitous mobile access to...
Motivation
...the Internet (for visual data transmission)...
Source: Cisco Visual Networking Index (VNI)
Motivation
...and people!
Time magazine's Person of the Year (2006)
Motivation
And it will keep growing with wearable devices...
Motivation
...that will generate a permanent memory record of our lives...
Black Mirror, “The Entire History of You” (Season 1, Episode 3)
Motivation
...so that the challenge is to index and retrieve these data...
Motivation
...in the most user-friendly fashion.
Source: Si Liu, http://dx.doi.org/10.1109/CVPR.2012.6248071 (2012)
Motivation
The challenge is accessing very large multimedia repositories.
Motivation
Open question: How do you store and retrieve your photos?
Outline
1. Motivation
2. Architecture
3. Metadata
4. Manual vs Automatic Annotation
Architecture
Three basic stages of the visual indexing process
Personal collections: Production, Upload, Retrieval
Architecture
Three basic stages of the visual indexing process
Professional broadcasting:
• Capture: Digital multimedia data recording
• Storage: Indexing in a database
• Retrieval: Search based on the descriptive metadata
Architecture
• Example: CCMA Digiton (TV3, Public Catalan TV)
Architecture
The contents are stored in repositories and indexed in databases.
Slide credit: Emili Bonilla
http://gps-tsc.upc.es/imatge/_Xgiro/teaching/thesis/2007-2008/EmiliBonilla.pdf
[Diagram labels: Client, Network, Server, Search engine, engine, Metadata, Multimedia, Content]
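The split between repository and database can be sketched with a toy in-memory setup: the content stays in a repository (here, just file paths), while a separate index maps metadata terms to content identifiers. This is only an illustration; the names `repository`, `index_item` and `search`, and the sample files, are hypothetical.

```python
from collections import defaultdict

# Hypothetical repository: content identifier -> where the media file is stored.
repository = {}
# Hypothetical index: metadata term -> set of content identifiers.
index = defaultdict(set)

def index_item(item_id, path, tags):
    """Store the content location and index it under each metadata tag."""
    repository[item_id] = path
    for tag in tags:
        index[tag.lower()].add(item_id)

def search(*tags):
    """Return the stored paths of items annotated with all the given tags."""
    if not tags:
        return []
    ids = set.intersection(*(index.get(t.lower(), set()) for t in tags))
    return sorted(repository[i] for i in ids)

index_item(1, "videos/news_2016_04_04.mp4", ["news", "Dublin"])
index_item(2, "photos/beach.jpg", ["beach", "holiday"])
index_item(3, "videos/news_2016_04_05.mp4", ["news", "Barcelona"])

print(search("news"))
print(search("news", "dublin"))
```

The search only touches the index; the repository is consulted at the end to locate the matching files.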
Outline
1. Motivation
2. Architecture
3. Metadata
4. Manual vs Automatic Annotation
Metadata
Metadata describe the content and enable its search and retrieval.
[Diagram labels: Client, Network, Server, Search engine, engine, Metadata, Multimedia, Content]
Metadata
• Example: Dublin Core
Source: B. Haslhofer, W. Klas: http://dx.doi.org/10.1145/1667062.1667064
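As a rough illustration, a Dublin Core description is a set of simple element/value pairs. The sketch below fills four of the standard elements (`title`, `creator`, `date`, `language`) with the data of this lecture's title slide and renders them as `dc:` elements. The language value and the helper name `to_dublin_core_xml` are assumptions made for the example.

```python
from xml.sax.saxutils import escape

# Values taken from this lecture's title slide; "language" is an assumption.
record = {
    "title": "Multimedia Annotation",
    "creator": "Xavier Giro-i-Nieto",
    "date": "2016-04-04",
    "language": "en",
}

def to_dublin_core_xml(record):
    """Render each field as a <dc:...> element, one per line."""
    return "\n".join(
        "<dc:{0}>{1}</dc:{0}>".format(name, escape(value))
        for name, value in record.items()
    )

print(to_dublin_core_xml(record))
```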
Metadata
Source: University of Oregon
Metadata
Metadata
• Multiple options depending on their semantic level:
Level    Nature       Example
High     Words        Tags, keywords, title, author...
Medium   Sensor       Geolocation, date, time, size...
Low      Perceptual   (video) Colour, texture, shape...
                      (audio) Pitch, frequency...
Metadata
• Example:
Text: “negre”, “black”
Mean colour: R=0, G=0, B=0
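The mean colour in the example above is a typical low level descriptor: it is trivial for a computer to measure, whereas the words are what a person would write. A minimal sketch, assuming the image is given as a flat list of (R, G, B) tuples; the function name `mean_colour` is hypothetical.

```python
def mean_colour(pixels):
    """Average the R, G and B channels over a list of (R, G, B) pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

# A hypothetical 2x2 all-black image, flattened to a list of pixels.
black_image = [(0, 0, 0)] * 4
low_level = mean_colour(black_image)   # what a computer measures
high_level = ["black", "negre"]        # what a person would write
print(low_level, high_level)
```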
Metadata
• Applications: Browsing by geolocation.
Metadata
• Applications: Grouping photos in events with metadata.
D. Manchon-Vizuete, Gris-Sarabia, I., and Giró-i-Nieto, X., “Photo Clustering of Social Events by
Extending PhotoTOC to a Rich Context”, in ICMR 2014 Workshop on Social Events in Web Multimedia
(SEWM), Glasgow, Scotland, 2014.
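Grouping photos into events from metadata alone can be sketched with timestamps only: sort the photos by capture time and start a new event whenever the gap to the previous photo is too long. This is a simplified illustration with a fixed, arbitrarily chosen two-hour threshold; it is not the method of the paper cited above, and `group_into_events` is a hypothetical name.

```python
from datetime import datetime, timedelta

def group_into_events(timestamps, max_gap=timedelta(hours=2)):
    """Split photo timestamps into events wherever the gap exceeds max_gap."""
    events = []
    for t in sorted(timestamps):
        if events and t - events[-1][-1] <= max_gap:
            events[-1].append(t)      # close in time: same event
        else:
            events.append([t])        # long gap: start a new event
    return events

# Hypothetical capture times read from the photos' metadata.
photos = [
    datetime(2016, 4, 2, 10, 0),
    datetime(2016, 4, 2, 10, 20),
    datetime(2016, 4, 2, 18, 5),
    datetime(2016, 4, 3, 9, 30),
]
for event in group_into_events(photos):
    print(len(event), "photo(s) starting at", event[0])
```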
Outline
1. Motivation
2. Architecture
3. Metadata
4. Manual vs Automatic Annotation
Manual vs Automatic Annotation
Task: Write down tags for this photo on a piece of paper.
Manual vs Automatic Annotation
Task: How do you think a computer sees this photo?
Manual vs Automatic Annotation
The semantic gap is the difference between a high level and a low level description of a document:
Humans are very good at abstraction using natural language (words)...
...while computers are really good at analysing perceptual features.
Manual vs Automatic Annotation
Annotation is the process of generating high level (semantic) metadata.
How to generate semantic metadata?
• Manual Annotation
• Automatic Annotation
Manual Annotation
Explicit manual annotation
• E.g. Hashtags on Twitter.
• E.g. Hashtags on Instagram.
• E.g. Hashtags on Flickr.
• E.g. Friends tagging on Facebook.
• E.g. Dedicated forms to collect structured metadata.
+info: http://www.youtube.com/terrassatsc
Manual Annotation
Problem: Manual annotation is tedious.
Manual Annotation
Annotation can be split and assigned to the crowd as...
+info: http://www.crowdmm.org
• ...micro-tasks for online workers.
+info: https://www.mturk.com/mturk/, http://microworkers.com/, http://pallas-ludens.com/
• ...micro-games for online players: Games With A Purpose (GWAP).
Ref: Luis von Ahn and Laura Dabbish, “Labeling images with a computer game” (SIGCHI 2004)
Ref: Amaia Salvador et al., “Crowdsourced Object Segmentation with a Game” (CrowdMM 2013)
Manual vs Automatic Annotation
Annotation is the process of generating high level (semantic) metadata.
How to generate semantic metadata?
• Latent Annotation
• Automatic Annotation
Latent Annotation
Text contained in the same document where the multimedia content is presented.
Latent Annotation
Text associated with a publication sharing the multimedia content.
[Figure labels: Image, Text, Video]
Latent Annotation
Comments about the multimedia item.
Manual vs Automatic Annotation
• Problem 1: Most multimedia content has no other associated text.
Manual vs Automatic Annotation
• Problem 2: Manual annotation may be too expensive for large amounts of data.
Jean Le Tavernier: “Jean Miélot in his scriptorium” (1456)
Manual vs Automatic Annotation
• Solution: Automating the annotation process.
Johannes Gutenberg (1398-1468), inventor of mechanical movable type printing
“Printer from the 15th century”, work of Jost Amman (1539-1591)
Manual vs Automatic Annotation
Challenge: The semantic gap is the difference between a high level and a low level description of a document:
Humans are very good at abstraction using natural language...
...while computers are really good at analysing perceptual features.
Manual vs Automatic Annotation
Annotation is the process of generating high level (semantic) metadata.
How to generate semantic metadata?
• Manual Annotation
• Automatic Annotation
Automatic Annotation
Artificial intelligence algorithms can learn to perform this task.
[Diagram labels: Manual annotations, Trainer, Model, New image, Detector, Automatic annotation, Anchor]
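The trainer/detector scheme can be sketched in a few lines: a trainer turns manually annotated examples into a model, and a detector uses that model to annotate a new item. The sketch assumes each image is already summarised by a small feature vector (here its mean R, G, B) and uses a nearest-mean rule as a stand-in for the learning algorithm; the function names, data and labels are hypothetical.

```python
from collections import defaultdict

def train(annotated):
    """Trainer: build a model (one mean feature vector per label)."""
    sums, counts = {}, defaultdict(int)
    for features, label in annotated:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def detect(model, features):
    """Detector: annotate a new item with the label of the closest mean."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: sq_dist(model[label], features))

# Hypothetical manual annotations: (mean R, G, B) of an image and its tag.
manual_annotations = [
    ((10, 20, 200), "sea"), ((30, 40, 220), "sea"),
    ((20, 180, 30), "forest"), ((40, 200, 50), "forest"),
]
model = train(manual_annotations)
print(detect(model, (25, 35, 210)))   # annotation predicted for a new image
```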
Automatic Annotation: Categories
Li Fei-Fei, “How we’re teaching computers to understand pictures”, TEDTalks 2014.
Automatic Annotation
Source: Horst Eidenberger, “Handbook of Multimedia Information Retrieval” (2012)
Automatic Annotation: Features
Source: Horst Eidenberger, “Handbook of Multimedia Information Retrieval” (2012)
Automatic Annotation: Features
Automatic Annotation: Features
Descriptors for text documents: Word histogram
Source: C. Yu, D. Ballard, “A unified model of early word learning” (2004)
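A word histogram simply counts how often each word occurs in a document. A minimal sketch, assuming a very crude tokeniser (lowercase letters and apostrophes only); the function name `word_histogram` and the sample sentence are hypothetical.

```python
import re
from collections import Counter

def word_histogram(text):
    """Count how many times each lowercase word occurs in the text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

hist = word_histogram("The cat sat on the mat. The mat was red.")
print(hist.most_common(3))
```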
Automatic Annotation: Features
Descriptors for text documents: Term Frequency-Inverse Document Frequency (TF-IDF)
E.g. the term “the” is not useful for distinguishing one type of document from another.
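The effect on a term like “the” can be checked with a small sketch. It assumes one common weighting among several in use: term frequency is the count divided by the document length, and inverse document frequency is log(N / df), where N is the number of documents and df the number of documents containing the term. The function name `tf_idf` and the two toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    weights = []
    for tokens in tokenised:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = ["the goal of the match", "the vote in the parliament"]
for w in tf_idf(docs):
    print(max(w, key=w.get), w["the"])
```

Because “the” appears in every document, its inverse document frequency is log(1) = 0 and its weight vanishes, while terms specific to one document keep a positive weight.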
Automatic Annotation: Features
Descriptors for audio documents: Spectrogram
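A spectrogram describes how the frequency content of a signal changes over time: the signal is cut into short overlapping frames and the magnitude of the Fourier transform of each frame is kept. The sketch below uses a naive DFT in pure Python for readability (a real system would use an FFT library); the frame length, hop size, Hann window and test tone are assumptions chosen for the example.

```python
import cmath
import math

def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude of the DFT of successive overlapping frames.

    Returns a list of frames; each frame holds frame_len // 2 + 1 magnitudes.
    """
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        # Hann window to reduce spectral leakage at the frame borders.
        windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1)))
                    for n, x in enumerate(frame)]
        mags = []
        for k in range(frame_len // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(windowed))
            mags.append(abs(coeff))
        frames.append(mags)
    return frames

# Hypothetical test tone: 1000 Hz sine sampled at 8000 Hz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print("peak frequency (Hz):", peak_bin * sample_rate / 64)
```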
Automatic Annotation: Features
Descriptors for audio documents: Mel-Frequency Cepstral Coefficients (MFCC)
Automatic Annotation: Features
Descriptors for image documents: Textures around interest points (SIFT, HOG, SURF...)
Source: Sivic & Zisserman, “Video Google” (2003)
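The common idea behind these descriptors is to summarise the texture around a point with a histogram of gradient orientations. The sketch below is a heavily simplified illustration of that idea only, not an implementation of SIFT, HOG or SURF; the function name `orientation_histogram`, the patch size and the toy image are assumptions.

```python
import math

def orientation_histogram(image, cy, cx, radius=2, bins=8):
    """Histogram of gradient orientations in a square patch around (cy, cx).

    Each pixel votes for its gradient direction, weighted by gradient magnitude.
    """
    hist = [0.0] * bins
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            # Central differences; the patch must stay one pixel inside the image.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    return hist

# Hypothetical 8x8 greyscale image: dark left half, bright right half.
image = [[0] * 4 + [255] * 4 for _ in range(8)]
print(orientation_histogram(image, cy=4, cx=4))
```

For this vertical edge all the gradient energy falls in the first bin (gradients pointing along the positive x axis), which is what makes the histogram descriptive of the local texture.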
Automatic Annotation: Features
Deep learning
Instead of designing hand-crafted features (SIFT, SURF...) and then learning a classifier...
...features are learned from annotated data.
Slide credit: Marc’Aurelio Ranzato (Google)
Automatic Annotation
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning
applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
Machine learning
(Deep learning)
Automatic Annotation: Categories
Source: Horst Eidenberger, “Handbook of Multimedia Information Retrieval” (2012)
Automatic Annotation: Categories
• An ontology is a set of related semantic concepts.
• Classification is performed with respect to one or more of them.
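A toy ontology can be sketched as a set of concepts linked by “is a” relations, so that a label such as “dalmatian” can be checked against a more general category. The concepts and the helper names `ancestors` and `is_a` are hypothetical and chosen only for illustration.

```python
# Hypothetical mini-ontology: each concept points to its parent ("is a").
IS_A = {
    "dalmatian": "dog",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

def ancestors(concept):
    """Return the chain of more general concepts, closest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain

def is_a(concept, category):
    """True if the concept equals the category or descends from it."""
    return concept == category or category in ancestors(concept)

print(ancestors("dalmatian"))
print(is_a("dalmatian", "mammal"), is_a("sparrow", "mammal"))
```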
Automatic Annotation: Categories
• Example (text): WordNet (http://wordnet.princeton.edu/)
Automatic Annotation: Classes
Source: Andrej Karpathy, “What I learned from competing against a computer on ImageNet” (2014)
Automatic Annotation: Categories
Source: Andrej Karpathy, “What I learned from competing against a computer on ImageNet” (2014)
Automatic Annotation: Classifier
Source: Horst Eidenberger, “Handbook of Multimedia Information Retrieval” (2012)
Automatic Annotation: Classifier
Classification is the process of assigning a label to an observation based on its features.
Features must allow discrimination between samples from each category.
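Why features must be discriminative can be illustrated with a nearest-neighbour classifier evaluated on two candidate features: one that separates the categories and one that mixes them. The data, the category names and the function names below are hypothetical, and nearest neighbour is just one simple choice of classifier.

```python
def nearest_neighbour(train, x):
    """Label of the training sample whose feature value is closest to x."""
    return min(train, key=lambda s: abs(s[0] - x))[1]

def leave_one_out_accuracy(samples):
    """Classify each sample using all the others and count the hits."""
    hits = 0
    for i, (x, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += nearest_neighbour(rest, x) == label
    return hits / len(samples)

# Hypothetical data: (feature value, category) for two candidate features.
good_feature = [(0.1, "indoor"), (0.2, "indoor"), (0.3, "indoor"),
                (0.8, "outdoor"), (0.9, "outdoor"), (1.0, "outdoor")]
poor_feature = [(0.1, "indoor"), (0.5, "indoor"), (0.9, "indoor"),
                (0.2, "outdoor"), (0.6, "outdoor"), (1.0, "outdoor")]

print(leave_one_out_accuracy(good_feature))
print(leave_one_out_accuracy(poor_feature))
```

With the same classifier, the feature whose values overlap across categories gives a much lower accuracy than the feature that keeps them apart.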
Automatic Annotation: Classifier
Example: Visual detector of the camera viewpoint.
Automatic Annotation
Source: Andrej Karpathy, “What I learned from competing against a computer on ImageNet” (2014)
Automatic Annotation
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional
neural networks." In Advances in neural information processing systems, pp. 1097-1105. 2012
Automatic Annotation
Object detection
Girshick, Ross, Jeff Donahue, Trevor Darrell, and Jitendra Malik. "Region-based convolutional networks for
accurate object detection and segmentation." Pattern Analysis and Machine Intelligence, IEEE Transactions
on 38, no. 1 (2016): 142-158.
Automatic Annotation
Object segmentation
Source: PASCAL Visual Object Classes (VOC) Challenge
Automatic Annotation
Face detection and recognition
Farfade, Sachin Sudhakar, Mohammad Saberian, and Li-Jia Li. "Multi-view Face
Detection Using Deep Convolutional Neural Networks." ICMR (2015).
Automatic Annotation
Activity Recognition
Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning
spatiotemporal features with 3D convolutional networks." In Proceedings of the IEEE International
Conference on Computer Vision, pp. 4489-4497. 2015
Automatic Annotation
Learn more with Nat & Lo 20% Google Project:
Outline
1. Motivation
2. Architecture
3. Metadata
4. Manual vs Automatic Annotation
Bonus: Artificial intelligence
Nexi, from MIT Media Lab (Photo: Spencer Lowel)
Only learn to see?
Internet of things - IoT
Big data
Only learn to see?
Personal data
Big data
Only learn to see?
Visual saliency prediction
LSUN Challenge
J. Pan, McGuinness, K., Sayrol, E., O'Connor, N., and Giró-i-Nieto, X., “Shallow and
Deep Convolutional Networks for Saliency Prediction”, in IEEE Conference on
Computer Vision and Pattern Recognition, CVPR, In Press.
Only learn to see?
Robust motion
Atlas, from Boston Dynamics
Only learn to see?
Games (reinforcement learning)
(Google) DeepMind
Only learn to see?
Autonomous Driving
Google Self-driving car
Only learn to see?
Visual arts
Google Research, “Going deeper into neural networks” - DeepDream (2015)
Only learn to see?
http://turing.deepart.io/
Visual arts
Only learn to see?
Music composition
Manuel Araoz, “Training a Recurrent Neural Network to Compose Music” (2016).
Only learn to see?
Poetry
Ross Goodwin, Neuralsnap (2016).
Only learn to see?
“Scripts” (!?)
Darknet
JON
He leaned close and onions, barefoot from his shoulder. "I am not a purple
girl," he said as he stood over him. "The sight of you sell your father with you a
little choice."
"I say to swear up his sea or a boy of stone and heart, down," Lord Tywin
said. "I love your word or her to me."
Only learn to see?
Public Health
Announcement of Google DeepMind Health (24/02/2016)
Only learn to see?
Nacho Hernandez, “Why artificial intelligence will democratize
healthcare” (TEDx Talk, 2014)
Public health
Only learn to see?
Nancy Lublin, “The heartbreaking text that inspired a crisis
helpline” (TED Talk 2015)
Mental health
Only learn to see?
Affective computing
Rana el Kaliouby, “This app knows how you feel, from the look on your face”, TEDTalks 2015.
Only learn to see?
Affective computing
V. Campos, Salvador, A., Jou, B., and Giró-i-Nieto, X., “Diving Deep into
Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment
Prediction”, in 1st International Workshop on Affect and Sentiment in
Multimedia, Brisbane, Australia, 2015.
Visual maps of positive (green) or negative (red) sentiments:
Only learn to see?
Affective computing
Nexi Project, from MIT Media Lab (Photos: Spencer Lowel)
Only learn to see?
Psychological support and counseling?
Only learn to see?
“Google’s chairman (Eric Schmidt) thinks artificial intelligence will let scientists solve some of the world’s "hard problems," like population growth, climate change, human development, and education.” (Bloomberg Business, 11/01/2016)
[+info @ MIT Technology Review]
Only learn to see?
The New York Times: “The Race Is On to Control Artificial
Intelligence, and Tech’s Future” (25/03/2016)
Only learn to see?
The Economist, “Million-dollar babies” (02/04/2016)
Only learn to see?
Jeremy Howard, “The wonderful and terrifying implications of
computers that can learn”, TEDTalks 2014.
Only learn to see?
Stephen Hawking, “Artificial intelligence could spell the end of the human race.” (2014)
Only learn to see?
Elon Musk (Tesla), one of OpenAI's promoters
Only learn to see?
From Industry 1.0 to Industry 4.0
Source: DFKI (2011)
Thanks a lot!
Slides available at:
https://imatge.upc.edu/web/people/xavier-giro
@DocXavi
/ProfessorXavi
