SlideShare a Scribd company logo
1 of 99
Download to read offline
@DocXavi
Module 4 - Lecture 7
Video Analysis with
RNNs
2 February 2017
Xavier Giró-i-Nieto
[http://pagines.uab.cat/mcv/]
Acknowledgments
2
Alberto Montes
Santi Pascual Amaia Salvador
More details:
D2L2, “Recurrent Neural Networks I”
D2L3, “Recurrent Neural Networks II”
Linked slides
Outline
1. Recurrent Neural Networks
2. Activity Recognition
3. Object Tracking
4. Speech and Video
5. Learn more
4
5
Previously… A Perceptron
J. Alammar, “A visual and interactive guide to the Basics of Neural Networks” (2016)
6
Previously… A Perceptron
J. Alammar, “A visual and interactive guide to the Basics of Neural Networks” (2016)
F. Van Veen, “The Neural Network Zoo” (2016)
7
Two Perceptrons + Softmax classifier
J. Alammar, “A visual and interactive guide to the Basics of Neural Networks” (2016)
8
Three perceptrons + Softmax classifier
TensorFlow, “MNIST for ML beginners”
9
TensorFlow, “MNIST for ML beginners”
Three perceptrons + Softmax classifier
10
TensorFlow, “MNIST for ML beginners”
Three perceptrons + Softmax classifier
11
Neural Network = Multi Layer Perceptron
2 layers of perceptrons
F. Van Veen, “The Neural Network Zoo” (2016)
12
Deep Feed Forward (DFF)
F. Van Veen, “The Neural Network Zoo” (2016)
13Slide credit: Santi Pascual
If we have a sequence of samples...
predict sample x[t+1] knowing previous values {x[t], x[t-1], x[t-2], …, x[t-τ]}
Deep Feed Forward (DFF)
14Slide credit: Santi Pascual
Deep Feed Forward (DFF)
Feed Forward approach:
● static window of size L
● slide the window time-step wise
...
...
...
x[t+1]
x[t-L], …, x[t-1], x[t]
x[t+1]
L
15Slide credit: Santi Pascual
Deep Feed Forward (DFF)
Feed Forward approach:
● static window of size L
● slide the window time-step wise
...
...
...
x[t+2]
x[t-L+1], …, x[t], x[t+1]
...
...
...
x[t+1]
x[t-L], …, x[t-1], x[t]
x[t+2]
L
16Slide credit: Santi Pascual
Deep Feed Forward (DFF)
16
Feed Forward approach:
● static window of size L
● slide the window time-step wise
x[t+3]
L
...
...
...
x[t+3]
x[t-L+2], …, x[t+1], x[t+2]
...
...
...
x[t+2]
x[t-L+1], …, x[t], x[t+1]
...
...
...
x[t+1]
x[t-L], …, x[t-1], x[t]
Deep Feed Forward (DFF)
...
...
...
x1, x2, …, xL
Problems for the feed forward + static window approach:
● What’s the matter increasing L? → Fast growth of num of parameters!
● Decisions are independent between time-steps!
○ The network doesn’t care about what happened at previous time-step, only present window
matters → doesn’t look good
● Cumbersome padding when there are not enough samples to fill L size
○ Can’t work with variable sequence lengths
x1, x2, …, xL, …, x2L
...
...
x1, x2, …, xL, …, x2L, …, x3L
...
...
... ...
Slide credit: Santi Pascual
18Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”
The hidden layers and the
output depend from previous
states of the hidden layers
Recurrent Neural Network (RNN)
19
Recurrent Neural Network (RNN)
Solution: Build specific connections capturing the temporal evolution → Shared weights
in time
● Give volatile memory to the network
Feed-Forward
Fully-Connected
20
Recurrent Neural Network (RNN)
Solution: Build specific connections capturing the temporal evolution → Shared weights
in time
● Give volatile memory to the network
Feed-Forward
Fully-ConnectedFully connected in
time: recurrent matrix
21
time
time
Rotation
90o
Front View Side View
Rotation
90o
Recurrent Neural Network (RNN)
22
Recurrent Neural Network (RNN)
Hence we have two data flows: Forward in space + time propagation: 2 projections per
layer activation.
BEWARE: We have extra depth now! Every time-step is an extra level of depth (as a
deeper stack of layers in a feed-forward fashion!)
Slide credit: Santi Pascual
23
Recurrent Neural Network (RNN)
Slide credit: Santi Pascual
Hence we have two data flows: Forward in space + time propagation: 2 projections per
layer activation
○ Last time-step includes the context of our decisions recursively
24
Recurrent Neural Network (RNN)
Slide credit: Santi Pascual
Hence we have two data flows: Forward in space + time propagation: 2 projections per
layer activation
○ Last time-step includes the context of our decisions recursively
25
Recurrent Neural Network (RNN)
Slide credit: Santi Pascual
Hence we have two data flows: Forward in space + time propagation: 2 projections per
layer activation
○ Last time-step includes the context of our decisions recursively
26
Recurrent Neural Network (RNN)
Slide credit: Santi Pascual
Hence we have two data flows: Forward in space + time propagation: 2 projections per
layer activation
○ Last time-step includes the context of our decisions recursively
27
Recurrent Neural Network (RNN)
Slide credit: Santi Pascual
Back Propagation Through Time (BPTT): The training method has to take into account the
time operations → a cost function E is defined to train our RNN, and in this case the total
error at the output of the network is the sum of the errors at each time-step:
T: max amount of time-steps to do back-prop. In
Keras this is specified when defining the “input
shape” to the RNN layer, by means of:
(batch size, sequence length (T), input_dim)
28
Recurrent Neural Network (RNN)
Slide credit: Santi Pascual
Main problems:
● Long-term memory (remembering quite far time-steps) vanishes quickly because of
the recursive operation with U
● During training gradients explode/vanish easily because of depth-in-time →
Exploding/Vanishing gradients!
29
Bidirectional RNN (BRNN)
Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks”
Must learn weights w2,
w3, w4 & w5; in addition to
w1 & w6.
30
Long Short-Term Memory (LSTM)
Jürgen Schmidhuber @
NIPS 2016 Barcelona
31
Long Short-Term Memory (LSTM)
Jürgen Schmidhuber @
NIPS 2016 Barcelona
32
Long Short-Term Memory (LSTM)
Jürgen Schmidhuber @
NIPS 2016 Barcelona
33
Long Short-Term Memory (LSTM)
Jürgen Schmidhuber @
NIPS 2016 Barcelona
34
Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9, no.
8 (1997): 1735-1780.
Long Short-Term Memory (LSTM)
35
Long Short-Term Memory (LSTM)
Solutions:
1. Change the way in which past information is kept → create the notion of cell state, a
memory unit that keeps long-term information in a safer way by protecting it
from recursive operations
2. Make every RNN unit able to decide whether the current time-step information
matters or not, to accept or discard (optimized reading mechanism)
3. Make every RNN unit able to forget whatever may not be useful anymore by
clearing that info from the cell state (optimized clearing mechanism)
4. Make every RNN unit able to output the decisions whenever it is ready to do so
(optimized output mechanism)
Gating method
36
Long Short-Term Memory (LSTM)
Solutions:
1. Change the way in which past information is kept → create the notion of cell state, a
memory unit that keeps long-term information in a safer way by protecting it
from recursive operations
2. Make every RNN unit able to decide whether the current time-step information
matters or not, to accept or discard (optimized reading mechanism)
3. Make every RNN unit able to forget whatever may not be useful anymore by
clearing that info from the cell state (optimized clearing mechanism)
4. Make every RNN unit able to output the decisions whenever it is ready to do so
(optimized output mechanism)
Gating method
37
Long Short-Term Memory (LSTM)
An LSTM cell is defined by two groups of neurons plus the cell state (memory unit):
1. Gates
2. Activation units
3. Cell state
Computation
Flow
Slide credit: Santi Pascual
38
Long Short-Term Memory (LSTM)
Three gates are governed by sigmoid units (btw [0,1]) define
the control of in & out information..
Figure: Cristopher Olah, “Understanding LSTM Networks” (2015)
slide credit: Xavi Giro
39
Long Short-Term Memory (LSTM)
Forget Gate:
Concatenate
Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
40
Long Short-Term Memory (LSTM)
Input Gate Layer
New contribution to cell state
Classic neuron
Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
41
Long Short-Term Memory (LSTM)
Update Cell State (memory):
Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
42
Long Short-Term Memory (LSTM)
Output Gate Layer
Output to next layer
Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
43
Long Short-Term Memory (LSTM)
Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
44
Gated Recurrent Unit (GRU)
Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger
Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for
statistical machine translation." AMNLP 2014.
Similar performance as LSTM with less computation.
slide credit: Xavi Giro
45
Recurrent Neural Network (RNN)
More details:
D2L2, “Recurrent Neural Networks I”
D2L3, “Recurrent Neural Networks II”
F. Van Veen, “The Neural Network Zoo” (2016)
Outline
1. Recurrent Neural Networks
2. Activity Recognition
3. Object Tracking
4. Speech and Video
5. Learn more
46
47
Yue-Hei Ng, Joe, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, and
George Toderici. "Beyond short snippets: Deep networks for video classification." CVPR 2015
Recognition: Image & Optical Flow CNN + LSTM
BSc
thesis
http://activity-net.org/ 48
Temporal Activity Detection
BSc
thesis
Videos
Activity Classification
Longboarding
49
Temporal Activity Detection
BSc
thesis
Videos
Activity Temporal Localization
Longboarding
50
Temporal Activity Detection
BSc
thesis
Neural Network
Activity
51
Temporal Activity Detection
BSc
thesis
Activity
CNN RNN+
52
Temporal Activity Detection
Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning
spatiotemporal features with 3D convolutional networks." CVPR 2015
3D Convolutions over sets of 16 frames...
53
Temporal Activity Detection
BSc
thesis
54
Temporal Activity Detection
BSc
thesis
mAP = 0.5938 mAP = 0.5492 mAP = 0.5635
Deeper networks present overfitting
55
Temporal Activity Detection
BSc
thesis
56
Temporal Activity Detection
BSc
thesis
57
Temporal Activity Detection
BSc
thesis
58
Temporal Activity Detection
BSc
thesis
Ground Truth:
Playing water polo
Prediction:
0.765 Playing water polo
0.202 Swimming
0.007 Springboard diving
59
Temporal Activity Detection
BSc
thesis
Ground Truth:
Hopscotch
Prediction:
0.848 Running a marathon
0.023 Triple jump
0.022 Javelin throw
60
Temporal Activity Detection
BSc
thesis
61
Temporal Activity Detection
A. Montes, Salvador, A., Pascual-deLaPuente, S., and Giró-i-Nieto, X., “Temporal Activity Detection in Untrimmed Videos with
Recurrent Neural Networks”, in 1st NIPS Workshop on Large Scale Computer Vision Systems 2016 (best poster award)
62
A. Montes, Salvador, A., Pascual-deLaPuente, S., and Giró-i-Nieto, X., “Temporal Activity Detection in Untrimmed Videos with
Recurrent Neural Networks”, in 1st NIPS Workshop on Large Scale Computer Vision Systems 2016 (best poster award)
Temporal Activity Detection
Outline
1. Recurrent Neural Networks
2. Activity Recognition
3. Object Tracking
4. Speech and Video
5. Learn more
63
Object tracking: DeepTracking
64
P. Ondruska and I. Posner, “Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks,” in The Thirtieth
AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona USA, 2016. [code]
Object tracking: DeepTracking
65
P. Ondruska and I. Posner, “Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks,” in The Thirtieth
AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona USA, 2016. [code]
Object tracking: DeepTracking
66
Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." In Advances in neural information
processing systems, pp. 809-817. 2013 [Project page with code]
Object tracking: ROLO
67
Ning, Guanghan, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang. "Spatially Supervised Recurrent Convolutional Neural
Networks for Visual Object Tracking." arXiv preprint arXiv:1607.05781 (2016)
Object tracking: ROLO
68
Ning, Guanghan, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang. "Spatially Supervised Recurrent Convolutional Neural
Networks for Visual Object Tracking." arXiv preprint arXiv:1607.05781 (2016)
Outline
1. Recurrent Neural Networks
2. Activity Recognition
3. Object Tracking
4. Speech and Video
5. Learn more
69
70
Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017
Speech and Video: Vid2Speech
CNN
(VGG)
Frame from a
slient video
Audio feature
71Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017
Speech and Video: Vid2Speech
No
end-to-end
72
Speech and Video: Vid2Speech
Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017
73
Assael, Yannis M., Brendan Shillingford, Shimon Whiteson, and Nando de Freitas. "LipNet: Sentence-level
Lipreading." arXiv preprint arXiv:1611.01599 (2016).
Speech and Video: LipNet
74
Assael, Yannis M., Brendan Shillingford, Shimon Whiteson, and Nando de Freitas. "LipNet: Sentence-level
Lipreading." arXiv preprint arXiv:1611.01599 (2016).
Speech and Video: LipNet
75
Chung, Joon Son, Andrew Senior, Oriol Vinyals, and Andrew Zisserman. "Lip reading sentences in the
wild." arXiv preprint arXiv:1611.05358 (2016).
Speech and Video: Watch, Listen, Attend & Spell
76
Chung, Joon Son, Andrew Senior, Oriol Vinyals, and Andrew Zisserman. "Lip reading sentences in the
wild." arXiv preprint arXiv:1611.05358 (2016).
Speech and Video: Watch, Listen, Attend & Spell
Outline
1. Recurrent Neural Networks
2. Activity Recognition
3. Object Tracking
4. Speech and Video
5. Learn more
77
Deep Learning: Software
Keras http://keras.io/
Tensor Flow https://www.tensorflow.org/
Caffe http://caffe.berkeleyvision.org/
Torch (Overfeat) http://torch.ch/
78
Theano http://deeplearning.net/software/theano/
MatconvNet (VLFeat) http://www.vlfeat.org/matconvnet/
CNTK (Mcrosoft) http://www.cntk.ai/
MxNet: https://github.com/dmlc/mxnet
Learn more
79
Jordi Pont-Tuset (ETH Zurich), “One-shot Video
Segmentation” on Monday 13 February at 12pm at IRI UPC
Learn more
80
[course site]
[Summer 2016] [Summer 2017]
Stanford course:
CS231n:
Convolutional Neural
Networks for Visual
Recognition
81
Learn more
82
Learn more
http://cs224n.stanford.edu/
Online course:
Deep Learning
Taking machine
learning to the next
level
83
Learn more
ReadCV seminar
Friendly reviews of SoA papers
Spring 2016:
Tuesdays at 11am
ConvNets: Learn more
84
Summer course
Deep Learning for
Computer Vision
June 21-27, 3-7pm
Deep Learning: Learn more
85
● Deep learning methos for vision (CVPR 2012)
● Tutorial on deep learning for vision (CVPR 2014)
● Kyunghyun Cho, “Deep Learning: Past, Present & Future”
ConvNets: Learn more
86
ConvNets: Learn more
87
“Machine learning” sub-Reddit.
ConvNets: Learn more
88
ConvNets: Learn more
89
Check profile requirements for Summer internship (disclaimer: offered to Phd students by default)
Company Avg Salary / hour Avg Salary / month
Yahoo $43 ($43x160=$6,880)
Apple $37 ($37x160=$5,920)
Google $29.54-$31.32 $7,151
Facebook $22.92 $6,150-$7,378
Microsoft $22.63 $6,506-$7,171
Source: Glassdoor.com (internships in California. No stipends included)
Learn more
90
Video: Cristian Canton’s talk “From Catalonia to America: notes on how to achieve a successful post-Phd
career ”@ ACMCV 2015 & UPC
Li Fei-Fei, “How we’re teaching
computers to understand pictures”
TEDTalks 2014.
Learn more
91
Jeremy Howard, “The wonderful
and terrifying implications of
computers that can learn”,
TEDTalks 2014.
Learn more
92
Learn more
93
● Neil Lawrence, OpenAI won’t benefit humanity without open data sharing
(The Guardian, 14/12/2015)
Sports: Do you know them ?
94
Deep Learning: Do you know them ?
95
Antonio Torralba, MIT
(former UPC)
...and MANY MORE I am missing in the page (apologies).
Oriol Vinyals, Google
(former UPC)
Jose M Álvarez, NICTA
(former URL & UAB)
Joan Bruna, Berkeley
(former UPC)
96
Where you are studying
VisioCat dinner
@ CVPR 2015
97
Where you are studying
VisioCat dinner
@ CVPR 2016
Is Computer
Vision solved ?
ConvNets: Discussion
98
Thank you !
Slides available on and .
https://imatge.upc.edu/web/people/xavier-giro
http://bitsearch.blogspot.com
https://twitter.com/DocXavi
https://www.facebook.com/ProfessorXavi
xavier.giro@upc.edu
99

More Related Content

What's hot

Neural Networks and Deep Learning: An Intro
Neural Networks and Deep Learning: An IntroNeural Networks and Deep Learning: An Intro
Neural Networks and Deep Learning: An IntroFariz Darari
 
CNN Attention Networks
CNN Attention NetworksCNN Attention Networks
CNN Attention NetworksTaeoh Kim
 
Long Short Term Memory
Long Short Term MemoryLong Short Term Memory
Long Short Term MemoryYan Xu
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function홍배 김
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkYan Xu
 
Physical and Logical Clocks
Physical and Logical ClocksPhysical and Logical Clocks
Physical and Logical ClocksDilum Bandara
 
Clock driven scheduling
Clock driven schedulingClock driven scheduling
Clock driven schedulingKamal Acharya
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Raja Chiky
 
Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?
Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?
Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?Thomas Hjelde Thoresen
 
Computational Complexity: Introduction-Turing Machines-Undecidability
Computational Complexity: Introduction-Turing Machines-UndecidabilityComputational Complexity: Introduction-Turing Machines-Undecidability
Computational Complexity: Introduction-Turing Machines-UndecidabilityAntonis Antonopoulos
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Sujit Pal
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesFellowship at Vodafone FutureLab
 
RNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataRNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataYao-Chieh Hu
 
Attention in Deep Learning
Attention in Deep LearningAttention in Deep Learning
Attention in Deep Learning健程 杨
 

What's hot (20)

Neural Networks and Deep Learning: An Intro
Neural Networks and Deep Learning: An IntroNeural Networks and Deep Learning: An Intro
Neural Networks and Deep Learning: An Intro
 
LSTM Basics
LSTM BasicsLSTM Basics
LSTM Basics
 
CNN Attention Networks
CNN Attention NetworksCNN Attention Networks
CNN Attention Networks
 
Long Short Term Memory
Long Short Term MemoryLong Short Term Memory
Long Short Term Memory
 
Rnn & Lstm
Rnn & LstmRnn & Lstm
Rnn & Lstm
 
AlexNet
AlexNetAlexNet
AlexNet
 
The world of loss function
The world of loss functionThe world of loss function
The world of loss function
 
Introduction to Recurrent Neural Network
Introduction to Recurrent Neural NetworkIntroduction to Recurrent Neural Network
Introduction to Recurrent Neural Network
 
Rnn and lstm
Rnn and lstmRnn and lstm
Rnn and lstm
 
Physical and Logical Clocks
Physical and Logical ClocksPhysical and Logical Clocks
Physical and Logical Clocks
 
LSTM
LSTMLSTM
LSTM
 
Clock driven scheduling
Clock driven schedulingClock driven scheduling
Clock driven scheduling
 
Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014Introduction to Data streaming - 05/12/2014
Introduction to Data streaming - 05/12/2014
 
Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?
Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?
Temporal Convolutional Networks - Dethroning RNN's for sequence modelling?
 
Primality
PrimalityPrimality
Primality
 
Computational Complexity: Introduction-Turing Machines-Undecidability
Computational Complexity: Introduction-Turing Machines-UndecidabilityComputational Complexity: Introduction-Turing Machines-Undecidability
Computational Complexity: Introduction-Turing Machines-Undecidability
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
RNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential DataRNN & LSTM: Neural Network for Sequential Data
RNN & LSTM: Neural Network for Sequential Data
 
Attention in Deep Learning
Attention in Deep LearningAttention in Deep Learning
Attention in Deep Learning
 

Similar to Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelona 2017)

Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)
Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)
Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Universitat Politècnica de Catalunya
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)Universitat Politècnica de Catalunya
 
Lecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural NetworksLecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural NetworksSang Jun Lee
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applicationsSungjoon Choi
 
14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.pptManiMaran230751
 
240219_RNN, LSTM code.pptxdddddddddddddddd
240219_RNN, LSTM code.pptxdddddddddddddddd240219_RNN, LSTM code.pptxdddddddddddddddd
240219_RNN, LSTM code.pptxddddddddddddddddssuser2624f71
 
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYCTed Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYCMLconf
 
RNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesRNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesAbhijitVenkatesh1
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learningJunaid Bhat
 
Deep Learning: Application & Opportunity
Deep Learning: Application & OpportunityDeep Learning: Application & Opportunity
Deep Learning: Application & OpportunityiTrain
 
Future semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTMFuture semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTMKyuri Kim
 
Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...
Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...
Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...Universitat Politècnica de Catalunya
 
recurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxrecurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxSagarTekwani4
 
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Universitat Politècnica de Catalunya
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural NetworksDean Wyatte
 
  Brain-Inspired Computation based on Spiking Neural Networks ...
 Brain-Inspired Computation based on Spiking Neural Networks               ... Brain-Inspired Computation based on Spiking Neural Networks               ...
  Brain-Inspired Computation based on Spiking Neural Networks ...Jorge Pires
 

Similar to Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelona 2017) (20)

Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)
Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)
Recurrent Neural Networks (D2L2 2017 UPC Deep Learning for Computer Vision)
 
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
Recurrent Neural Networks (D2L8 Insight@DCU Machine Learning Workshop 2017)
 
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
Recurrent Neural Networks I (D2L2 Deep Learning for Speech and Language UPC 2...
 
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
Recurrent Neural Networks RNN - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
Deep Learning for Computer Vision: Recurrent Neural Networks (UPC 2016)
 
Lecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural NetworksLecture 7: Recurrent Neural Networks
Lecture 7: Recurrent Neural Networks
 
RNN and its applications
RNN and its applicationsRNN and its applications
RNN and its applications
 
14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt14889574 dl ml RNN Deeplearning MMMm.ppt
14889574 dl ml RNN Deeplearning MMMm.ppt
 
240219_RNN, LSTM code.pptxdddddddddddddddd
240219_RNN, LSTM code.pptxdddddddddddddddd240219_RNN, LSTM code.pptxdddddddddddddddd
240219_RNN, LSTM code.pptxdddddddddddddddd
 
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYCTed Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
Ted Willke, Senior Principal Engineer, Intel Labs at MLconf NYC
 
RNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantagesRNN and LSTM model description and working advantages and disadvantages
RNN and LSTM model description and working advantages and disadvantages
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Deep Learning: Application & Opportunity
Deep Learning: Application & OpportunityDeep Learning: Application & Opportunity
Deep Learning: Application & Opportunity
 
Future semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTMFuture semantic segmentation with convolutional LSTM
Future semantic segmentation with convolutional LSTM
 
Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...
Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...
Recurrent Neural Networks (DLAI D7L1 2017 UPC Deep Learning for Artificial In...
 
recurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptxrecurrent_neural_networks_april_2020.pptx
recurrent_neural_networks_april_2020.pptx
 
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
Recurrent Neural Networks II (D2L3 Deep Learning for Speech and Language UPC ...
 
Skip RNN: Learning to Skip State Updates in RNNs (ICLR 2018)
Skip RNN: Learning to Skip State Updates in RNNs (ICLR 2018)Skip RNN: Learning to Skip State Updates in RNNs (ICLR 2018)
Skip RNN: Learning to Skip State Updates in RNNs (ICLR 2018)
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural Networks
 
  Brain-Inspired Computation based on Spiking Neural Networks ...
 Brain-Inspired Computation based on Spiking Neural Networks               ... Brain-Inspired Computation based on Spiking Neural Networks               ...
  Brain-Inspired Computation based on Spiking Neural Networks ...
 

More from Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

More from Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Recently uploaded

9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSINGmarianagonzalez07
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 

Recently uploaded (20)

9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
2006_GasProcessing_HB (1).pdf HYDROCARBON PROCESSING
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 

Video Analysis with Recurrent Neural Networks (Master Computer Vision Barcelona 2017)

  • 1. @DocXavi Module 4 - Lecture 7 Video Analysis with RNNs 2 February 2017 Xavier Giró-i-Nieto [http://pagines.uab.cat/mcv/]
  • 2. Acknowledgments 2 Alberto Montes Santi Pascual Amaia Salvador More details: D2L2, “Recurrent Neural Networks I” D2L3, “Recurrent Neural Networks II”
  • 4. Outline 1. Recurrent Neural Networks 2. Activity Recognition 3. Object Tracking 4. Speech and Video 5. Learn more 4
  • 5. 5 Previously… A Perceptron J. Alammar, “A visual and interactive guide to the Basics of Neural Networks” (2016)
  • 6. 6 Previously… A Perceptron J. Alammar, “A visual and interactive guide to the Basics of Neural Networks” (2016) F. Van Veen, “The Neural Network Zoo” (2016)
  • 7. 7 Two Perceptrons + Softmax classifier J. Alammar, “A visual and interactive guide to the Basics of Neural Networks” (2016)
  • 8. 8 Three perceptrons + Softmax classifier TensorFlow, “MNIST for ML beginners”
  • 9. 9 TensorFlow, “MNIST for ML beginners” Three perceptrons + Softmax classifier
  • 10. 10 TensorFlow, “MNIST for ML beginners” Three perceptrons + Softmax classifier
  • 11. 11 Neural Network = Multi Layer Perceptron 2 layers of perceptrons F. Van Veen, “The Neural Network Zoo” (2016)
  • 12. 12 Deep Feed Forward (DFF) F. Van Veen, “The Neural Network Zoo” (2016)
  • 13. 13Slide credit: Santi Pascual If we have a sequence of samples... predict sample x[t+1] knowing previous values {x[t], x[t-1], x[t-2], …, x[t-τ]} Deep Feed Forward (DFF)
  • 14. 14Slide credit: Santi Pascual Deep Feed Forward (DFF) Feed Forward approach: ● static window of size L ● slide the window time-step wise ... ... ... x[t+1] x[t-L], …, x[t-1], x[t] x[t+1] L
  • 15. 15Slide credit: Santi Pascual Deep Feed Forward (DFF) Feed Forward approach: ● static window of size L ● slide the window time-step wise ... ... ... x[t+2] x[t-L+1], …, x[t], x[t+1] ... ... ... x[t+1] x[t-L], …, x[t-1], x[t] x[t+2] L
  • 16. 16Slide credit: Santi Pascual Deep Feed Forward (DFF) 16 Feed Forward approach: ● static window of size L ● slide the window time-step wise x[t+3] L ... ... ... x[t+3] x[t-L+2], …, x[t+1], x[t+2] ... ... ... x[t+2] x[t-L+1], …, x[t], x[t+1] ... ... ... x[t+1] x[t-L], …, x[t-1], x[t]
  • 17. Deep Feed Forward (DFF) ... ... ... x1, x2, …, xL Problems for the feed forward + static window approach: ● What’s the matter increasing L? → Fast growth of num of parameters! ● Decisions are independent between time-steps! ○ The network doesn’t care about what happened at previous time-step, only present window matters → doesn’t look good ● Cumbersome padding when there are not enough samples to fill L size ○ Can’t work with variable sequence lengths x1, x2, …, xL, …, x2L ... ... x1, x2, …, xL, …, x2L, …, x3L ... ... ... ... Slide credit: Santi Pascual
  • 18. 18Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks” The hidden layers and the output depend from previous states of the hidden layers Recurrent Neural Network (RNN)
  • 19. 19 Recurrent Neural Network (RNN) Solution: Build specific connections capturing the temporal evolution → Shared weights in time ● Give volatile memory to the network Feed-Forward Fully-Connected
  • 20. 20 Recurrent Neural Network (RNN) Solution: Build specific connections capturing the temporal evolution → Shared weights in time ● Give volatile memory to the network Feed-Forward Fully-ConnectedFully connected in time: recurrent matrix
  • 21. 21 time time Rotation 90o Front View Side View Rotation 90o Recurrent Neural Network (RNN)
  • 22. 22 Recurrent Neural Network (RNN) Hence we have two data flows: Forward in space + time propagation: 2 projections per layer activation. BEWARE: We have extra depth now! Every time-step is an extra level of depth (as a deeper stack of layers in a feed-forward fashion!) Slide credit: Santi Pascual
  • 23. 23 Recurrent Neural Network (RNN) Slide credit: Santi Pascual Hence we have two data flows: Forward in space + time propagation: 2 projections per layer activation ○ Last time-step includes the context of our decisions recursively
  • 24. 24 Recurrent Neural Network (RNN) Slide credit: Santi Pascual Hence we have two data flows: Forward in space + time propagation: 2 projections per layer activation ○ Last time-step includes the context of our decisions recursively
  • 25. 25 Recurrent Neural Network (RNN) Slide credit: Santi Pascual Hence we have two data flows: Forward in space + time propagation: 2 projections per layer activation ○ Last time-step includes the context of our decisions recursively
  • 26. 26 Recurrent Neural Network (RNN) Slide credit: Santi Pascual Hence we have two data flows: Forward in space + time propagation: 2 projections per layer activation ○ Last time-step includes the context of our decisions recursively
  • 27. 27 Recurrent Neural Network (RNN) Slide credit: Santi Pascual Back Propagation Through Time (BPTT): The training method has to take into account the time operations → a cost function E is defined to train our RNN, and in this case the total error at the output of the network is the sum of the errors at each time-step: T: max amount of time-steps to do back-prop. In Keras this is specified when defining the “input shape” to the RNN layer, by means of: (batch size, sequence length (T), input_dim)
  • 28. 28 Recurrent Neural Network (RNN) Slide credit: Santi Pascual Main problems: ● Long-term memory (remembering quite far time-steps) vanishes quickly because of the recursive operation with U ● During training gradients explode/vanish easily because of depth-in-time → Exploding/Vanishing gradients!
  • 29. 29 Bidirectional RNN (BRNN) Alex Graves, “Supervised Sequence Labelling with Recurrent Neural Networks” Must learn weights w2, w3, w4 & w5; in addition to w1 & w6.
  • 30. 30 Long Short-Term Memory (LSTM) Jürgen Schmidhuber @ NIPS 2016 Barcelona
  • 31. 31 Long Short-Term Memory (LSTM) Jürgen Schmidhuber @ NIPS 2016 Barcelona
  • 32. 32 Long Short-Term Memory (LSTM) Jürgen Schmidhuber @ NIPS 2016 Barcelona
  • 33. 33 Long Short-Term Memory (LSTM) Jürgen Schmidhuber @ NIPS 2016 Barcelona
  • 34. 34 Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9, no. 8 (1997): 1735-1780. Long Short-Term Memory (LSTM)
  • 35. 35 Long Short-Term Memory (LSTM) Solutions: 1. Change the way in which past information is kept → create the notion of cell state, a memory unit that keeps long-term information in a safer way by protecting it from recursive operations 2. Make every RNN unit able to decide whether the current time-step information matters or not, to accept or discard (optimized reading mechanism) 3. Make every RNN unit able to forget whatever may not be useful anymore by clearing that info from the cell state (optimized clearing mechanism) 4. Make every RNN unit able to output the decisions whenever it is ready to do so (optimized output mechanism) Gating method
  • 36. 36 Long Short-Term Memory (LSTM) Solutions: 1. Change the way in which past information is kept → create the notion of cell state, a memory unit that keeps long-term information in a safer way by protecting it from recursive operations 2. Make every RNN unit able to decide whether the current time-step information matters or not, to accept or discard (optimized reading mechanism) 3. Make every RNN unit able to forget whatever may not be useful anymore by clearing that info from the cell state (optimized clearing mechanism) 4. Make every RNN unit able to output the decisions whenever it is ready to do so (optimized output mechanism) Gating method
  • 37. 37 Long Short-Term Memory (LSTM) An LSTM cell is defined by two groups of neurons plus the cell state (memory unit): 1. Gates 2. Activation units 3. Cell state Computation Flow Slide credit: Santi Pascual
  • 38. 38 Long Short-Term Memory (LSTM) Three gates are governed by sigmoid units (btw [0,1]) define the control of in & out information.. Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) slide credit: Xavi Giro
  • 39. 39 Long Short-Term Memory (LSTM) Forget Gate: Concatenate Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
  • 40. 40 Long Short-Term Memory (LSTM) Input Gate Layer New contribution to cell state Classic neuron Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
  • 41. 41 Long Short-Term Memory (LSTM) Update Cell State (memory): Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
  • 42. 42 Long Short-Term Memory (LSTM) Output Gate Layer Output to next layer Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
  • 43. 43 Long Short-Term Memory (LSTM) Figure: Cristopher Olah, “Understanding LSTM Networks” (2015) / Slide: Alberto Montes
  • 44. 44 Gated Recurrent Unit (GRU) Cho, Kyunghyun, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." AMNLP 2014. Similar performance as LSTM with less computation. slide credit: Xavi Giro
  • 45. 45 Recurrent Neural Network (RNN) More details: D2L2, “Recurrent Neural Networks I” D2L3, “Recurrent Neural Networks II” F. Van Veen, “The Neural Network Zoo” (2016)
  • 46. Outline 1. Recurrent Neural Networks 2. Activity Recognition 3. Object Tracking 4. Speech and Video 5. Learn more 46
  • 47. 47 Yue-Hei Ng, Joe, Matthew Hausknecht, Sudheendra Vijayanarasimhan, Oriol Vinyals, Rajat Monga, and George Toderici. "Beyond short snippets: Deep networks for video classification." CVPR 2015 Recognition: Image & Optical Flow CNN + LSTM
  • 53. Figure: Tran, Du, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri. "Learning spatiotemporal features with 3D convolutional networks." CVPR 2015 3D Convolutions over sets of 16 frames... 53 Temporal Activity Detection
  • 55. BSc thesis mAP = 0.5938 mAP = 0.5492 mAP = 0.5635 Deeper networks present overfitting 55 Temporal Activity Detection
  • 59. BSc thesis Ground Truth: Playing water polo Prediction: 0.765 Playing water polo 0.202 Swimming 0.007 Springboard diving 59 Temporal Activity Detection
  • 60. BSc thesis Ground Truth: Hopscotch Prediction: 0.848 Running a marathon 0.023 Triple jump 0.022 Javelin throw 60 Temporal Activity Detection
  • 61. BSc thesis 61 Temporal Activity Detection A. Montes, Salvador, A., Pascual-deLaPuente, S., and Giró-i-Nieto, X., “Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks”, in 1st NIPS Workshop on Large Scale Computer Vision Systems 2016 (best poster award)
  • 62. 62 A. Montes, Salvador, A., Pascual-deLaPuente, S., and Giró-i-Nieto, X., “Temporal Activity Detection in Untrimmed Videos with Recurrent Neural Networks”, in 1st NIPS Workshop on Large Scale Computer Vision Systems 2016 (best poster award) Temporal Activity Detection
  • 63. Outline 1. Recurrent Neural Networks 2. Activity Recognition 3. Object Tracking 4. Speech and Video 5. Learn more 63
  • 64. Object tracking: DeepTracking 64 P. Ondruska and I. Posner, “Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks,” in The Thirtieth AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona USA, 2016. [code]
  • 65. Object tracking: DeepTracking 65 P. Ondruska and I. Posner, “Deep Tracking: Seeing Beyond Seeing Using Recurrent Neural Networks,” in The Thirtieth AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona USA, 2016. [code]
  • 66. Object tracking: DeepTracking 66 Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." In Advances in neural information processing systems, pp. 809-817. 2013 [Project page with code]
  • 67. Object tracking: ROLO 67 Ning, Guanghan, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang. "Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking." arXiv preprint arXiv:1607.05781 (2016)
  • 68. Object tracking: ROLO 68 Ning, Guanghan, Zhi Zhang, Chen Huang, Zhihai He, Xiaobo Ren, and Haohong Wang. "Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking." arXiv preprint arXiv:1607.05781 (2016)
  • 69. Outline 1. Recurrent Neural Networks 2. Activity Recognition 3. Object Tracking 4. Speech and Video 5. Learn more 69
  • 70. 70 Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017 Speech and Video: Vid2Speech CNN (VGG) Frame from a slient video Audio feature
  • 71. 71Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017 Speech and Video: Vid2Speech No end-to-end
  • 72. 72 Speech and Video: Vid2Speech Ephrat, Ariel, and Shmuel Peleg. "Vid2speech: Speech Reconstruction from Silent Video." ICASSP 2017
  • 73. 73 Assael, Yannis M., Brendan Shillingford, Shimon Whiteson, and Nando de Freitas. "LipNet: Sentence-level Lipreading." arXiv preprint arXiv:1611.01599 (2016). Speech and Video: LipNet
  • 74. 74 Assael, Yannis M., Brendan Shillingford, Shimon Whiteson, and Nando de Freitas. "LipNet: Sentence-level Lipreading." arXiv preprint arXiv:1611.01599 (2016). Speech and Video: LipNet
  • 75. 75 Chung, Joon Son, Andrew Senior, Oriol Vinyals, and Andrew Zisserman. "Lip reading sentences in the wild." arXiv preprint arXiv:1611.05358 (2016). Speech and Video: Watch, Listen, Attend & Spell
  • 76. 76 Chung, Joon Son, Andrew Senior, Oriol Vinyals, and Andrew Zisserman. "Lip reading sentences in the wild." arXiv preprint arXiv:1611.05358 (2016). Speech and Video: Watch, Listen, Attend & Spell
  • 77. Outline 1. Recurrent Neural Networks 2. Activity Recognition 3. Object Tracking 4. Speech and Video 5. Learn more 77
  • 78. Deep Learning: Software Keras http://keras.io/ Tensor Flow https://www.tensorflow.org/ Caffe http://caffe.berkeleyvision.org/ Torch (Overfeat) http://torch.ch/ 78 Theano http://deeplearning.net/software/theano/ MatconvNet (VLFeat) http://www.vlfeat.org/matconvnet/ CNTK (Mcrosoft) http://www.cntk.ai/ MxNet: https://github.com/dmlc/mxnet
  • 79. Learn more 79 Jordi Pont-Tuset (ETH Zurich), “One-shot Video Segmentation” on Monday 13 February at 12pm at IRI UPC
  • 80. Learn more 80 [course site] [Summer 2016] [Summer 2017]
  • 81. Stanford course: CS231n: Convolutional Neural Networks for Visual Recognition 81 Learn more
  • 83. Online course: Deep Learning Taking machine learning to the next level 83 Learn more
  • 84. ReadCV seminar Friendly reviews of SoA papers Spring 2016: Tuesdays at 11am ConvNets: Learn more 84
  • 85. Summer course Deep Learning for Computer Vision June 21-27, 3-7pm Deep Learning: Learn more 85
  • 86. ● Deep learning methos for vision (CVPR 2012) ● Tutorial on deep learning for vision (CVPR 2014) ● Kyunghyun Cho, “Deep Learning: Past, Present & Future” ConvNets: Learn more 86
  • 87. ConvNets: Learn more 87 “Machine learning” sub-Reddit.
  • 89. ConvNets: Learn more 89 Check profile requirements for Summer internship (disclaimer: offered to Phd students by default) Company Avg Salary / hour Avg Salary / month Yahoo $43 ($43x160=$6,880) Apple $37 ($37x160=$5,920) Google $29.54-$31.32 $7,151 Facebook $22.92 $6,150-$7,378 Microsoft $22.63 $6,506-$7,171 Source: Glassdoor.com (internships in California. No stipends included)
  • 90. Learn more 90 Video: Cristian Canton’s talk “From Catalonia to America: notes on how to achieve a successful post-Phd career ”@ ACMCV 2015 & UPC
  • 91. Li Fei-Fei, “How we’re teaching computers to understand pictures” TEDTalks 2014. Learn more 91
  • 92. Jeremy Howard, “The wonderful and terrifying implications of computers that can learn”, TEDTalks 2014. Learn more 92
  • 93. Learn more 93 ● Neil Lawrence, OpenAI won’t benefit humanity without open data sharing (The Guardian, 14/12/2015)
  • 94. Sports: Do you know them ? 94
  • 95. Deep Learning: Do you know them ? 95 Antonio Torralba, MIT (former UPC) ...and MANY MORE I am missing in the page (apologies). Oriol Vinyals, Google (former UPC) Jose M Álvarez, NICTA (former URL & UAB) Joan Bruna, Berkeley (former UPC)
  • 96. 96 Where you are studying VisioCat dinner @ CVPR 2015
  • 97. 97 Where you are studying VisioCat dinner @ CVPR 2016
  • 98. Is Computer Vision solved ? ConvNets: Discussion 98
  • 99. Thank you ! Slides available on and . https://imatge.upc.edu/web/people/xavier-giro http://bitsearch.blogspot.com https://twitter.com/DocXavi https://www.facebook.com/ProfessorXavi xavier.giro@upc.edu 99