Large-Scale Video Understanding:
YouTube and Beyond
Rahul Sukthankar
Machine Perception, Google Research
https://research.google.com/teams/perception/
AI Frontiers Conference - Nov. 3, 2017
Machine Perception
Really Works!
(better than I expected)
Sample of Perception tech in products
● Signals for Image Search ranking, related images, search-by-image, etc.
● Cloud Video API
● Cloud Vision API
● HDR+ in Android Camera (photo: Seth LaForge, Nexus 5X)
● Mobile Vision API
● Organizing Photos image & video collections and making them searchable by content
● Microvideo tech in Photos & Motion Stills
● De-reflection & tracking in Photo Scanner
● Personalized sticker packs in Allo
● On-device handwriting input & recognition
● OCR for lots of languages
● Visual & auditory annotation & signals on YouTube
● Thumbnail/preview selection & optimization for YouTube
● Non-speech sound captions on YouTube
● Region tracking for custom blurring tool on YouTube
● Mobile creative effects on YouTube
Useful Applications for Video Technology
Help users create, enhance, organize, and discover videos.
● Watch, listen, understand
● Capture a moment
● Improve & manipulate
Privacy Region Tracking & Blurring for YouTube
Fun Effects from Tracking (on Mobile) for YouTube
Large-Scale Video Annotation for YouTube
Video understanding pipeline as of ~5 years ago
pixels & sound samples → extract features (hand-designed descriptors) → frame features → quantize & aggregate (codebook histogram) → video features → train model (e.g., AdaBoost, using training data) → “Roller-blading”
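The quantize-and-aggregate step of the older pipeline can be illustrated with a bag-of-words style codebook histogram. This is a minimal sketch, not the production system: the function names, the toy 2-d descriptors, and the 3-entry codebook are all hypothetical, and nearest-neighbor assignment by Euclidean distance is an assumed choice.

```python
import math

def nearest_codeword(descriptor, codebook):
    """Return the index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(descriptor, codebook[k]))

def codebook_histogram(descriptors, codebook):
    """Quantize each frame descriptor, then aggregate into a normalized histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

# Hypothetical toy data: four 2-d frame descriptors and a 3-word codebook.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
frame_descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.7, 5.3)]

# One fixed-length video feature, regardless of how many frames the video has.
video_feature = codebook_histogram(frame_descriptors, codebook)
print(video_feature)  # [0.25, 0.5, 0.25]
```

The resulting fixed-length vector is what a classifier such as AdaBoost would consume as the video-level feature.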
Modern video understanding pipeline
pixels & sound samples → magic box containing many convolutional, deep, end-to-end buzzwords :-) → “Roller-blading”
(diagram labels: extract features, training data)
Deep-learned visual features
● Inception model trained on noisy data (images)
● Bottleneck embedding layer (1000-d)
● Videos with noisy labels
● Frame-level → video-level aggregation
○ Max pooling
○ Avg pooling
○ VLAD pooling
● +80% mean avg. precision
● 40x more compact features
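The three frame-to-video pooling options named on this slide can be sketched as follows. This is an illustrative sketch only: the toy 2-d frame embeddings and the two cluster centers are hypothetical, and the VLAD variant shown (sum of residuals to the nearest center, concatenated and L2-normalized) is the textbook formulation, assumed rather than confirmed to match the system described in the talk.

```python
import math

def max_pool(frames):
    """Element-wise maximum over all frame embeddings."""
    return [max(col) for col in zip(*frames)]

def avg_pool(frames):
    """Element-wise mean over all frame embeddings."""
    return [sum(col) / len(frames) for col in zip(*frames)]

def vlad_pool(frames, centers):
    """Sum residuals to the nearest center, concatenate, and L2-normalize."""
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for f in frames:
        k = min(range(len(centers)), key=lambda i: math.dist(f, centers[i]))
        for j in range(dim):
            residuals[k][j] += f[j] - centers[k][j]
    flat = [v for r in residuals for v in r]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm else flat

# Hypothetical toy data: three 2-d frame embeddings, two cluster centers.
frames = [[1.0, 0.0], [3.0, 4.0], [2.0, 2.0]]
centers = [[0.0, 0.0], [3.0, 3.0]]

video_max = max_pool(frames)
video_avg = avg_pool(frames)
video_vlad = vlad_pool(frames, centers)
print(video_max, video_avg, video_vlad)
```

Max and average pooling keep the frame embedding's dimensionality, while VLAD produces a vector of size (number of centers × embedding dimension).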
Deep-learned vs. handcrafted features
[Chart: mean average precision (MAP, axis 0 to 0.40) vs. feature dimensionality]
● Deep-learned visual features, VLAD coding: 1024-d, 0.272 MAP
● Handcrafted audio-visual features: ~40K-d, 0.153 MAP
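The rounded "+80%" and "40x" figures quoted earlier can be checked against the two data points on this chart. The short calculation below assumes that "~40K-d" means roughly 40,000 dimensions; the variable names are illustrative.

```python
# Data points read from the slide.
deep_map, deep_dim = 0.272, 1024
hand_map, hand_dim = 0.153, 40_000  # "~40K-d": 40,000 is an assumed approximation

# Relative improvement in mean average precision.
relative_gain = (deep_map - hand_map) / hand_map

# How many times smaller the deep-learned feature vector is.
compactness = hand_dim / deep_dim

print(f"relative MAP gain: {relative_gain:.0%}")  # 78%
print(f"compactness: {compactness:.0f}x")          # 39x
```

Both values are consistent with the slide's rounded claims of about +80% MAP and about 40x more compact features.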
Personal video search in Google Photos
Lots of videos
Almost no metadata
“Dancing” on the web
“Dancing” in home videos
Domain adaptation: Finding home videos on YouTube
● By capture device
● By video frame rate
● By video orientation
The technology behind personal video search
Inputs: Video, Audio
1. Image / photo annotation model (trained on web images)
2. YouTube frame annotation model (trained on video thumbnails); domain-adapted frame-level vision model
3. YouTube video annotation model (trained on YouTube videos); domain-adapted video-level vision model
4. YouTube audio annotation model (trained on YouTube videos); domain-adapted audio model
5. Fusion & calibration (trained on home videos); domain-adapted personal video model
Example output labels: toddler, dancing
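The slides do not say how the fusion and calibration step is implemented. As one hedged illustration of the general idea, the sketch below combines per-model scores for a single label with a weighted average (late fusion) and then maps the result through a Platt-style logistic function. The model names, weights, scores, and the parameters `a` and `b` are all hypothetical; in practice such parameters would be fit on held-out data.

```python
import math

def fuse(scores, weights):
    """Weighted average of per-model confidence scores for one label."""
    total_weight = sum(weights[m] for m in scores)
    return sum(weights[m] * s for m, s in scores.items()) / total_weight

def calibrate(raw, a=6.0, b=-3.0):
    """Platt-style logistic mapping; a and b are hypothetical parameters."""
    return 1.0 / (1.0 + math.exp(-(a * raw + b)))

# Hypothetical scores for the label "dancing" from four upstream models.
weights = {"photo": 1.0, "frame": 1.0, "video": 1.0, "audio": 1.0}
scores = {"photo": 0.4, "frame": 0.6, "video": 0.7, "audio": 0.3}

fused = fuse(scores, weights)
probability = calibrate(fused)
print(fused, probability)
```

Calibration matters here because the fused score is compared against a threshold to decide whether a label is shown in search, so it should behave like a probability.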
Evolution of personal video annotation models
1. Photo annotation model applied on video frames
2. Domain adaptation + fusion across frames
3. Fusion across multiple vision models
4. Fusion across multiple audio-visual models
> 2x recall gain
Learning aesthetics: YouTube Thumbnails
YouTube thumbnail quality model
Improving YouTube video thumbnails with deep neural nets, Google Research Blog, Oct. 2015
Video retargeting (spatial)
Original video. Reframed for a banner aspect ratio.
Video retargeting (temporal)
Video preview:
(duration: 6 secs)
Motion Stabilization
Motion Stills app
Stream One-Up
Motion Stills examples: cinemagraphs
Motion Stills examples: gifs / memes
Motion Stills examples: timelapse
Promising Directions for Future Research: Learning from Video
Sermanet, Self-Supervised Imitation, Google Brain
Self-Supervised Imitation
Pierre Sermanet* Corey Lynch* Yevgen Chebotar*
Jasmine Hsu Eric Jang Stefan Schaal Sergey Levine
Google Brain + University of Southern California
* equal contribution
Multi-view capture
Time-Contrastive Networks (TCN)
(source: [Rippel et al 2015])
arxiv.org/abs/1704.06888v2
sermanet.github.io/imitate
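Time-contrastive networks, as described in the linked paper, learn an embedding with a triplet loss: the anchor and positive are frames captured at the same moment from different viewpoints, and the negative is a frame from a different moment in the same sequence. The sketch below shows only the loss computation on toy embeddings. The margin value and the 2-d embeddings are hypothetical, and no network or training loop is shown.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that pulls the positive closer than the negative by a margin."""
    return max(0.0, squared_distance(anchor, positive)
                    - squared_distance(anchor, negative) + margin)

# Hypothetical embeddings.
anchor = (0.0, 0.0)         # view 1 at time t
positive = (0.1, 0.0)       # view 2 at the same time t
far_negative = (1.0, 0.0)   # view 1 at a distant time: already well separated
near_negative = (0.2, 0.0)  # view 1 at a nearby time: violates the margin

loss_easy = triplet_loss(anchor, positive, far_negative)
loss_hard = triplet_loss(anchor, positive, near_negative)
print(loss_easy, loss_hard)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triplets contribute to learning.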
Approach (pouring, real)
* RL used: Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning,
Chebotar, Y., Hausman, K., Zhang, M., Sukhatme, G., Schaal, S., Levine, S. [ICML 17]
Resulting policies
Pose imitation (real robot)
Useful Datasets for Video Understanding
● Large-scale video annotation
○ Sports-1M: > 1M videos from ~500 classes [with Stanford]
○ YouTube-8M: ~8M videos from ~4800 classes
● Action recognition in video
○ THUMOS: Temporal localization in untrimmed videos [with UCF, INRIA]
○ Kinetics: 400+ short clips each for 400 actions [with DeepMind]
○ AVA: Spatially localized atomic actions [with Berkeley, INRIA]
● Object recognition
○ YouTube-BB: Spatially localized objects in video (80 classes)
○ Open Images: Spatially localized objects in images (600 classes)
Sports-1M: 1.1M videos from 487 sports classes (video classification)
YouTube-8M Video Research Dataset
research.google.com/youtube8m/
THUMOS Challenge Series: Temporal Localization in Untrimmed Videos
YouTube Bounding Boxes: Spatial localization of one object through time
AVA: Spatial localization of an actor performing atomic actions
Atomic action: “Paint”
Open Images v3 - detailed spatial annotations in images
Example validation images
Conclusion
● Significant progress in large-scale video annotation for YouTube
● Video understanding has many applications beyond YouTube
● We encourage others to work on video through public datasets
● Many exciting research problems ahead, particularly in learning from video
(I think there’s a lot more progress to be made in video understanding)

 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
ocavb
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
balafet
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
oz8q3jxlp
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
enxupq
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 

Recently uploaded (20)

【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
一比一原版(UIUC毕业证)伊利诺伊大学|厄巴纳-香槟分校毕业证如何办理
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
Adjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTESAdjusting primitives for graph : SHORT REPORT / NOTES
Adjusting primitives for graph : SHORT REPORT / NOTES
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
Machine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptxMachine learning and optimization techniques for electrical drives.pptx
Machine learning and optimization techniques for electrical drives.pptx
 
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
一比一原版(Deakin毕业证书)迪肯大学毕业证如何办理
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单一比一原版(YU毕业证)约克大学毕业证成绩单
一比一原版(YU毕业证)约克大学毕业证成绩单
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 

Rahul Sukthankar at AI Frontiers: Large-Scale Video Understanding: YouTube and Beyond

  • 1. Large-Scale Video Understanding: YouTube and Beyond Rahul Sukthankar Machine Perception, Google Research https://research.google.com/teams/perception/ AI Frontiers Conference - Nov. 3, 2017
  • 3. Sample of Perception tech in products: Signals for Image Search ranking, related images, search-by-image, etc.
  • 4. Sample of Perception tech in products: Cloud Video API; Cloud Vision API
  • 5. Sample of Perception tech in products: HDR+ in Android Camera (Seth LaForge, Nexus 5X); Mobile Vision API
  • 6. Sample of Perception tech in products: Organizing Photos image & video collections and making them searchable by content; Microvideo tech in Photos & Motion Stills; De-reflection & tracking in Photo Scanner
  • 7. Sample of Perception tech in products: Personalized sticker packs in Allo; On-device handwriting input & recognition; OCR for lots of languages
  • 8. Sample of Perception tech in products: Visual & auditory annotation & signals on YouTube; Thumbnail/preview selection & optimization for YouTube; Non-speech sound captions on YouTube
  • 9. Sample of Perception tech in products: Region tracking for custom blurring tool on YouTube; Mobile creative effects on YouTube
  • 10. Useful Applications for Video Technology: Help users create, enhance, organize, and discover videos. Capture a moment; improve & manipulate; watch, listen, understand.
  • 11. Privacy Region Tracking & Blurring for YouTube
  • 12. Fun Effects from Tracking (on Mobile) for YouTube
  • 14. Large-Scale Video Annotation for YouTube: Video understanding pipeline as of ~5 years ago. Training data (pixels & sound samples) → extract features → frame features (hand-designed descriptors) → quantize & aggregate → video features (codebook histogram) → train model (e.g., AdaBoost) → “Roller-blading”
  • 15. Large-Scale Video Annotation for YouTube: Modern video understanding pipeline. Training data (pixels & sound samples) → extract features (magic box containing many convolutional, deep, end-to-end buzzwords :-) ) → “Roller-blading”
  • 16. Deep-learned visual features: Inception model trained on noisy data (images); bottleneck embedding layer (1000-d); videos with noisy labels. Frame-level → video-level via max pooling, avg pooling, or VLAD pooling.
  • 17. Deep-learned vs. handcrafted features: +80% mean avg. precision, 40x more compact features. Deep-learned visual features, VLAD coding: 1024-d, 0.272 MAP. Handcrafted audio-visual features: ~40K-d, 0.153 MAP. [Chart: mean average precision vs. dimensionality]
  • 18. Personal video search in Google Photos: lots of videos, almost no metadata
  • 21. Domain adaptation: Finding home videos on YouTube by capture device, by video frame rate, and by video orientation
  • 22. The technology behind personal video search: Video → (1) Image / photo annotation model, trained on web images
  • 23. The technology behind personal video search: Video → (1) Image / photo annotation model, trained on web images; (2) YouTube frame annotation model, trained on video thumbnails (domain-adapted frame-level vision model)
  • 24. The technology behind personal video search: Video → (1) Image / photo annotation model, trained on web images; (2) YouTube frame annotation model, trained on video thumbnails (domain-adapted frame-level vision model); (3) YouTube video annotation model, trained on YouTube videos (domain-adapted video-level vision model)
  • 25. The technology behind personal video search: Video + Audio → (1) Image / photo annotation model, trained on web images; (2) YouTube frame annotation model, trained on video thumbnails (domain-adapted frame-level vision model); (3) YouTube video annotation model, trained on YouTube videos (domain-adapted video-level vision model); (4) YouTube audio annotation model, trained on YouTube videos (domain-adapted audio model)
  • 26. The technology behind personal video search: Video + Audio → (1) Image / photo annotation model, trained on web images; (2) YouTube frame annotation model, trained on video thumbnails (domain-adapted frame-level vision model); (3) YouTube video annotation model, trained on YouTube videos (domain-adapted video-level vision model); (4) YouTube audio annotation model, trained on YouTube videos (domain-adapted audio model); (5) Fusion & calibration, trained on home videos (domain-adapted personal video model) → “toddler dancing”
  • 27. Evolution of personal video annotation models (stages 1 to 4)
  • 28. Evolution of personal video annotation models: (1) Photo annotation model applied on video frames
  • 29. Evolution of personal video annotation models: (1) Photo annotation model applied on video frames; (2) Domain adaptation + fusion across frames
  • 30. Evolution of personal video annotation models: (1) Photo annotation model applied on video frames; (2) Domain adaptation + fusion across frames; (3) Fusion across multiple vision models
  • 31. Evolution of personal video annotation models: (1) Photo annotation model applied on video frames; (2) Domain adaptation + fusion across frames; (3) Fusion across multiple vision models; (4) Fusion across multiple audio-visual models
  • 32. Evolution of personal video annotation models (stages 1 to 4): > 2x recall gain
  • 34. Learning aesthetics: YouTube Thumbnails YouTube thumbnail quality model
  • 36. Learning aesthetics: YouTube Thumbnails. “Improving YouTube video thumbnails with deep neural nets”, Google Research Blog, Oct. 2015
  • 37. Video retargeting (spatial) Original video. Reframed for a banner aspect ratio.
  • 38. Video retargeting (temporal) Video preview: (duration: 6 secs)
  • 41. Motion Stills examples: cinemagraphs
  • 42. Motion Stills examples: gifs / memes
  • 44. Promising Directions for Future Research: Learning from Video
  • 45. Sermanet, Self-Supervised Imitation, Google Brain. Self-Supervised Imitation: Pierre Sermanet*, Corey Lynch*, Yevgen Chebotar*, Jasmine Hsu, Eric Jang, Stefan Schaal, Sergey Levine. Google Brain + University of Southern California (* equal contribution)
  • 46. Multi-view capture
  • 47. Time-Contrastive Networks (TCN) (source: [Rippel et al 2015]) arxiv.org/abs/1704.06888v2 sermanet.github.io/imitate
  • 48. Approach (pouring, real). RL used: Combining Model-Based and Model-Free Updates for Trajectory-Centric Reinforcement Learning, Chebotar, Y., Hausman, K., Zhang, M., Sukhatme, G., Schaal, S., Levine, S. [ICML 17]
  • 49. Resulting policies
  • 50. Pose imitation (real robot)
  • 51. Useful Datasets for Video Understanding
      ● Large-scale video annotation
          ○ Sports-1M: > 1M videos from ~500 classes [with Stanford]
          ○ YouTube-8M: ~8M videos from ~4800 classes
      ● Action recognition in video
          ○ THUMOS: Temporal localization in untrimmed videos [with UCF, INRIA]
          ○ Kinetics: 400+ short clips for each of 400 actions [with DeepMind]
          ○ AVA: Spatially localized atomic actions [with Berkeley, INRIA]
      ● Object recognition
          ○ YouTube-BB: Spatially localized objects in video (80 classes)
          ○ Open Images: Spatially localized objects in images (600 classes)
  • 52. Sports-1M: 1.1M videos from 487 sports classes (video classification)
  • 53. YouTube-8M Video Research Dataset research.google.com/youtube8m/
  • 54. THUMOS Challenge Series: Temporal Localization in Untrimmed Videos
  • 55. YouTube Bounding Boxes: Spatial localization of one object through time
  • 56. AVA: Spatial localization of an actor performing atomic actions Atomic action: “Paint”
  • 57. Open Images v3 - detailed spatial annotations in images Example validation images
  • 59. Conclusion
      ● Significant progress in large-scale video annotation for YouTube
      ● Video understanding has many applications beyond YouTube
      ● We encourage others to work on video through public datasets
      ● Many exciting research problems ahead, particularly in learning from video (I think there’s a lot more progress to be made in video understanding)
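
Slide 16 lists three ways of turning frame-level features into a single video-level feature: max pooling, avg pooling, and VLAD pooling. The following is a minimal, dependency-free sketch of those three operations on toy 2-d vectors. The function names (`max_pool`, `avg_pool`, `vlad_pool`), the toy `frames`, and the fixed `centers` are hypothetical and chosen only for illustration. The talk does not say which VLAD variant was used; this sketch assumes the common form (sum of residuals to the nearest codebook center, then L2 normalization), and in practice the codebook would typically be learned, for example with k-means, over much higher-dimensional embeddings.

```python
import math
from typing import List, Sequence

Vector = List[float]


def max_pool(frames: Sequence[Sequence[float]]) -> Vector:
    """Element-wise maximum over all frame vectors."""
    return [max(column) for column in zip(*frames)]


def avg_pool(frames: Sequence[Sequence[float]]) -> Vector:
    """Element-wise mean over all frame vectors."""
    return [sum(column) / len(column) for column in zip(*frames)]


def vlad_pool(frames: Sequence[Sequence[float]],
              centers: Sequence[Sequence[float]]) -> Vector:
    """Simplified VLAD (assumed variant): for each codebook center,
    sum the residuals of the frames assigned to it, concatenate the
    per-center sums, and L2-normalize the result."""
    dim = len(centers[0])
    residuals = [[0.0] * dim for _ in centers]
    for frame in frames:
        # Assign the frame to its nearest center (squared Euclidean distance).
        nearest = min(
            range(len(centers)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(frame, centers[i])),
        )
        for d in range(dim):
            residuals[nearest][d] += frame[d] - centers[nearest][d]
    flat = [value for row in residuals for value in row]
    norm = math.sqrt(sum(value * value for value in flat))
    return [value / norm for value in flat] if norm > 0 else flat


# Toy data (hypothetical): three frames with 2-d features, two codebook centers.
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 2.0]]
centers = [[0.0, 0.0], [4.0, 4.0]]

video_max = max_pool(frames)             # one 2-d vector per video
video_avg = avg_pool(frames)             # one 2-d vector per video
video_vlad = vlad_pool(frames, centers)  # one (2 centers x 2 dims) = 4-d vector
```

Max and avg pooling keep the frame feature dimensionality, while VLAD grows it by the number of centers, so the pooled size depends on the codebook chosen.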