Deep Visual Understanding from Deep Learning
Jitendra Malik
UC Berkeley & Google
Moravec’s argument (1998)
ROBOT: Mere Machine To Transcendent Mind
• 1 neuron = 1000 instructions/sec
• 1 synapse = 1 byte of information
• Human brain then processes 10^14 IPS and
has 10^14 bytes of storage
• In 2000, we have 10^9 IPS and 10^9 bytes on
a desktop machine
• Assuming Moore’s law, we obtain human-level
computing power in 2025, or in 2015 with a
cluster of 100 nodes.
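The extrapolation above can be checked with a few lines of arithmetic. This is a minimal sketch, not Moravec's own calculation: the slide only says "Moore's law", so the 1.5-year doubling period is an assumption, and the function name `year_of_parity` is made up for illustration.

```python
import math

def year_of_parity(target_ips, base_ips=1e9, base_year=2000,
                   doubling_years=1.5, nodes=1):
    """Year in which `nodes` machines together reach `target_ips`.

    Assumes per-machine compute doubles every `doubling_years` years
    (an assumed reading of "Moore's law"; the slide gives no period).
    """
    doublings = math.log2(target_ips / (base_ips * nodes))
    return base_year + doublings * doubling_years

single = year_of_parity(1e14)              # one desktop machine
cluster = year_of_parity(1e14, nodes=100)  # cluster of 100 nodes
print(round(single), round(cluster))
```

Under that assumed doubling period, a factor of 10^5 takes about 16.6 doublings (roughly 25 years), and a factor of 10^3 takes about 10 doublings (roughly 15 years), which matches the dates on the slide.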
Embodied Cognition
Vision
(broadly, perception)
Motor Control
(broadly, planning)
Language
Semantic Reasoning
Ontogeny of Intelligence
• The Cambrian period (543-490 million yrs ago)
led to the emergence of a wide variety of
animal life. These animals had vision and
locomotion capabilities.
• Sensory systems provide great benefits only
when accompanied by the ability to move - to
find food, avoid predators etc.
If you don’t need to move you don’t need an eye or a brain!
https://goodheartextremescience.wordpress.com/2010/01/27/meet-the-creature-that-eats-its-own-brain/
Hominid evolution in last 5 million years
• Bipedalism freed the hand for tool making.
Dexterous hands coevolved with larger brains.
• Anaxagoras: It is because of his being armed
with hands that man is the most intelligent
animal
Origins of Language (from Trask)
The evolutionary progression
• Vision and Locomotion
• Manipulation
• Language
Successes in AI seem to follow the same order!
Is Object Detection nearly solved?
Hubel and Wiesel (1962) discovered orientation sensitive
neurons in V1
Convolutional Neural Networks (LeCun et al.)
Used backpropagation to train the weights in this architecture
• First demonstrated by LeCun et al. for handwritten digit recognition (1989)
• Applied in the sliding-window paradigm for tasks such as face detection in the
1990s.
• However, they were not competitive on standard computer vision object
detection benchmarks in the 2000s.
• And then ImageNet and AlexNet happened.
R-CNN: Regions with CNN features
Girshick, Donahue, Darrell & Malik (CVPR 2014)
Input image
Extract region proposals (~2k / image)
Compute CNN features
Classify regions (linear SVM)
This and the Multibox work from Google showed
how to apply these architectures for object detection
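The four R-CNN stages listed above (input image, region proposals, CNN features, linear SVM) can be sketched end to end as follows. This is an illustrative skeleton only, not the authors' implementation: `propose_regions` draws random boxes instead of running a real proposal method, `cnn_features` just flattens the warped patch instead of running a network, and the SVM weights are random. All function names and sizes are hypothetical, and steps such as non-maximum suppression are omitted.

```python
import numpy as np

def propose_regions(image, n=2000, seed=0):
    """Hypothetical proposal stage: n random boxes (x0, y0, x1, y1)."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    x0 = rng.integers(0, w - 1, size=n)
    y0 = rng.integers(0, h - 1, size=n)
    x1 = rng.integers(x0 + 1, w + 1)   # exclusive upper bound, so x1 > x0
    y1 = rng.integers(y0 + 1, h + 1)
    return np.stack([x0, y0, x1, y1], axis=1)

def warp(image, box, size=8):
    """Crop a box and resize it to size x size by nearest-neighbour sampling."""
    x0, y0, x1, y1 = box
    ys = np.linspace(y0, y1 - 1, size).round().astype(int)
    xs = np.linspace(x0, x1 - 1, size).round().astype(int)
    return image[np.ix_(ys, xs)]

def cnn_features(patch):
    """Placeholder for a CNN forward pass: flattens the warped patch."""
    return patch.reshape(-1).astype(np.float64)

def detect(image, weights, bias, n_proposals=2000):
    """Score every proposal with one linear classifier per class."""
    boxes = propose_regions(image, n_proposals)
    feats = np.stack([cnn_features(warp(image, b)) for b in boxes])
    scores = feats @ weights.T + bias
    return boxes, scores.argmax(axis=1), scores.max(axis=1)

# Toy usage with random data and random classifier weights.
image = np.random.default_rng(1).random((64, 96, 3))
n_classes, dim = 3, 8 * 8 * 3
w = np.random.default_rng(2).normal(size=(n_classes, dim))
boxes, labels, scores = detect(image, w, np.zeros(n_classes), n_proposals=50)
```

The point of the sketch is the structure: every proposal is warped to a fixed size before feature extraction, which is the per-window cost that Fast R-CNN later avoids.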
Fast R-CNN (Girshick, 2015)
R-CNN with SPP features, no need to warp individual windows
There is also Faster R-CNN
which doesn’t require external proposals
The 3R’s of Vision:
Recognition, Reconstruction & Reorganization
Recognition
Reorganization
Reconstruction
Talk at POCV Workshop, CVPR 2012
Fifty years of computer vision 1965-2015
• 1960s: Beginnings in artificial intelligence, image
processing and pattern recognition
• 1970s: Foundational work on image formation: Horn,
Koenderink, Longuet-Higgins …
• 1980s: Vision as applied mathematics: geometry, multi-
scale analysis, control theory, optimization …
• 1990s: Multiple View Reconstruction well understood
• 2000s: Learning approaches to recognition problems in
full swing. Large datasets are collected and annotated
e.g. ImageNet
• 2010s: Deep Learning becomes popular building off
availability of GPUs and annotated datasets.
Reconstructing the world
… automatically from huge collections of photos
downloaded from the Internet
Over the past 10 years, 3D modeling from images has made huge
advances in scale, quality, and generality. We can reconstruct
scenes…
Snavely, Seitz, Szeliski.
Reconstructing the World from Internet Photo
Collections.
… that vary over time
Matzen & Snavely.
Scene Chronology.
ECCV 2014
Martin-Brualla, Gallup, Seitz.
Time-lapse Mining from
Internet Photos.
SIGGRAPH 2015
Reconstructing the great indoors…
… using Depth Cameras
Choi, Zhou, Koltun.
Robust Reconstruction of
Indoor Scenes.
CVPR 2015
… using Semantic
Reconstruction of
Rooms and Objects
Ikehata, Yan, Furukawa.
Structured Indoor Modeling.
ICCV 2015
point cloud, 3D mesh, rendering
ShapeNet (Stanford & Princeton)
Some problems that we can solve…
Block Diagram of the Primate Visual System
Neuroscience & Computer Vision
• A feed-forward view of processing in the ventral
stream with layers of simple and complex cells led to
the neo-cognitron and subsequently convolutional
networks.
• We now know that the ventral stream is much more
complicated with bidirectional as well as feedback
connections.
• I am interested in computer vision tasks where
feedback is key to the solution. This is a very natural
way to capture “context”. Helpful in pose recovery,
instance segmentation etc.
IEF (Iterative Error Feedback): Carreira, Agrawal, Fragkiadaki & Malik
Social Perception
• Computers today have pitifully low “social
intelligence”
• We need to understand the internal state of
humans as they interact with each other and
the external world
• Examples: emotional state, body language,
current goals.
What we would like to infer…
Will person B put some money into Person C’s tip bag?
Visual Semantic Role Labeling
Gupta & Malik (2015)
What we can’t do (yet)
• The hierarchical structure of human behavior:
movement, goals, actions and events
ACTION = MOVEMENT + GOAL
Events
e.g. A meal at a restaurant
• Classical AI/Cognitive Science Solution –
Schemas (frames, scripts etc.)
• To have a robust, visually grounded solution
we need to learn the equivalent from video +
Knowledge Graph like structures
• Perhaps best tackled in particular domains e.g.
team sports, instructional videos etc.
What has been responsible for recent
AI successes?
• Big Computing
• Big Data
• Big Annotation
• Big Simulation
Game scenarios can be simulated, but
it’s not so easy in other settings
Consider infants,
[Figure: external teacher signal (external supervision) vs. internal teacher signal (self-supervision)]
The Development of Embodied Cognition:
Six Lessons from Babies
Linda Smith & Michael Gasser
The Six Lessons
• Be multi-modal
• Be incremental
• Be physical
• Explore
• Be social
• Use language
• An example: Learning to see by moving, P.
Agrawal, J. Carreira, J. Malik (ICCV 2015)
Consider Poking
Same Poke → Different Outcomes
The knowledge of object class explains it!
On Mental Models
If the organism carries a `small-scale model’ of
external reality and of its own possible actions
within its head, it is able to try out various
alternatives, conclude which is the best of them,
react to future situations before they arise, utilize
the knowledge of past events in dealing with the
present and the future, and in every way to react
in a much fuller, safer, and more competent
manner to the emergencies which face it (Craik,
1943, Ch. 5, p. 61)
Modern control theory (Kalman et al.) uses a state-space
formalism to achieve this.
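As a concrete illustration of the state-space idea, here is a minimal sketch of a discrete-time linear model x_{t+1} = A x_t + B u_t for a point mass, where the state is (position, velocity) and the input is a force. The time step, mass, and function names are assumptions chosen for the example; nothing here comes from the slides beyond the formalism itself.

```python
def step(state, force, dt=0.1, mass=1.0):
    """One Euler step of x_{t+1} = A x_t + B u_t for a point mass.

    With state x = (position, velocity) this corresponds to
    A = [[1, dt], [0, 1]] and B = [0, dt / mass].
    """
    pos, vel = state
    return (pos + dt * vel, vel + dt * force / mass)

def rollout(state, forces):
    """Predict the whole trajectory for a sequence of candidate forces."""
    trajectory = [state]
    for force in forces:
        state = step(state, force)
        trajectory.append(state)
    return trajectory

# "Try out" a constant unit force for ten steps, starting at rest.
traj = rollout((0.0, 0.0), [1.0] * 10)
final_pos, final_vel = traj[-1]
```

Rolling the model forward like this is the "small-scale model" in Craik's sense: alternatives can be evaluated internally before any of them is executed.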
For acting in novel situations:
• Model of the Agent
• Model of the Environment
How will the environment look in the future
when the agent interacts with it?
Force?
Consider the specific case of billiards
What force to apply?
Model of the environment
How to apply this force?
Model of the agent
LEARNING VISUAL PREDICTIVE MODELS
OF PHYSICS
FOR PLAYING BILLIARDS
Katerina Fragkiadaki*, Pulkit Agrawal*, Sergey Levine, Jitendra Malik
ICLR, 2016
*Equal contribution
Moving Balls World
Factors of change
Table Geometry
Number of balls
Ball Size
Color of Balls/Walls
Like the real world, moving-ball worlds provide constantly
changing environments
Visual Predictive Model of Physics
[Diagram: applied force F → neural network (prediction module)]
A model that can predict “future visual states”
(i.e. visual imagination)
Key Idea: Use object-centric predictions
[Figure: World-Centric Prediction vs. Object-Centric Prediction, each with applied force F]
Our Model
[Diagram: frames X_{t-3}, X_{t-2}, X_{t-1}, X_t and force F_t → CNN + LSTM → predicted velocities u_t, …, u_{t+19}]
LEARNING ENVIRONMENT MODEL DIRECTLY FROM VISUAL INPUTS
Assume that the agent can observe and interact with the world.
Assume that the agent can track objects.
MODEL PREDICTIONS
[Plot: Scaling with number of balls (1 Ball, 2 Balls)]
[Plot: Generalization to novel worlds (2B-on-3B, 3B-on-4B)]
Model’s Imagination
Only inputs: visual of the first frame and the applied forces
Model’s Imagination – Multiple Balls
Only inputs: visual of the first frame and the applied forces
[Figure panels: Model’s Imagination vs. Ground Truth]
We successfully model collisions.
Input to the model – visual of the first frame and the applied force (yellow arrow)
Dark → Light Blue: Progression of time
[Plot: Object Centric vs. Frame Centric]
Train on 2/3 Balls → Generalize to 6-ball worlds
Now that we have learnt predictive models
We can plan Actions!
Visually imagine the effect of different forces!
Then choose the optimal force which “in imagination” (simulation)
leads to the desired goal state.
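The planning recipe above (imagine the outcome of each candidate force, then pick the one that ends closest to the goal) can be sketched as an exhaustive search over a small grid of forces. In this sketch `imagine` is a hand-written toy stand-in for the learned visual predictive model, with an assumed friction constant and no collisions; the grid of magnitudes and angles is also an assumption made for illustration.

```python
import math

def imagine(position, force, steps=20, friction=0.9):
    """Toy stand-in for the learned model: roll a ball forward in time.

    The initial velocity is set equal to the applied force and decays
    by `friction` at every step (assumed dynamics, no walls or collisions).
    """
    x, y = position
    vx, vy = force
    for _ in range(steps):
        x, y = x + vx, y + vy
        vx, vy = vx * friction, vy * friction
    return x, y

def plan(position, goal, magnitudes=(0.5, 1.0, 1.5, 2.0), n_angles=72):
    """Pick the candidate force whose imagined outcome lands nearest the goal."""
    best_force, best_dist = None, float("inf")
    for mag in magnitudes:
        for k in range(n_angles):
            angle = 2 * math.pi * k / n_angles
            force = (mag * math.cos(angle), mag * math.sin(angle))
            dist = math.dist(imagine(position, force), goal)
            if dist < best_dist:
                best_force, best_dist = force, dist
    return best_force, best_dist

force, dist = plan((0.0, 0.0), goal=(6.0, 6.0))
```

Only the imagined rollouts are used to rank the forces; the chosen force would then be executed once in the real environment.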
Planning Actions
Accuracy of pushing a ball to a desired
position
LEARNING TO POKE BY POKING:
EXPERIENTIAL LEARNING OF INTUITIVE
PHYSICS
Pulkit Agrawal*, Ashvin Nair*, Pieter Abbeel, Jitendra Malik,
Sergey Levine
NIPS 2016
*Equal contribution
Data Collection
Forward Model
[Diagram: image X_t → image encoder → Z_t; action u_t → action encoder → v_t; predictor(Z_t, v_t) → Z_{t+1} → decoder]
We do not want to predict pixels!
Inverse Model
Multimodality in output space
Our Method: Jointly Learn Forward and Inverse Model
[Diagram: image X_t → image encoder → Z_t; action u_t → action encoder → v_t; predictor(Z_t, v_t) → Z_{t+1}]
Simultaneously learn and
predict in an abstract feature
space (Zt)
The forward model regularizes
the inverse model!
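One way to read "jointly learn forward and inverse model" is as a single objective with two terms that share the image encoder. The sketch below computes such a loss with random linear maps standing in for the networks. It is a simplification under stated assumptions: the dimensions, the weighting `lam`, the squared-error losses, and the omission of a separate action encoder are all choices made for illustration, not details taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
IMG_DIM, ACT_DIM, FEAT_DIM = 64, 4, 16   # hypothetical sizes

# Hypothetical linear stand-ins for the learned networks.
W_enc = rng.normal(scale=0.1, size=(FEAT_DIM, IMG_DIM))             # image encoder
W_fwd = rng.normal(scale=0.1, size=(FEAT_DIM, FEAT_DIM + ACT_DIM))  # forward predictor
W_inv = rng.normal(scale=0.1, size=(ACT_DIM, 2 * FEAT_DIM))         # inverse model

def joint_loss(x_t, u_t, x_next, lam=0.1):
    """Inverse loss plus a weighted forward loss on shared features."""
    z_t, z_next = W_enc @ x_t, W_enc @ x_next        # encode both images
    u_pred = W_inv @ np.concatenate([z_t, z_next])   # inverse: (Z_t, Z_t+1) -> action
    z_pred = W_fwd @ np.concatenate([z_t, u_t])      # forward: (Z_t, u_t) -> Z_t+1
    inverse_loss = np.sum((u_pred - u_t) ** 2)
    forward_loss = np.sum((z_pred - z_next) ** 2)    # error in feature space, not pixels
    return inverse_loss + lam * forward_loss, inverse_loss, forward_loss

# Toy usage on one random (image, action, next image) triple.
x_t, x_next = rng.random(IMG_DIM), rng.random(IMG_DIM)
u_t = rng.random(ACT_DIM)
total, inv, fwd = joint_loss(x_t, u_t, x_next)
```

Because both terms depend on the same encoder, minimizing the forward term constrains the features the inverse model sees, which is the sense in which the forward model acts as a regularizer.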
Results
[Figure: Initial, Final, and Target images; examples A and B]
Related and Contemporary Work
Pinto, L., & Gupta, A., Supersizing self-supervision: Learning to grasp from 50k tries and
700 robot hours, ICRA, 2016.
Levine, S., Pastor, P., Krizhevsky, A., & Quillen, D., Learning Hand-Eye Coordination for
Robotic Grasping with Deep Learning and Large-Scale Data Collection, arXiv:1603.02199,
2016.
Pinto, L., Gandhi, D., Han, Y., Park, Y. L., & Gupta, A., The Curious Robot: Learning Visual
Representations via Physical Interactions, arXiv:1604.01360, 2016.
Embodied Cognition
Vision
(broadly, perception)
Motor Control
(broadly, planning)
Language
Semantic Reasoning

More Related Content

What's hot

Ai ml dl_bct and mariners-1
Ai  ml dl_bct and mariners-1Ai  ml dl_bct and mariners-1
Ai ml dl_bct and mariners-1cmmindia2017
 
101 Webinar - Artificial Intelligence, Deep Learning and Geospatial
101 Webinar - Artificial Intelligence, Deep Learning and Geospatial101 Webinar - Artificial Intelligence, Deep Learning and Geospatial
101 Webinar - Artificial Intelligence, Deep Learning and GeospatialGeospatial Media & Communications
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learningleopauly
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOleg Mygryn
 
Deep learning short introduction
Deep learning short introductionDeep learning short introduction
Deep learning short introductionAdwait Bhave
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Amr Rashed
 
From Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptxFrom Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptxChun-Hao Chang
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceLukas Masuch
 
Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learningJörgen Sandig
 
Deep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItDeep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItHolberton School
 
Deep Learning - Overview of my work II
Deep Learning - Overview of my work IIDeep Learning - Overview of my work II
Deep Learning - Overview of my work IIMohamed Loey
 
Deep Learning Tutorial
Deep Learning TutorialDeep Learning Tutorial
Deep Learning TutorialAmr Rashed
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RPoo Kuan Hoong
 
Artificial Intelligence, Machine Learning and Deep Learning with CNN
Artificial Intelligence, Machine Learning and Deep Learning with CNNArtificial Intelligence, Machine Learning and Deep Learning with CNN
Artificial Intelligence, Machine Learning and Deep Learning with CNNmojammel43
 
Introduction of Deep Learning
Introduction of Deep LearningIntroduction of Deep Learning
Introduction of Deep LearningMyungjin Lee
 
Donner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspectsDonner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspectsVienna Data Science Group
 

What's hot (20)

Andrew Ng, Chief Scientist at Baidu
Andrew Ng, Chief Scientist at BaiduAndrew Ng, Chief Scientist at Baidu
Andrew Ng, Chief Scientist at Baidu
 
Ai ml dl_bct and mariners-1
Ai  ml dl_bct and mariners-1Ai  ml dl_bct and mariners-1
Ai ml dl_bct and mariners-1
 
101 Webinar - Artificial Intelligence, Deep Learning and Geospatial
101 Webinar - Artificial Intelligence, Deep Learning and Geospatial101 Webinar - Artificial Intelligence, Deep Learning and Geospatial
101 Webinar - Artificial Intelligence, Deep Learning and Geospatial
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Deep learning
Deep learningDeep learning
Deep learning
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Deep learning short introduction
Deep learning short introductionDeep learning short introduction
Deep learning short introduction
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019
 
From Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptxFrom Conventional Machine Learning to Deep Learning and Beyond.pptx
From Conventional Machine Learning to Deep Learning and Beyond.pptx
 
Deep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial IntelligenceDeep Learning - The Past, Present and Future of Artificial Intelligence
Deep Learning - The Past, Present and Future of Artificial Intelligence
 
Neural networks and deep learning
Neural networks and deep learningNeural networks and deep learning
Neural networks and deep learning
 
Deep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do ItDeep Learning Class #0 - You Can Do It
Deep Learning Class #0 - You Can Do It
 
Deep Learning - Overview of my work II
Deep Learning - Overview of my work IIDeep Learning - Overview of my work II
Deep Learning - Overview of my work II
 
Deep Learning Tutorial
Deep Learning TutorialDeep Learning Tutorial
Deep Learning Tutorial
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with R
 
Artificial Intelligence, Machine Learning and Deep Learning with CNN
Artificial Intelligence, Machine Learning and Deep Learning with CNNArtificial Intelligence, Machine Learning and Deep Learning with CNN
Artificial Intelligence, Machine Learning and Deep Learning with CNN
 
Introduction of Deep Learning
Introduction of Deep LearningIntroduction of Deep Learning
Introduction of Deep Learning
 
Donner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspectsDonner - Deep Learning - Overview and practical aspects
Donner - Deep Learning - Overview and practical aspects
 
Deep learning
Deep learningDeep learning
Deep learning
 
Deep Learning Survey
Deep Learning SurveyDeep Learning Survey
Deep Learning Survey
 

Viewers also liked

The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive
 
The Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive
 
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive
 
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive
 
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive
 
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive
 
The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive
 
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...The Hive
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive
 
Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Grigory Sapunov
 
How the Digital Revolution is Disrupting the TV Industry
How the Digital Revolution is Disrupting the TV Industry How the Digital Revolution is Disrupting the TV Industry
How the Digital Revolution is Disrupting the TV Industry Suman Mishra
 
04 history of cv computer vision, neural networks and pattern recognition - ...
04  history of cv computer vision, neural networks and pattern recognition - ...04  history of cv computer vision, neural networks and pattern recognition - ...
04 history of cv computer vision, neural networks and pattern recognition - ...zukun
 
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr..."Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...Edge AI and Vision Alliance
 
AI, behind the scenes
AI, behind the scenesAI, behind the scenes
AI, behind the scenesGwennael Gate
 
Robot, Learning From Data
Robot, Learning From DataRobot, Learning From Data
Robot, Learning From DataSungjoon Choi
 
Sidechain talk
Sidechain talkSidechain talk
Sidechain talkjojva
 

Viewers also liked (20)

The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital ChangeThe Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
The Hive Think Tank: The Content Trap - Strategist's Guide to Digital Change
 
The Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at TwitterThe Hive Think Tank: Heron at Twitter
The Hive Think Tank: Heron at Twitter
 
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
The Hive Think Tank: Translating IoT into Innovation at Every Level by Prith ...
 
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
The Hive Think Tank: Machine Learning Applications in Genomics by Prof. Jian ...
 
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven AutomationThe Hive Think Tank: The Future Of Customer Support - AI Driven Automation
The Hive Think Tank: The Future Of Customer Support - AI Driven Automation
 
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat SrinivasanThe Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
The Hive Think Tank: AI in The Enterprise by Venkat Srinivasan
 
The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare The Hive Think Tank: Unpacking AI for Healthcare
The Hive Think Tank: Unpacking AI for Healthcare
 
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
The Hive Think Tank - Design Thinking by Bernie Roth, Professor at Stanford U...
 
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
The Hive Think Tank: Talk by Mohandas Pai - India at 2030, How Tech Entrepren...
 
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure LeskovecThe Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec
 
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
The Hive Think Tank - The Microsoft Big Data Stack by Raghu Ramakrishnan, CTO...
 
Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016
 
Introduction to Deep Blue & Pubmed
Introduction to Deep Blue & PubmedIntroduction to Deep Blue & Pubmed
Introduction to Deep Blue & Pubmed
 
How the Digital Revolution is Disrupting the TV Industry
How the Digital Revolution is Disrupting the TV Industry How the Digital Revolution is Disrupting the TV Industry
How the Digital Revolution is Disrupting the TV Industry
 
04 history of cv computer vision, neural networks and pattern recognition - ...
04  history of cv computer vision, neural networks and pattern recognition - ...04  history of cv computer vision, neural networks and pattern recognition - ...
04 history of cv computer vision, neural networks and pattern recognition - ...
 
SXSW
SXSWSXSW
SXSW
 
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr..."Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
"Enabling Ubiquitous Visual Intelligence Through Deep Learning," a Keynote Pr...
 
AI, behind the scenes
AI, behind the scenesAI, behind the scenes
AI, behind the scenes
 
Robot, Learning From Data
Robot, Learning From DataRobot, Learning From Data
Robot, Learning From Data
 
Sidechain talk
Sidechain talkSidechain talk
Sidechain talk
 

Similar to Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik

Artificial Intelligence and its application
Artificial Intelligence and its applicationArtificial Intelligence and its application
Artificial Intelligence and its applicationFELICIALILIANJ
 
Cognitive Vision - After the hype
Cognitive Vision - After the hypeCognitive Vision - After the hype
Cognitive Vision - After the hypepotaters
 
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...Tulipp. Eu
 
Generative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's PerspectiveGenerative AI: Past, Present, and Future – A Practitioner's Perspective
Generative AI: Past, Present, and Future – A Practitioner's PerspectiveHuahai Yang
 
Introduction talk to Computer Vision
Introduction talk to Computer Vision Introduction talk to Computer Vision
Introduction talk to Computer Vision Chen Sagiv
 
Materi_01_VK_2223_3.pdf
Materi_01_VK_2223_3.pdfMateri_01_VK_2223_3.pdf
Materi_01_VK_2223_3.pdfichsan6
 
The Opportunities and Challenges of Putting the Latest Computer Vision and De...
The Opportunities and Challenges of Putting the Latest Computer Vision and De...The Opportunities and Challenges of Putting the Latest Computer Vision and De...
The Opportunities and Challenges of Putting the Latest Computer Vision and De...Albert Y. C. Chen
 
Unraveling Information about Deep Learning
Unraveling Information about Deep LearningUnraveling Information about Deep Learning
Unraveling Information about Deep LearningIRJET Journal
 
Computer vision lightning talk castaway week
Computer vision lightning talk castaway weekComputer vision lightning talk castaway week
Computer vision lightning talk castaway weekChristopher Decker
 
Computer vision introduction
Computer vision  introduction Computer vision  introduction
Computer vision introduction Wael Badawy
 
Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Facial emotion detection on babies' emotional face using Deep Learning.
Facial emotion detection on babies' emotional face using Deep Learning.Takrim Ul Islam Laskar
 
The Future of Neuroimaging: A 3D Exploration of TBI
The Future of Neuroimaging: A 3D Exploration of TBIThe Future of Neuroimaging: A 3D Exploration of TBI
The Future of Neuroimaging: A 3D Exploration of TBIHunter Whitney
 
ChemnitzDec2014.key.compressed
ChemnitzDec2014.key.compressedChemnitzDec2014.key.compressed
ChemnitzDec2014.key.compressedBrian Fisher
 
Scene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamScene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamWithTheBest
 

Similar to Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik (20)

Artificial Intelligence and its application
Artificial Intelligence and its applicationArtificial Intelligence and its application
Artificial Intelligence and its application
 
Cognitive Vision - After the hype
Cognitive Vision - After the hypeCognitive Vision - After the hype
Cognitive Vision - After the hype
 
Educational machine intelligence
Educational machine intelligenceEducational machine intelligence
Educational machine intelligence
 
What is Media in MIT Media Lab, Why 'Camera Culture'
What is Media in MIT Media Lab, Why 'Camera Culture'What is Media in MIT Media Lab, Why 'Camera Culture'
What is Media in MIT Media Lab, Why 'Camera Culture'
 
HiPEAC 2019 Workshop - Real-Time Modelling Visual Scenes with Biological Insp...
Deep Visual Understanding from Deep Learning by Prof. Jitendra Malik

  • 1. Deep Visual Understanding from Deep Learning Jitendra Malik UC Berkeley & Google
  • 2.
  • 3. Moravec’s argument (1998) ROBOT: Mere Machine To Transcendent Mind • 1 neuron = 1000 instructions/sec • 1 synapse = 1 byte of information • The human brain then processes 10^14 IPS and has 10^14 bytes of storage • In 2000, we have 10^9 IPS and 10^9 bytes on a desktop machine • Assuming Moore’s law, we obtain human-level computing power in 2025, or with a cluster of 100 nodes in 2015.
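The dates on this slide can be checked with a few lines of arithmetic. The sketch below assumes a doubling period of 18 months for Moore's law and perfect linear scaling across cluster nodes; neither figure is stated on the slide, and the function name `years_to_close_gap` is illustrative.

```python
import math

def years_to_close_gap(target_ips, current_ips, doubling_years=1.5):
    """Years of exponential growth needed for current_ips to reach target_ips.

    doubling_years=1.5 is an assumed Moore's-law doubling period.
    """
    doublings = math.log2(target_ips / current_ips)
    return doublings * doubling_years

# Single desktop: 10^9 IPS in 2000 versus 10^14 IPS for the brain.
single = 2000 + years_to_close_gap(1e14, 1e9)

# Cluster of 100 nodes: 10^11 IPS in aggregate (assuming ideal scaling),
# so the remaining gap is a factor of 10^3.
cluster = 2000 + years_to_close_gap(1e14, 100 * 1e9)

print(round(single), round(cluster))
```

Under these assumptions the gap of 10^5 takes about 16.6 doublings (roughly 25 years), and the gap of 10^3 takes about 10 doublings (roughly 15 years), which is consistent with the 2025 and 2015 figures quoted.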
  • 4.
  • 6. Ontogeny of Intelligence • The Cambrian period (543-490 million yrs ago) led to the emergence of a wide variety of animal life. These animals had vision and locomotion capabilities. • Sensory systems provide great benefits only when accompanied by the ability to move: to find food, avoid predators, etc.
  • 7. If you don’t need to move you don’t need an eye or a brain! https://goodheartextremescience.wordpress.com/2010/01/27/meet-the-creature-that-eats-its-own-brain/
  • 8. Hominid evolution in last 5 million years • Bipedalism freed the hand for tool making. Dexterous hands coevolved with larger brains. • Anaxagoras: It is because of his being armed with hands that man is the most intelligent animal
  • 9. Origins of Language (from Trask)
  • 10. The evolutionary progression • Vision and Locomotion • Manipulation • Language Successes in AI seem to follow the same order!
  • 11. Is Object Detection nearly solved?
  • 12. Hubel and Wiesel (1962) discovered orientation sensitive neurons in V1
  • 13.
  • 14. Convolutional Neural Networks (LeCun et al.) Used backpropagation to train the weights in this architecture • First demonstrated by LeCun et al. for handwritten digit recognition (1989) • Applied in the sliding-window paradigm for tasks such as face detection in the 1990s. • However, they were not competitive on standard computer vision object detection benchmarks in the 2000s. • And then ImageNet and AlexNet happened...
  • 15. R-CNN: Regions with CNN features Girshick, Donahue, Darrell & Malik (CVPR 2014) Input image → Extract region proposals (~2k / image) → Compute CNN features → Classify regions (linear SVM). This and the MultiBox work from Google showed how to apply these architectures for object detection
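The four R-CNN stages named on this slide can be laid out as a toy pipeline. This is only a structural sketch: the proposal generator, the feature extractor, and the classifier weights below are hypothetical stand-ins (random boxes, subsampled pixels, and random linear weights), not the selective-search proposals, trained CNN, or trained SVMs used in the actual system.

```python
import numpy as np

def propose_regions(image, n=2000, rng=None):
    """Stand-in for a region-proposal method: random boxes (x0, y0, x1, y1)."""
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = image.shape[:2]
    x0 = rng.integers(0, w - 8, size=n)
    y0 = rng.integers(0, h - 8, size=n)
    x1 = np.minimum(x0 + rng.integers(8, w, size=n), w)
    y1 = np.minimum(y0 + rng.integers(8, h, size=n), h)
    return np.stack([x0, y0, x1, y1], axis=1)

def cnn_features(image, box, grid=4):
    """Stand-in for CNN features: warp the crop to a fixed grid and flatten."""
    x0, y0, x1, y1 = box
    crop = image[y0:y1, x0:x1]
    ys = np.linspace(0, crop.shape[0] - 1, grid).astype(int)
    xs = np.linspace(0, crop.shape[1] - 1, grid).astype(int)
    return crop[np.ix_(ys, xs)].reshape(-1).astype(float)

def classify_regions(features, W, b):
    """One linear scorer per class (the role played by the linear SVMs)."""
    return features @ W + b

def rcnn_detect(image, W, b, n_proposals=2000):
    boxes = propose_regions(image, n_proposals)               # extract region proposals
    feats = np.stack([cnn_features(image, box) for box in boxes])  # per-region features
    scores = classify_regions(feats, W, b)                    # classify each region
    return boxes, scores

# Usage with a random image and random (untrained) weights for 3 classes.
rng = np.random.default_rng(1)
image = rng.random((64, 96, 3))
W = rng.normal(size=(4 * 4 * 3, 3))
b = np.zeros(3)
boxes, scores = rcnn_detect(image, W, b, n_proposals=200)
print(boxes.shape, scores.shape)
```

Each region is processed independently here, which mirrors why the original R-CNN is slow: the feature extractor runs once per proposal.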
  • 16. Fast R-CNN (Girshick, 2015) R-CNN with SPP features, no need to warp individual windows There is also Faster R-CNN which doesn’t require external proposals
  • 17.
  • 18. The 3R’s of Vision: Recognition, Reconstruction & Reorganization. Talk at POCV Workshop, CVPR 2012
  • 19. Fifty years of computer vision 1965-2015 • 1960s: Beginnings in artificial intelligence, image processing and pattern recognition • 1970s: Foundational work on image formation: Horn, Koenderink, Longuet-Higgins … • 1980s: Vision as applied mathematics: geometry, multi-scale analysis, control theory, optimization … • 1990s: Multiple View Reconstruction well understood • 2000s: Learning approaches to recognition problems in full swing. Large datasets are collected and annotated, e.g. ImageNet • 2010s: Deep Learning becomes popular, building on the availability of GPUs and annotated datasets.
  • 20. Reconstructing the world … automatically from huge collections of photos downloaded from the Internet Over the past 10 years, 3D modeling from images has made huge advances in scale, quality, and generality. We can reconstruct scenes… Snavely, Seitz, Szeliski. Reconstructing the World from Internet Photo Collections.
  • 21. Reconstructing the world Over the past 10 years, 3D modeling from images has made huge advances in scale, quality, and generality. We can reconstruct scenes… … that vary over time Matzen & Snavely. Scene Chronology. ECCV 2014
  • 22. Reconstructing the world Over the past 10 years, 3D modeling from images has made huge advances in scale, quality, and generality. We can reconstruct scenes… … that vary over time Matzen & Snavely. Scene Chronology. ECCV 2014 Martin-Brualla, Gallup, Seitz. Time-lapse Mining from Internet Photos. SIGGRAPH 2015
  • 23. Reconstructing the great indoors… … using Depth Cameras: Choi, Zhou, Koltun. Robust Reconstruction of Indoor Scenes. CVPR 2015 … using Semantic Reconstruction of Rooms and Objects: Ikehata, Yan, Furukawa. Structured Indoor Modeling. ICCV 2015 [Figure panels: point cloud, 3D mesh, rendering]
  • 24. ShapeNet (Stanford & Princeton)
  • 25.
  • 26. Some problems that we can solve…
  • 27.
  • 28. Block Diagram of the Primate Visual System
  • 29. Neuroscience & Computer Vision • A feed-forward view of processing in the ventral stream with layers of simple and complex cells led to the neo-cognitron and subsequently convolutional networks. • We now know that the ventral stream is much more complicated with bidirectional as well as feedback connections. • I am interested in computer vision tasks where feedback is key to the solution. This is a very natural way to capture “context”. Helpful in pose recovery, instance segmentation etc.
  • 30.
  • 31.
  • 32. IEF : Carreira, Agrawal, Fragkiadaki & Malik
  • 33. Social Perception • Computers today have pitifully low “social intelligence” • We need to understand the internal state of humans as they interact with each other and the external world • Examples: emotional state, body language, current goals.
  • 34. What we would like to infer… Will person B put some money into Person C’s tip bag?
  • 35. Visual Semantic Role Labeling Gupta & Malik (2015)
  • 36. What we can’t do (yet) • The hierarchical structure of human behavior: movement, goals, actions and events. ACTION = MOVEMENT + GOAL
  • 37. Events e.g. A meal at a restaurant • Classical AI/Cognitive Science Solution – Schemas (frames, scripts etc.) • To have a robust, visually grounded solution we need to learn the equivalent from video + Knowledge Graph like structures • Perhaps best tackled in particular domains e.g. team sports, instructional videos etc.
  • 38. What has been responsible for recent AI successes? • Big Computing • Big Data
  • 39. What has been responsible for recent AI successes? • Big Computing • Big Data • Big Annotation • Big Simulation
  • 40. Game scenarios can be simulated, but it’s not so easy in other settings
  • 43. The Development of Embodied Cognition: Six Lessons from Babies Linda Smith & Michael Gasser
  • 44. The Six Lessons • Be multi-modal • Be incremental • Be physical • Explore • Be social • Use language • An example: Learning to see by moving, P. Agrawal, J. Carreira, J. Malik (ICCV 2015)
  • 45. Consider Poking: Same Poke → Different Outcomes. The knowledge of object class explains it!
  • 46. On Mental Models: “If the organism carries a ‘small-scale model’ of external reality and of its own possible actions within its head, it is able to try out various alternatives, conclude which is the best of them, react to future situations before they arise, utilize the knowledge of past events in dealing with the present and the future, and in every way to react in a much fuller, safer, and more competent manner to the emergencies which face it” (Craik, 1943, Ch. 5, p. 61). Modern control theory (Kalman et al.) uses a state-space formalism to achieve this.
  • 47. For acting in novel situations, Model of the Agent Model of the Environment How will the environment look in the future when the agent interacts with it?
  • 48. Consider the specific case of billiards. What force to apply? Model of the environment
  • 49. Consider the specific case of billiards. What force to apply? Model of the environment. How to apply this force? Model of the agent
  • 50. LEARNING VISUAL PREDICTIVE MODELS OF PHYSICS FOR PLAYING BILLIARDS Katerina Fragkiadaki*, Pulkit Agrawal*, Sergey Levine, Jitendra Malik ICLR, 2016 *Equal contribution
  • 51. Moving Balls World. Factors of change: table geometry, number of balls, ball size, color of balls/walls. Like the real world, moving-ball worlds provide constantly changing environments
  • 52. Visual Predictive Model of Physics: a neural-network prediction module F. A model that can predict “future visual states” (i.e. visual imagination)
  • 53. Key Idea: Use object-centric predictions. [Figure panels: World-Centric Prediction, Object-Centric Prediction]
  • 54. Our Model: learning an environment model directly from visual inputs. Inputs: frames X_{t-3}, X_{t-2}, X_{t-1}, X_t and the applied force F_t. Components: CNN, LSTM. Outputs: velocities u_t, ..., u_{t+19}. Assume that the agent can observe and interact with the world. Assume that the agent can track objects. [Plots: model predictions; scaling with number of balls (1 Ball, 2 Balls); generalization to novel worlds (2B-on-3B, 3B-on-4B)]
  • 55. Model’s Imagination. Only inputs: visual of the first frame, applied forces
  • 56. Model’s Imagination – Multiple Balls. Only inputs: visual of the first frame, applied forces
  • 57. Model’s Imagination vs. Ground Truth. We successfully model collisions. Input to the model: visual of the first frame and the applied force (yellow arrow). Dark → Light Blue: progression of time
  • 58. Object Centric vs. Frame Centric. Train on 2/3 Balls → Generalize to 6 ball worlds
  • 59. Now that we have learnt predictive models, we can plan actions!
  • 60. Planning Actions: Visually imagine the effect of different forces! Then choose the optimal force which “in imagination” (simulation) leads to the desired goal state.
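Planning "in imagination" can be sketched as sampling candidate forces, rolling each one through a predictive model, and keeping the force whose imagined outcome lands closest to the goal. In the sketch below, `imagine` is a hypothetical hand-written stand-in (a single ball with friction) for the learned visual predictive model, and random sampling of candidates is an assumed search strategy, not necessarily the one used in the paper.

```python
import numpy as np

def imagine(position, force, steps=20, friction=0.9):
    """Hypothetical stand-in for the learned model: roll one ball forward.

    The force sets the initial velocity, which decays by `friction` each step.
    """
    position = np.asarray(position, dtype=float)
    velocity = np.asarray(force, dtype=float)
    for _ in range(steps):
        position = position + velocity
        velocity = velocity * friction
    return position

def plan_force(start, goal, n_candidates=500, max_force=2.0, seed=0):
    """Pick the sampled force whose imagined final position is nearest the goal."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(-max_force, max_force, size=(n_candidates, 2))
    outcomes = np.array([imagine(start, f) for f in candidates])
    errors = np.linalg.norm(outcomes - np.asarray(goal, dtype=float), axis=1)
    best = int(np.argmin(errors))
    return candidates[best], errors[best]

start, goal = (0.0, 0.0), (5.0, -3.0)
force, err = plan_force(start, goal)
print("chosen force:", force, "imagined distance to goal:", err)
```

The planner never acts in the world while searching; every candidate is evaluated only through the model, so the quality of the plan is bounded by the quality of the model's predictions.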
  • 61. Accuracy of pushing a ball to a desired position
  • 62. LEARNING TO POKE BY POKING: EXPERIENTIAL LEARNING OF INTUITIVE PHYSICS Pulkit Agrawal*, Ashvin Nair*, Pieter Abbeel, Jitendra Malik, Sergey Levine NIPS 2016 *Equal contribution
  • 64.
  • 66. Our Method: Jointly Learn Forward and Inverse Model. Diagram: Image (X_t) → Image Encoder → Z_t; Action (u_t) → Action Encoder → v_t; Predictor → Z_{t+1}. Simultaneously learn and predict in an abstract feature space (Z_t). The forward model regularizes the inverse model!
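One way to read "the forward model regularizes the inverse model" is as a single training objective with two terms that share the image encoder. The sketch below is a minimal illustration under strong assumptions: linear maps for the encoder and both models, squared-error losses for both terms, no separate action encoder, and a hypothetical weighting factor `lam`. None of these choices are taken from the slide.

```python
import numpy as np

def joint_loss(x_t, x_next, u_t, params, lam=0.1):
    """Inverse-model loss plus lam times forward-model loss, with a shared encoder."""
    E, W_inv, W_fwd = params["E"], params["W_inv"], params["W_fwd"]

    # Shared image encoder maps both images into the feature space Z.
    z_t, z_next = x_t @ E, x_next @ E

    # Inverse model: predict the action from the features before and after.
    u_hat = np.concatenate([z_t, z_next], axis=1) @ W_inv

    # Forward model: predict the next features from current features and action.
    z_next_hat = np.concatenate([z_t, u_t], axis=1) @ W_fwd

    inv = np.mean((u_hat - u_t) ** 2)
    fwd = np.mean((z_next_hat - z_next) ** 2)
    return inv + lam * fwd, inv, fwd

# Usage with random data: batch B, image dim D, feature dim K, action dim A.
rng = np.random.default_rng(0)
B, D, K, A = 8, 32, 4, 3
params = {
    "E": rng.normal(size=(D, K)),
    "W_inv": rng.normal(size=(2 * K, A)),
    "W_fwd": rng.normal(size=(K + A, K)),
}
x_t, x_next = rng.normal(size=(B, D)), rng.normal(size=(B, D))
u_t = rng.normal(size=(B, A))
total, inv, fwd = joint_loss(x_t, x_next, u_t, params, lam=0.1)
print(total, inv, fwd)
```

Because both terms depend on the encoder `E`, gradients from the forward term shape the same features the inverse model relies on, which is the sense in which one model constrains the other.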
  • 68. Results [Figure panels: Initial, Final, Target]
  • 69. Results [Figure panels: Initial, Final, Target; examples A and B]
  • 71. Related and Contemporary Work: Pinto, L., & Gupta, A., Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours, ICRA, 2016. Levine, S., Pastor, P., Krizhevsky, A., & Quillen, D., Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, arXiv:1603.02199, 2016. Pinto, L., Gandhi, D., Han, Y., Park, Y. L., & Gupta, A., The Curious Robot: Learning Visual Representations via Physical Interactions, arXiv:1604.01360, 2016.