Foundation and core

   Rick Szeliski &
  Andrew Zisserman
The Need ….

George Santayana:

   "Those who cannot remember the past are condemned to
     repeat it"


• We see other communities rediscovering results and methods that are
  already well known in computer vision, e.g. in multimedia retrieval

• It is even worse if this happens in our own community
Outline


1. Some successes and areas where there has been
   significant progress


2. 20 things every CV researcher should know


3. Dissemination
Multiple View Geometry
One of the success stories of Computer Vision

•    1990
     estimate epipolar geometry of two images using a calibration rig

•    2000
     automatic estimation of cameras and structure from 1000s of frames of
     uncontrolled video footage



Two advances made this possible:
    1. Understanding the (projective) geometry of multiple views
       •    fundamental matrix, trifocal tensor …
    2. Automatic estimation of this geometry from images
       •    development of robust estimation algorithms
Computer vision in the movies
Match mover/camera tracking




 PhotoSynth/PhotoTourism
Kinect
What works reasonably well?
Significant progress and applications:
•   Face detection/recognition
•   OCR, (isolated) handwriting recognition
•   Industrial inspection
•   Affine covariant detectors
•   Gaze tracking
•   Multi-view stereo reconstruction
•   Tracking isolated people in video
•   Interactive segmentation, e.g. medical
•   Image denoising and correction
•   Image stitching, HDR
•   Fingerprint recognition
•   Duplicate image/video search
•   Instance recognition for weakly textured objects
What works reasonably well? (continued)

• Active range finding
• Optical flow
• Autonomous driving with LIDAR
20 things everyone should know
• The data mining community (2005/2006) identified:
   – 10 challenging problems, and
   – 10 most influential algorithms.


• The result was:
   – C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive
     Bayes, and CART
   – Top 10 algorithms in data mining, Wu et al., Knowl Inf Syst, 2008


• In Computer Vision??
   – There is so much: so many areas, so many disciplines, so many
     application areas
What 20 techniques should all computer vision
 researchers know (short list)?
 1. Image formation and optics
 2. Image processing, filtering, Fourier analysis
 3. Pyramids and wavelets
 4. Feature extraction
 5. Image matching
 6. Bag of words
 7. Optical flow
 8. Structure from motion
 9. Multi view stereo
 10. Segmentation
 11. Clustering
 12. Viola-Jones
 13. Bayesian techniques
 14. Machine learning
 15. RANSAC and robust techniques
 16. Numerical methods
 17. Optimization
 18. Range finding, active illumination
 19. Algorithms
 20. Graph cuts
 21. Dynamic programming
 22. Complexity analysis
 23. MATLAB and C++, and assembly (optional: GPU programming)
 24. Communication and presentation skills
Following slides collated from poll
Image and features
•   NCC
•   Interest point operators
•   Scale invariant and affine invariant detectors & descriptors
•   Scale space
•   Image processing, filtering, Fourier analysis
•   Pyramids and wavelets
•   Edge detection
•   Restoration e.g. deblurring, super-resolution
    – Linear, e.g. Wiener filter
    – MRF
    – Non-local means/BM3D/bilateral filter
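
As an illustration of the first item on this slide, here is a minimal sketch of NCC between two image patches. It assumes the zero-mean form of the score and patches given as flat lists of intensities; the function name `ncc` and the convention of returning 0 for a flat patch are choices made for this example, not something the slides specify.

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size patches.

    Patches are flat sequences of intensities. The score lies in [-1, 1].
    """
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if den == 0.0:
        # A constant patch has no variance, so the score is undefined.
        # Returning 0 is an assumption made for this sketch.
        return 0.0
    return num / den

patch = [10, 20, 30, 40]
brighter = [2 * v + 5 for v in patch]  # same pattern, different gain and offset
inverted = [50 - v for v in patch]     # contrast-reversed pattern

score_same = ncc(patch, brighter)
score_inverted = ncc(patch, inverted)
```

Subtracting the mean and dividing by the norms makes the score unaffected by a gain and offset change in intensity, which is why the brighter patch still matches.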
Segmentation, grouping and tracking

•   Segmentation
    – Normalized cuts


•   Grouping
    – Hough transforms


•   Clustering
    – K-means
    – Mean-shift
    – Pedro-clustering


•   Tracking
    – Kalman filter
    – Particle filter
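
As an illustration of the K-means entry above, here is a minimal sketch of Lloyd's algorithm on 2D points. The function name `kmeans`, the fixed iteration count, and the explicit starting centers are assumptions made to keep the example short and deterministic.

```python
def kmeans(points, centers, iterations=10):
    """Lloyd's algorithm on 2D points, starting from the given centers."""
    centers = list(centers)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)),
                key=lambda k: (p[0] - centers[k][0]) ** 2
                              + (p[1] - centers[k][1]) ** 2)
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centers, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, centers=[(0, 0), (10, 10)])
```

In practice the result depends on the initial centers, so several random restarts are commonly used.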
Multi-view: stereo, SFM, flow
• RANSAC and other robust techniques
• Geometry:
   – epipolar geometry (projective and affine)
   – planar homographies
   – Affine camera
• Geometry estimators
   – 8-point algorithm for F (fundamental matrix)
   – 4-point algorithm for H (homography)
• Factorization
• Bundle-adjustment
• Flow
   – Horn & Schunck L2
   – Lucas-Kanade
   – L1 regularized
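
As an illustration of the RANSAC entry above, here is a minimal sketch on the simplest possible model, a 2D line, rather than F or H. The function name `ransac_line`, the trial count, the inlier threshold, and the fixed seed are all assumptions made for this example.

```python
import random

def ransac_line(points, n_trials=200, threshold=0.5, seed=0):
    """Fit a line a*x + b*y + c = 0 (with a^2 + b^2 = 1) by RANSAC."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    best_line, best_inliers = None, []
    for _ in range(n_trials):
        # 1. Draw a minimal sample: two points define a line.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0.0:
            continue  # coincident points, no line
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # 2. Count the points within the distance threshold of the line.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) < threshold]
        # 3. Keep the hypothesis with the largest consensus set.
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers

# Ten points on y = 2x + 1 plus three gross outliers.
data = [(x, 2 * x + 1) for x in range(10)] + [(1, 9), (4, -3), (7, 30)]
(a, b, c), inliers = ransac_line(data)
slope, intercept = -a / b, -c / b
```

The same sample, score, and keep-the-best loop applies to F and H, with the two-point line fit replaced by the corresponding minimal estimator. A final least-squares refit on the inliers usually follows.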
Recognition

•   Bag of visual words
•   HOG, SIFT, GIST
•   Spatial pyramid
•   Spatial configurations/Pictorial structures
•   Sliding window/jumping window
•   Cascades
Others
Machine Learning
    –   AdaBoost
    –   kNN
    –   SVM
    –   Random forest
    –   PCA, ICA, CCA
    –   EM
    –   MIL/Latent-SVM
    –   Regularization
    –   HMM
    –   Graphical & Bayesian models
Optimization
    –   Classical linear and non-linear
    –   Graph operations
    –   Dynamic programming/message passing for MAP, max-marginals
    –   Graph cuts for binary variable MAP

• Texture synthesis
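
As an illustration of dynamic programming for MAP, here is a minimal sketch that finds the minimum-energy labelling of a chain, where MAP inference is exact. The function name `chain_map`, the cost values, and the binary denoising example are assumptions made for this example.

```python
def chain_map(unary, pairwise):
    """Minimum-energy labelling of a chain by dynamic programming.

    unary[i][s]    : cost of giving node i the label s
    pairwise(s, t) : cost of neighbouring nodes taking labels s and t
    """
    n, k = len(unary), len(unary[0])
    cost = [list(unary[0])]  # cost[i][t]: best energy of nodes 0..i ending in t
    back = []                # back-pointers for recovering the labelling
    for i in range(1, n):
        row, ptr = [], []
        for t in range(k):
            best_s = min(range(k), key=lambda s: cost[-1][s] + pairwise(s, t))
            row.append(cost[-1][best_s] + pairwise(best_s, t) + unary[i][t])
            ptr.append(best_s)
        cost.append(row)
        back.append(ptr)
    # Backtrack from the best final label.
    label = min(range(k), key=lambda s: cost[-1][s])
    energy = cost[-1][label]
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    labels.reverse()
    return labels, energy

# Denoise a binary signal: disagreeing with an observation costs 5,
# and each label change between neighbours costs 3.
observed = [0, 0, 1, 0, 0, 1, 1, 1]
unary = [[0 if s == o else 5 for s in (0, 1)] for o in observed]
labels, energy = chain_map(unary, lambda s, t: 0 if s == t else 3)
```

Here the isolated 1 at index 2 is smoothed away because flipping it costs 5 but removes two label changes costing 6. The run time is O(n k^2) for n nodes and k labels.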
Who can help?
Dissemination

• Accessible Wikipedia articles for generalists, with high-quality
  links
• Separate Wiki: CVOnline ???
• Summer schools (especially US), recorded
• Tutorial videos
• Online programming assignments (Ted procedures)
• Tutorial courses
• Industry panels @ conferences (will people attend?)
• Introductory video [Serge]
Courses that your computer vision students
do/should take
• Typically, they take computer vision directly (see core
  techniques), sometimes after graphics or image processing.

• Ideally, would like them to take (pre-requisites or co-requisites):
1. Linear algebra, numerical methods, optimization
2. Statistics, Bayesian and robust methods, machine learning
3. Computer graphics and image processing
4. Optics, image formation, sensor design
5. Visual perception
6. Programming: parallel languages (MATLAB), efficient
   languages (C++ and assembly)
7. Technical writing and presentation (all graduate students,
   possibly UGs as well)
Are current computer vision textbooks sufficient?
What is missing?


• No good up-to-date introductory book
• Need algorithm descriptions + pseudo code
• Video lectures, Khan Academy (voice-over-pen/tablet
  drawing),
  http://www.ai-class.com/, http://robots.stanford.edu/cs221/
• (On-line) books supplemented by online tools, exercises,
  examples, resources, videos
• Social media + textbooks
• Ads to support free articles (??)
