C280, Computer Vision
Prof. Trevor Darrell
trevor@eecs.berkeley.edu
Today
Administrivia
“What is vision?”
Schedule
Introductions
Image formation
Prerequisites
This course is appropriate as a first course for graduate students with an EECS background, which should have prepared them with these essential prerequisites:
Data structures
A good working knowledge of MATLAB programming (or willingness and time to pick it up quickly!)
Linear algebra
Vector calculus
The course does not assume prior experience in imaging, computer vision, image processing, or graphics.
Text
The primary course text will be Rick Szeliski’s draft Computer Vision: Algorithms and Applications; we will use an online copy of the June 19th draft. The secondary text is Forsyth and Ponce, Computer Vision: A Modern Approach.
Primary Text
Secondary Text
Matlab
Problem sets and projects will involve Matlab programming (you are free to use alternative packages). Matlab runs on all the Instructional Windows and UNIX systems. Instructions and toolkits are described in http://inst.eecs.berkeley.edu/cgi-bin/pub.cgi?file=matlab.help. CS280 students can use their existing EECS Windows accounts in EECS instructional labs, and they can request new accounts (for non-majors) or additional access to Instructional resources by following the instructions about ’named’ accounts in http://inst.eecs.berkeley.edu/connecting.html#accounts. They can also log on remotely and run Matlab on some of our servers: http://inst.eecs.berkeley.edu/connecting.html#labs
Grading
There will be three equal components to the course grade:
Five problem sets
A take-home exam
Final project (including evaluation of proposal document, in-class presentation, and final report)
In addition, strong class participation can offset poor performance in any one of the above components.
Problem sets
Pset0 – Basic Image Manipulation in Matlab
Pset1 – Filtering and Features
Pset2 – Geometry and Calibration
Pset3 – Recognition
Pset4 – Stereo and Motion
Can discuss, but must submit individual work.
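To give a feel for the Matlab skills Pset0 assumes, here is a minimal sketch of basic image manipulation, assuming the Image Processing Toolbox is installed; the image file and operations are illustrative only, not the actual assignment:

    % Minimal image-manipulation sketch (illustrative, not the actual Pset0)
    im = imread('peppers.png');        % demo image that ships with MATLAB
    gray = rgb2gray(im);               % RGB -> grayscale
    small = imresize(gray, 0.5);       % downsample by a factor of 2
    flipped = fliplr(small);           % mirror left-right
    figure; imshow(flipped);           % display the result
    imwrite(flipped, 'out.png');       % write to disk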
Take-home
Limited time: 3 days
Covers everything through the hand-out date
Little programming
*No discussion or collaboration allowed*
Final project
Significant novel implementation of a technique related to course content
Teams of 2 encouraged (document role!)
Or a journal-length review article (no teams)
Three components:
proposal document (no more than 5 pages)
in-class results presentation (10 minutes)
final write-up (no more than 15 pages)
Class Participation
Class participation includes:
showing up
being able to articulate key points from the last lecture
having read assigned sections and being able to “fill in the blank” during the lecture
I won’t cold-call, but will solicit volunteers.
Strong in-class participation can offset poor performance in one of the other grade components.
Course goals… (broadly speaking)
principles of image formation
convolution and image pyramids
local feature analysis
multi-view geometry
image warping and stitching
structure from motion
visual recognition
image-based rendering
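As a concrete preview of “convolution and image pyramids”, here is a minimal MATLAB sketch of one Gaussian pyramid level, assuming the Image Processing Toolbox (fspecial, imfilter); the kernel size and test image are illustrative choices:

    % One Gaussian pyramid level: blur by convolution, then subsample by 2
    im = im2double(imread('cameraman.tif'));   % standard grayscale test image
    g  = fspecial('gaussian', 5, 1.0);         % 5x5 Gaussian kernel, sigma = 1
    blurred = imfilter(im, g, 'replicate');    % 2D filtering with border replication
    level1  = blurred(1:2:end, 1:2:end);       % keep every other row and column
    figure; imshow(level1);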
What is computer vision?
Done?
What is computer vision?
Automatic understanding of images and video
Computing properties of the 3D world from visual data (measurement)
Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
Vision for measurement
Examples: multi-view stereo for community photo collections, real-time stereo, structure from motion, NASA Mars Rover (work by Pollefeys et al. and Goesele et al.)
Slide credit: L. Lazebnik
Vision for perception, interpretation
Objects, Activities, Scenes, Locations, Text / writing, Faces, Gestures, Motions, Emotions…
[Annotated example image: an amusement park scene (The Wicked Twister, Cedar Point, Lake Erie) with labels such as sky, Ferris wheel, ride, water, tree, people waiting in line, people sitting on ride, umbrellas, maxair, carousel, deck, bench, pedestrians]
Related disciplines
Graphics
Algorithms
Artificial intelligence
Image processing
Cognitive science
Machine learning
Computer vision
Vision and graphics
Model → Images (graphics); Images → Model (vision)
Inverse problems: analysis and synthesis.
Why vision?
As image sources multiply, so do applications:
Relieve humans of boring, easy tasks
Enhance human abilities: human-computer interaction, visualization
Perception for robotics / autonomous agents
Organize and give access to visual content
Why vision?
Personal photo albums
Movies, news, sports
Medical and scientific images
Surveillance and security
Images and video are everywhere!
Slide credit: L. Lazebnik
Again, what is computer vision?
Mathematics of the geometry of image formation?
Statistics of the natural world?
Models for neuroscience?
Engineering methods for matching images?
Science fiction?
Vision demo?
We’re not quite there yet…
Terminator 2
Every picture tells a story
The goal of computer vision is to write computer programs that can interpret images.
Can computers match (or beat) human vision?
Yes and no (but mostly no!)
Humans are much better at “hard” things
Computers can be better at “easy” things
Human perception has its shortcomings…
Sinha and Poggio, Nature, 1996
Copyright A. Kitaoka 2003
Current state of the art
The next slides show some examples of what current vision systems can do.
Earth viewers (3D modeling)
Image from Microsoft’s Virtual Earth (see also: Google Earth)
Photosynth
http://labs.live.com/photosynth/
Based on Photo Tourism technology developed by Noah Snavely, Steve Seitz, and Rick Szeliski
Photo Tourism overview
Pipeline: input photographs → scene reconstruction → relative camera positions and orientations, point cloud, sparse correspondence → Photo Explorer
A system for interactively browsing and exploring large collections of photos of a scene. It computes the viewpoint of each photo as well as a sparse 3D model of the scene.
Photo Tourism overview
Optical character recognition (OCR)
Technology to convert scanned docs to text
If you have a scanner, it probably came with OCR software
Digit recognition, AT&T Labs: http://www.research.att.com/~yann/
License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
Face detection
Many new digital cameras now detect faces
Canon, Sony, Fuji, …
Smile detection?
Sony Cyber-shot® T70 Digital Still Camera
Object recognition (in supermarkets)
LaneHawk by Evolution Robotics
“A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it…”
Face recognition
Who is she?
Vision-based biometrics
“How the Afghan Girl was Identified by Her Iris Patterns” (Read the story)
Login without a password…
Face recognition systems are now beginning to appear more widely: http://www.sensiblevision.com/
Fingerprint scanners on many new laptops and other devices
Object recognition (in mobile phones)
This is becoming real:
Microsoft Research
Point & Find, Nokia
SnapTell.com (now Amazon)
SnapTell
http://snaptell.com/demos/DemoLarge.htm
Nokia Point and Find…
http://conversations.nokia.com/home/2008/09/point-and-fin-1.html
Special effects: shape capture
The Matrix movies, ESC Entertainment, XYZRGB, NRC
Special effects: motion capture
Pirates of the Caribbean, Industrial Light and Magic
Click here for interactive demo
Sports
Sportvision first-down line
Nice explanation on www.howstuffworks.com
Smart cars
Mobileye
Vision systems currently in high-end BMW, GM, and Volvo models
By 2010: 70% of car manufacturers
Video demo
Slide content courtesy of Amnon Shashua
Vision-based interaction (and games)
Digimask: put your face on a 3D avatar.
Nintendo Wii has camera-based IR tracking built in. See Lee’s work at CMU on clever tricks for using it to create a multi-touch display!
“Game turns moviegoers into Human Joysticks”, CNET: camera tracking a crowd, based on this work.
Vision in space
NASA’s Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.
Vision systems (JPL) used for several tasks:
Panorama stitching
3D terrain modeling
Obstacle detection, position tracking
For more, read “Computer Vision on Mars” by Matthies et al.
Robotics
NASA’s Mars Spirit Rover
http://en.wikipedia.org/wiki/Spirit_rover
http://www.robocup.org/
Medical imaging
Image-guided surgery (Grimson et al., MIT)
3D imaging (MRI, CT)
Current state of the art
You just saw examples of current systems.
Many of these are less than 5 years old.
This is a very active and rapidly changing research area.
Expect many new applications in the next 5 years.
To learn more about vision applications and companies: David Lowe maintains an excellent overview of vision companies at http://www.cs.ubc.ca/spider/lowe/vision.html
Syllabus / Schedule (see handout)
http://tinyurl.com/UCBC280CAL
Image Formation
Color
Image Filtering
Pyramids & Regularization
Feature Detection and Matching
Geometric Alignment
Geometric Image Stitching
Photometric Image Stitching
Recognition
Stereo
Optic Flow
Dense Motion Models
Shape from Silhouettes
Shape from Shading and Texture
Surface Models
Segmentation
SFM
IBR & HDR
…And now, who are you?
What do you expect to get out of this class?
Previous experience in vision, learning, graphics?
Research agenda? (Project topics?)
Let’s get started: Image formation
How are objects in the world captured in an image?
Physical parameters of image formation
Geometric: type of projection; camera pose
Optical: sensor’s lens type; focal length, field of view, aperture
Photometric: type, direction, and intensity of light reaching the sensor; surfaces’ reflectance properties
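To make the geometric side concrete, here is a minimal MATLAB sketch of perspective (pinhole) projection for a single point; the focal length and point coordinates are assumed values chosen only for illustration:

    % Pinhole projection: a 3D point P = (X, Y, Z) in camera coordinates
    % maps to image-plane coordinates (x, y) = (f*X/Z, f*Y/Z).
    f = 0.05;              % assumed focal length: 50 mm, in meters
    P = [0.2; 0.1; 2.0];   % a point 2 m in front of the camera, in meters
    x = f * P(1) / P(3);   % x = f * X / Z
    y = f * P(2) / P(3);   % y = f * Y / Z
    fprintf('Projects to (%.4f, %.4f) m on the image plane\n', x, y);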
Image formation
Let’s design a camera.
Idea 1:  put a piece of film in front of an object
Do we get a reasonable image?
Slide by Steve Seitz
