INTRODUCTION TO COMPUTER VISION
COMPUTER VISION
2022/2023 - 3
SCHEDULE
Lecture via Zoom meeting or in the classroom:
Every Saturday, starting July 1, 2023
Note: online attendance is taken every week
INSTRUCTOR
Dr. Ichsan Ibrahim, S.Si., M.Si.
ichsanibrahim@stmik-im.ac.id
Cell phone: 08158210073
Telegram: @ichsanibrahim
REPORT FORMAT
• An assignment report must have:
  • Title page: assignment title, student's name and student ID number (NIM), instructor's name, campus logo, and date of report preparation (5 points)
  • Summary: the major points, conclusions, and recommendations (6 points)
  • Contents (2 points)
  • Body of the report:
    • Introduction: the problem statement or aim of the assignment and a short overview of the basic theory related to the task questions (10 points)
    • Main part: this should clearly reflect the specific achievements of the assignment, including the process and the results; for some assignments, this section holds the paper review (50 points)
    • Conclusions and recommendations (12 points)
  • References (4 points)
  • Appendix (if necessary)
• Reports must follow the rules of scientific writing and use the correct format (6 points)
• All assignments are submitted in digital format to the Computer Vision course on the Kuliah Online website, following the instructions given in the Assignment section.
GRADING
Attendance and participation (attending the course and Zoom meetings, and participating in the Forum): 10%
Homework and assignments: 20% (4 to 6 assignments)
Mid-term exam: 30%
Final exam: 40%
Failing to complete or submit all assignments reduces your chances of passing the course.
Assignment submission deadlines: pay close attention to the settings/info on each task.
REFERENCE
• Szeliski, R. (2022). Computer Vision: Algorithms and Applications (Texts in Computer Science) (2nd ed.). Springer Nature Switzerland AG.
• Klette, R. (2014). Concise Computer Vision: An Introduction into Theory and Algorithms (1st ed.). Springer-Verlag London.
• Forsyth, D. A., & Ponce, J. (2012). Computer Vision: A Modern Approach (2nd ed.). Pearson Education, Inc.
• Fisher, R. B., Breckon, T. P., Dawson-Howe, K., Fitzgibbon, A., Robertson, C., Trucco, E., & Williams, C. K. I. (2014). Dictionary of Computer Vision and Image Processing (2nd ed.). John Wiley & Sons Ltd.
COURSE ETHOS
 It's your road & yours alone. Others may walk it with you, but no one
can walk it for you.
Jalāl ad-Dīn Muḥammad Rūmī
 “If you want to build a boat, don't gather your men and women to give
them orders, explain every detail, to tell them where to find everything..
If you want to build a boat, give birth in the hearts of your men and
women to the desire for the sea”
Saint Exupéry
HISTORY & MILESTONE
• 1959 - Early experiments began when neurophysiologists showed an array of images to a cat and tried to correlate them with responses in its brain. They found that it reacted first to lines and hard edges, which suggested that image processing starts with simple shapes such as straight edges.
• 1963 - Computers were able to interpret the three-dimensionality of a scene from a picture, and AI was already an academic field.
• 1974 - Optical character recognition (OCR) was introduced to help interpret text printed in any typeface.
• 1980 - Dr. Kunihiko Fukushima, a neuroscientist from Japan, proposed the Neocognitron, a hierarchical multilayered neural network capable of robust visual pattern recognition, including corner, curve, edge, and basic shape detection.
• 1982 - David Marr, a British neuroscientist, published another influential work, “Vision: A Computational Investigation into the Human Representation and Processing of Visual Information”.
https://blog.superannotate.com/introduction-to-computer-vision/
https://hackernoon.com/a-brief-history-of-computer-vision-and-convolutional-neural-networks-8fe8aacc79f3
HISTORY & MILESTONE
 1997 - Jitendra Malik (along with his student Jianbo Shi) released a paper in which he described his
attempts to tackle perceptual grouping.
 1999 - David Lowe’s work “Object Recognition from Local Scale-Invariant Features”
 2000-2001—Studies on object recognition increased, helping in the development of the first real-time
face recognition application.
 2009 - Pedro Felzenszwalb, David McAllester, and Deva Ramanan developed “the Deformable Part
Model”
 2010—ImageNet data were made available containing millions of tagged images across various object
classes that provided the foundation of CNNs and other deep learning models used today.
 2014—COCO has also been developed to offer a dataset used in object detection and support future
research. 8
COMPUTER VISION
• What kind of scene?
• Where are the cars?
• How far is the building?
• Make computers understand images and video.
INFORMATION AND KNOWLEDGE
The first meaning of information and/or knowledge conceives of it as a physical
objective entity which can be passed from one person to another.
Knowledge, expressed as information, is encapsulated within a physical or electronic
artefact so that it can be communicated from one person to another; in this sense,
we might speak of a textbook, a research paper, a website or a documentary film as
containing knowledge which has been articulated by the author and which can be
interpreted by many others without any loss of meaning.
IS VISION EASY OR HARD?
http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf
VISION IS REALLY HARD
• Vision is an amazing feat of natural intelligence
• The visual cortex occupies about 50% of the macaque brain
• More of the human brain is devoted to vision than to anything else
WHY COMPUTER VISION MATTERS
Safety Health Security
Comfort Access
Fun
THE 3 “R” OF COMPUTER VISION
The classic problems of computational
vision:
 Reconstruction
 Recognition
 (Re)organization
Jitendra Malik – pioneer of computer vision (student of
early AI researchers)
LET’S THINK
• Please think about this question for 1 minute:
• Have you ever used computer vision?
• How? Where?
• Put your answers into three categories: Reconstruction? Recognition? (Re)organization?
LIST OF EXISTING APPLICATIONS THAT USE COMPUTER VISION
• Laptop: biometric auto-login (face recognition, 3D), OCR
• Smartphones: QR codes, computational photography (Android Lens Blur, iPhone Portrait Mode), panorama construction (Google Photo Spheres), face detection, expression detection (smile), Snapchat filters (face tracking), Face ID (iPhone), Night Sight (Pixel), iPhone 12 Pro (LiDAR)
• Web: image search, Google Photos (face recognition, object recognition, scene recognition, geolocalization from vision), Facebook (image captioning), Google Maps aerial imaging (image stitching), YouTube (content categorization)
• VR/AR: outside-in tracking (HTC VIVE), inside-out tracking (simultaneous localization and mapping, HoloLens), object occlusion (dense depth estimation)
• Motion: Kinect, full-body skeleton tracking, gesture recognition, virtual try-on
• Medical imaging: CAT / MRI reconstruction, assisted diagnosis, automatic pathology, connectomics, endoscopic surgery
LIST OF EXISTING APPLICATIONS THAT USE COMPUTER VISION
• Industry: vision-based robotics (marker-based), machine-assisted router (jig), automated post, ANPR (number plates), surveillance, drones, shopping
• Transportation: assisted driving (everything), face tracking/iris dilation for drunkenness and drowsiness, automated distribution (all modes)
• Media: visual effects for film and TV (reconstruction), virtual sports replay (reconstruction), semantics-based auto edits (reconstruction, recognition)
• Robotics: navigation and control
• Remote sensing: land use and environmental monitoring
• Psychology, AI: exploring representation and computation in natural vision
OPTICAL CHARACTER RECOGNITION (OCR)
• Technology to convert images of text into text
• If you have a scanner, it probably came with OCR software
• Or you may have used it in translation apps:
• Word Lens, a feature in Google Translate, https://en.wikipedia.org/wiki/Word_Lens
• TextGrabber, https://www.textgrabber.pro/en/
• Microsoft Translator, https://www.microsoft.com/en-us/translator/
• Waygo, http://www.waygoapp.com/
Mail digit recognition, AT&T Labs
http://www.research.att.com/~yann/
License plate readers
http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
FACE DETECTION
 Almost all digital cameras detect faces
 Snapchat face filters
 Why would this be useful?
 Main reason is focus.
 Also enables “smart” cropping.
Photo - http://thetechjournal.com/how-to/tutorial-face-swap-snapchat.xhtml
http://www.pleated-jeans.com/2016/03/02/21-snapchat-face-swaps-that-went-horribly-wrong/
SMILE DETECTION
https://www.sony.com/content/sony/en/en_us/SCA/company-news/press-releases/sony-electronics/2008/sony-adds-smile-shutter-function-to-cybershot-wseries-digital-cameras.html
VISION-BASED BIOMETRICS
• “How the Afghan Girl was Identified by Her Iris Patterns”: read the story
• Wikipedia
http://www.cl.cam.ac.uk/~jgd1000/afghan.html
LOGIN WITHOUT A PASSWORD
OBJECT RECOGNITION (IN MOBILE PHONES)
Point & Find, Nokia (obsolete)
Google Lens, https://lens.google/
OBJECT RECOGNITION (IN SUPERMARKETS)
Amazon Go,
https://www.amazon.com/b?ie=UTF8&node=16008589011
LANEHAWK
https://www.datalogic.com/eng/retail/fixed-retail-scanners/lanehawk-lh5000-pd-830.html
LaneHawk is a loss-prevention solution that turns bottom-of-basket (BOB) losses into profits in real time.
“Like its predecessors, the LH5000 utilizes advanced Visual Pattern Recognition (ViPR) software plus newly added support for 1D bar codes and Digimarc Barcode digital watermarks to eliminate up to 90% of shrink caused by Bottom-Of-Basket (BOB) items... Simply stated, the LaneHawk system is the best to ensure all items on the bottom of shopping carts are paid for...”
3D FROM IMAGES
• Building Rome in a Day:
• Paper: Agarwal et al. 2009, https://grail.cs.washington.edu/rome/rome_paper.pdf
• Website: https://grail.cs.washington.edu/rome/
HUMAN SHAPE CAPTURE
http://gl.ict.usc.edu/Research/presidentialportrait
SHAPE CAPTURE
• Shape capture covers techniques for capturing the shape of physical objects.
The Matrix movies, ESC Entertainment, XYZRGB, NRC
http://cinetropolis.net/tarkin-care-of-business-rogue-ones-digital-peter-cushing/
MOTION CAPTURE
Motion capture is a cutting-edge method of capturing all or part of an actor's performance so that it can be translated into the action of a computer-generated 3D character on screen.
http://www.digitalspy.com/movies/oscars/feature/a584704/why-andy-serkis-deserves-an-oscar-nomination-for-planet-of-the-apes/?zoomable
SPORTS: VIRTUAL PITCH MARKINGS
• Sportvision first-down line
• 1st & Ten is a computer system that augments televised coverage of American football by inserting graphical elements on the field of play as if they were physically present; the inserted element stays fixed within the coordinates of the playing field and obeys the visual rules of foreground objects occluding background objects.
• Nice explanation on www.howstuffworks.com
• http://www.sportvision.com/video.html
COMPUTER VISION IN SPORT
• Thomas, G., Gade, R., Moeslund, T. B., Carr, P., & Hilton, A. (2017). Computer vision for sports: Current applications and research topics. Computer Vision and Image Understanding, 159, 3-18. https://www.sportperformanceanalysis.com/s/Computer-vision-for-sports-current-applications-and-research-topics.pdf
• https://www.sportperformanceanalysis.com/article/computer-vision-in-sport
INTERACTIVE GAMES
• Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
• 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
• Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
AUTO CAR
• Mobileye
• Vision systems currently in high-end BMW, GM, and Volvo models. By 2010: 70% of car manufacturers.
GOOGLE CAR
• Oct 9, 2010. "Google Cars Drive Themselves, in Traffic". The New York Times. John Markoff
• June 24, 2011. "Nevada state law paves the way for driverless cars". Financial Post. Christine Dobby
• Aug 9, 2011. "Human error blamed after Google's driverless car sparks five-vehicle crash". The Star (Toronto)
AUTOCARS
• Uber bought a Carnegie Mellon University (CMU) lab (2015), http://www.cmu.edu/news/stories/archives/2015/february/uber-partnership.html
• http://www.wsj.com/articles/is-uber-a-friend-or-foe-of-carnegie-mellon-in-robotics-1433084582
• http://www.freep.com/story/money/cars/ford/2016/08/21/uber--lyft-gm-pittsburgh-autonomous-vehicles-self-driving-autos-equal-profits/88944036/
• Then sold it (2020), https://techcrunch.com/2020/12/07/uber-sells-self-driving-unit-uber-atg-in-deal-that-will-push-auroras-valuation-to-10b/
COMPUTER VISION IN SPACE
• Vision systems (JPL) used for several tasks
• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al., https://www.ri.cmu.edu/pub_files/pub4/matthies_larry_2007_1/matthies_larry_2007_1.pdf
COMPUTER VISION IN SPACE
• NASA Perseverance lander and rover
https://mars.nasa.gov/mars2020/mission/technology/#Terrain-Relative-Navigation
COMPUTER VISION ON MARS
• The Perseverance rover has 23 cameras on it. https://mars.nasa.gov/mars2020/spacecraft/rover/cameras/
• https://mars.nasa.gov/mars2020/spacecraft/rover/brains
• CPU: 200 MHz PowerPC architecture
• 2 GB storage
• 256 MB RAM
INDUSTRIAL ROBOTS
Vision-guided robots position nut runners on wheels
MOBILE ROBOTS
• Robotic Grasping of Novel Objects using Vision: http://ai.stanford.edu/~asaxena/learninggrasp/IJRR_saxena_etal_roboticgraspingofnovelobjects.pdf
• RoboCup is an international scientific initiative with the goal of advancing the state of the art of intelligent robots. When it was established in 1997, the original mission was to field a team of robots capable of winning against the human soccer World Cup champions by 2050, https://robocup.org/
• Mars Spirit rover: one of two rovers launched in 2003 to explore Mars and search for signs of past life, Spirit far outlasted her planned 90-day mission, lasting over six years. https://www.jpl.nasa.gov/missions/mars-exploration-rover-spirit-mer-spirit
MEDICAL IMAGING
• 3D Reconstruction, Visualization, and Measurement of MRI Images, https://sci-hub.se/https://doi.org/10.1117/12.341059
• Three-Dimensional Medical CT Image Reconstruction, https://sci-hub.se/10.1109/ICMTMA.2009.10
• Image Guided Surgery, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.469.8474&rep=rep1&type=pdf
HUMANOID ROBOTS
• Boston Dynamics (2021): https://blog.bostondynamics.com/flipping-the-script-with-atlas
AUGMENTED REALITY AND VIRTUAL REALITY
MS HoloLens, Oculus,
Magic Leap,
ARCore / ARKit
AUGMENTED REALITY AND VIRTUAL REALITY
Oculus (Quest)
Niantic
AI FOR PHYSICAL INTERACTION
COMPUTER VISION AND NEARBY FIELDS
Computer Graphics: Models to Images
Image Processing: Images to Images
Computer Vision: Images to Models
COMPUTER VISION AND NEARBY FIELDS
Derogatory summary of computer vision:
“Machine learning applied to visual data.”
[Diagram: computer vision turns images, videos, and sensor data from the real world into a model of the visual world (information) in the digital world; computer graphics turns that model back into images, videos, and interaction.]
SUPERHUMAN STATE OF THE ART?
Deep learning is an enormous disruption to the field. Since 2012, rapid
expansion and commercialization.
Why?
“With enough data, computer vision matches or even outperforms human
vision at most recognition tasks.”
What.
VISION AND SOCIETY
 Lots of data = lots of potential bias in the data.
Needs understanding of possible failures.
+
Responsible approach.
+
Techniques to overcome bias.
VISION AND SOCIETY
• “Vision, in my view, is the cause of the greatest benefit to us, inasmuch as none of the accounts now given concerning the Universe would ever have been given if men had not seen the stars or the sun or the heavens.”
- Plato (Timaeus, 360 BC)
“Worldview” vs. “World-sense”
VISION AND SOCIETY
Societal Categorizations
Prioritization of Vision • Visual Categorization • Visual Biases
• “The reason that the body has so much presence in the West is that the world is primarily perceived by sight. The differentiation of human bodies in terms of sex, skin color, and cranium size is a testament to the powers attributed to "seeing." It is believed that just by looking at it [the body] one can tell a person's beliefs and social position or lack thereof.”
- Oyeronke Oyewumi (The Invention of Women, 1997)
VISION AND SOCIETY
Derogatory summary of computer vision:
“Machine learning applied to visual data.”
[Diagram: computer vision maps images, videos, and sensor data from the real world to models of the visual world (information) in the digital world, and computer graphics maps them back to images, videos, and interaction. Annotations: the models are culturally defined and technologically constrained, shaped by vision priority, visual categories, and digital constraints.]
VISION AND SOCIETY
https://www.bbc.com/news/technology-51148501
As of 2020/01/22, Google have come out in favour of the ban; Microsoft against.
SCOPE
IMAGE PROCESSING: EXAMPLES
• Smoothing is used to reduce noise or to produce a less pixelated image. Most smoothing methods are based on low-pass filters, but an image can also be smoothed by taking the average or median value of a group of pixels (a kernel) that moves across the image.
• Image smoothing is a preprocessing technique intended to remove possible image perturbations (noise) while losing as little image information as possible. Analogously, sharpening is a preprocessing technique that plays an important role in feature extraction.
• Contrast stretching (often called normalization) is a simple image enhancement technique that attempts to improve the contrast of an image by ‘stretching’ the range of intensity values it contains to span a desired range, typically the full range of pixel values that the image type allows.
• Noise removal is the process of removing or reducing the noise in an image. Noise removal algorithms reduce the visibility of noise by smoothing the entire image while leaving areas near contrast boundaries untouched. These methods can, however, obscure fine, low-contrast details.
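As a rough illustration of the smoothing and contrast-stretching ideas above, here is a minimal pure-Python sketch, not a reference implementation. The function names `smooth` and `stretch_contrast`, the 3 x 3 default kernel, the way border pixels are handled, and the tiny test image are all assumptions made for this example; real work would normally rely on an image-processing library.

```python
from statistics import median


def smooth(image, size=3, reducer="mean"):
    """Slide a size x size kernel over a 2D list of grey values.

    Border pixels use only the neighbours that fall inside the image
    (an assumption of this sketch; other border rules are possible).
    """
    height, width = len(image), len(image[0])
    radius = size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            if reducer == "median":
                row.append(median(window))
            else:
                row.append(sum(window) / len(window))
        out.append(row)
    return out


def stretch_contrast(image, low=0, high=255):
    """Linearly map the image's own min..max onto low..high."""
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat image: nothing to stretch
        return [[low for _ in row] for row in image]
    scale = (high - low) / (hi - lo)
    return [[low + (v - lo) * scale for v in row] for row in image]


# A flat grey patch with one bright noise pixel in the centre.
noisy = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
print(smooth(noisy, reducer="median")[1][1])  # → 10 (the outlier is removed)
print(smooth(noisy, reducer="mean")[1][1])    # the outlier is only spread out
print(stretch_contrast([[50, 100], [150, 200]]))
```

The median filter discards the single outlier entirely, while the mean filter only dilutes it, which is one reason median filtering is often preferred for isolated noise pixels.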
COMPUTER VISION METHODS: EXAMPLES
• Shape recovery from images is a fundamental problem in computer vision. Common methods typically fall into one of two classes: geometric or photometric approaches. Geometric approaches take images of a scene from multiple viewpoints, find point correspondences across the images, and establish their geometric positions to recover the shape. Photometric approaches, on the other hand, recover per-pixel surface orientation using shading cues. For example, Shape from Shading (SFS) recovers per-pixel surface normal vectors from a single image taken under a single distant light source.
• Cell segmentation is the task of splitting a microscopic image domain into segments that represent individual instances of cells. It is a fundamental step in many biomedical studies and is regarded as a cornerstone of image-based cellular research.
COMPUTER VISION METHODS: EXAMPLE
• Shape-from-shading (SFS) is an important method for reconstructing the three-dimensional (3D) shape of a surface in photometry and computer vision.
• Lambertian surface reflectance and orthographic camera projection are the two fundamental assumptions of SFS. They often lead to poor reconstructions, because the imaging model they imply is inaccurate for real scenes.
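For reference, the Lambertian and orthographic assumptions are commonly written as the image irradiance equation below. This is the usual textbook formulation rather than something taken from the slide; the symbols (albedo $\rho$, unit surface normal $\mathbf{n}$, unit light direction $\mathbf{l}$, surface height $z(x, y)$ and its gradients $p$, $q$) are notation assumed here.

```latex
I(x, y) = \rho \, \max\bigl(0,\ \mathbf{n}(x, y) \cdot \mathbf{l}\bigr),
\qquad
\mathbf{n}(x, y) = \frac{(-p,\ -q,\ 1)^{\top}}{\sqrt{1 + p^{2} + q^{2}}},
\qquad
p = \frac{\partial z}{\partial x},\quad q = \frac{\partial z}{\partial y}.
```

SFS tries to recover $p$ and $q$ (and from them $z$) at every pixel from the single measured intensity $I(x, y)$, which is why additional constraints such as surface smoothness are usually needed.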
3D Surface Shape from Shading
COMPUTER VISION OUTPUT: EXAMPLE
• Stereo video and stereo images carry more information per frame, which makes it possible to create a 3D representation (depth) of the image or video signal.
• “Stereo image” may also refer to a stereogram, an image intended to give a three-dimensional visual impression (perception of depth).
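For a rectified stereo pair, depth is commonly recovered from the horizontal shift (disparity) of a point between the left and right images, using Z = f * B / d. The sketch below only illustrates that relation; the function name, the parameter names, and the numbers are assumptions for the example, not values from the slides.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a point seen in a rectified stereo pair.

    disparity_px    -- horizontal pixel shift of the point between the two views
    focal_length_px -- camera focal length expressed in pixels
    baseline_m      -- distance between the two camera centres in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 800 px focal length, 0.25 m baseline.
print(depth_from_disparity(disparity_px=40, focal_length_px=800, baseline_m=0.25))  # → 5.0
print(depth_from_disparity(disparity_px=20, focal_length_px=800, baseline_m=0.25))  # → 10.0
```

Halving the disparity doubles the depth, so distant points produce small disparities and their depth estimates are more sensitive to matching errors.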
3D Surface Shape from Stereo Images
REMEMBER: THE THREE “R”
• Jitendra Malik, UC Berkeley: Three ‘R’s of Computer Vision
• “[Further progress in] the classic problems of computational vision:
• reconstruction
• recognition
• (re)organization
• [requires us to study the interaction among these processes].”
Note: organization means building taxonomies of the visual world so that we can move towards reasoning, not just recognition.
HUMAN VISION VS COMPUTER VISION
• Retina (human) vs. CCD array (computer)
• Organization in layers (human) vs. compaction of information (computer)
• Color vision (human) vs. RGB device (computer)
• Vision of depth (human) vs. geometric stereoscopy (computer)
COMPUTER VISION SYSTEM
A Computer Vision System (CVS) is expected to reach a capability level as high as that of the Human Visual System (HVS):
• Object detection: is an object present in the scene? If so, where are its boundaries?
• Recognition: putting a label on an object
• Description: assigning properties to objects
• 3D inference: interpreting a 3D scene from 2D views
• Interpreting motion
TOOLS
Image processing – noise removal, edge detection,
morphology
Feature extraction and clustering
Measure
Modelling, fitting the model, and optimization
Statistics and classification
THE THREE STAGES OF COMPUTER VISION
• Low-level processing (image to image): noise removal, image enhancement
• Intermediate processing (image to symbolic): a set of lines/vectors that represent the boundaries of an object in the image
• High-level processing (symbolic to symbolic): the symbolic representation of object boundaries produces the object’s description
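The three stages can be sketched on a toy image as follows. This is only an illustrative pipeline under simplifying assumptions: thresholding stands in for low-level processing, a 4-neighbour boundary trace for intermediate processing, and a bounding-box summary for the high-level description. The function names and the 5 x 5 test image are invented for this example.

```python
def threshold(image, level):
    """Low level (image to image): binarise a grey image."""
    return [[1 if v >= level else 0 for v in row] for row in image]


def boundary_pixels(mask):
    """Intermediate level (image to symbolic): list the (x, y) positions of
    foreground pixels that touch the background or the image border."""
    h, w = len(mask), len(mask[0])
    points = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            neighbours = [(y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)]
            if any(not (0 <= j < h and 0 <= i < w) or not mask[j][i]
                   for j, i in neighbours):
                points.append((x, y))
    return points


def describe(points):
    """High level (symbolic to symbolic): summarise a boundary as a bounding box."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return {
        "x_min": min(xs), "x_max": max(xs),
        "y_min": min(ys), "y_max": max(ys),
        "width": max(xs) - min(xs) + 1,
        "height": max(ys) - min(ys) + 1,
    }


# A bright 3 x 3 square on a dark background.
grey = [
    [12, 10, 11, 9, 10],
    [10, 200, 210, 205, 11],
    [9, 198, 220, 207, 10],
    [11, 202, 215, 199, 12],
    [10, 9, 12, 10, 11],
]
mask = threshold(grey, 128)
outline = boundary_pixels(mask)
print(len(outline))       # → 8 (every square pixel except the centre)
print(describe(outline))
```

Each stage consumes the output of the previous one: pixels become a binary image, the binary image becomes a list of boundary points, and the points become a compact object description.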
LOW LEVEL
blurring
sharpening
INTERMEDIATE LEVEL
[Figure: K-means clustering groups the original color image into regions of homogeneous color, followed by connected component analysis; the result is stored in a data structure.]
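The clustering step named in the figure can be sketched as below. This is a minimal pure-Python version of Lloyd's k-means on RGB tuples, written for illustration only: the function names, the explicit initial centres, and the four sample pixels are assumptions, and the connected component analysis mentioned in the figure is not included.

```python
def nearest(pixel, centers):
    """Index of the centre closest to pixel (squared Euclidean distance in RGB)."""
    return min(
        range(len(centers)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(pixel, centers[k])),
    )


def kmeans_colors(pixels, centers, iterations=10):
    """Cluster RGB tuples around the given initial centres.

    Returns (labels, centers): the cluster index of every pixel and the
    final cluster centres.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centre.
        labels = [nearest(p, centers) for p in pixels]
        # Update step: each centre moves to the mean colour of its members.
        updated = []
        for k, old in enumerate(centers):
            members = [p for p, label in zip(pixels, labels) if label == k]
            if members:
                updated.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
            else:
                updated.append(old)  # keep the centre of an empty cluster
        if updated == centers:  # converged
            break
        centers = updated
    return [nearest(p, centers) for p in pixels], centers


# Two reddish and two bluish pixels, seeded with one pixel of each kind.
pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 245), (20, 5, 250)]
labels, centers = kmeans_colors(pixels, centers=[pixels[0], pixels[2]])
print(labels)   # → [0, 0, 1, 1]
print(centers)
```

Pixels sharing a label form a region of homogeneous color; a connected component pass would then split each label into spatially contiguous regions.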
LOW- TO HIGH-LEVEL
[Figure: building recognition, going from an edge image (low level) through consistent line clusters (mid level) to the recognized building (high level).]
COMPUTER VISION APPROACH
CONSIDERATIONS IN COMPUTER VISION DESIGN
• What information do you want to get, and how is that information manifested in the images?
• It is necessary to determine the relationship between physical entities and their intrinsic characteristics. For example, a house can be distinguished from a tree because straight lines are one of its intrinsic properties, and the sea can be distinguished from other objects because it has a uniform appearance.
http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf
An intrinsic property is a property that an object or a thing has of itself, independently of other things, including its context.
https://yandex.com/company/technologies/vision/
CONSIDERATIONS IN COMPUTER VISION DESIGN
• What knowledge is needed to recover the information?
• Models are needed to determine the relationship between pixel intensity and image properties, namely:
• The scene model: types of features, textures, smoothness
• The illumination model: the position and characteristics of the light source and the reflectance properties of the object's surface
• The sensor model: the position and optical performance of the camera used, and the noise and distortion in the digitization process
CONSIDERATIONS IN COMPUTER VISION DESIGN
• Processing speed and knowledge representation
• It is necessary to anticipate real-time processing requirements, for example in a go/no-go quality inspection process
• Encoding knowledge in a form that is appropriate and easy to understand is another essential consideration in the design of a vision system.
In general, go/no-go testing refers to a pass/fail test (or check) principle using two boundary conditions or a binary classification. The test is passed only when the Go condition is met and the No-Go condition fails.
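The pass rule can be written as a one-line check. The sketch below is a hypothetical inspection example: the hole-diameter scenario, the pin sizes, and the function names are assumptions chosen for illustration, not part of the lecture material.

```python
def go_no_go(go_fits, no_go_fits):
    """Pass only when the Go check succeeds and the No-Go check fails."""
    return go_fits and not no_go_fits


def inspect_hole(diameter_mm, go_pin_mm=9.98, no_go_pin_mm=10.02):
    """Hypothetical inspection of a drilled hole measured from an image.

    A pin gauge "fits" when the hole is at least as wide as the pin.
    """
    return go_no_go(diameter_mm >= go_pin_mm, diameter_mm >= no_go_pin_mm)


for measured in (9.95, 10.00, 10.05):
    print(measured, "PASS" if inspect_hole(measured) else "FAIL")
```

Only the 10.00 mm hole passes: 9.95 mm is too small for the Go pin, and 10.05 mm is large enough to admit the No-Go pin.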
Encoded knowledge is expressed in terms of an accepted ‘language’ which is understood (or
must be learned) by the recipients – professional ‘jargon’, accepted disciplinary concepts,
technical languages such as statistics or the argot of street language – and which is used to
decode the meaning
THANK YOU
https://apod.nasa.gov/apod/astropix.html?
Astronomy Picture of the Day (6 Feb 2022): Blue Marble Earth
Image Credit: NASA, Apollo 17 Crew

Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon AUnboundStockton
 

Recently uploaded (20)

Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...OS-operating systems- ch04 (Threads) ...
OS-operating systems- ch04 (Threads) ...
 
Crayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon ACrayon Activity Handout For the Crayon A
Crayon Activity Handout For the Crayon A
 

Materi_01_VK_2223_3.pdf

  • 1. INTRODUCTION TO COMPUTER VISION (Pengenalan Visi Komputer). Computer Vision (Visi Komputer), 2022/2023 - 3
  • 2. SCHEDULE. Lectures via Zoom meeting or in the classroom: every week on Saturday, starting July 01, 2023. Note on online attendance: every week.
  • 3. INSTRUCTOR. Dr. Ichsan Ibrahim, S.Si., M.Si. ichsanibrahim@stmik-im.ac.id. Cell phone: 08158210073. Telegram: @ichsanibrahim
  • 4. REPORT FORMAT. An assignment report must have: a title page with the assignment title, the student's name and student ID number (NIM), the instructor's name, the campus logo, and the date of report preparation (5 points); a summary of the major points, conclusions, and recommendations (6 points); contents (2 points); the body of the report, made up of an introduction that comprises the problem statement or aim of the assignment and a short overview of the basic theory related to the task questions (10 points), a main part that clearly reflects the specific achievements of the assignment, including the process and the results, or, for some assignments, the review of a paper (50 points), and conclusions and recommendations (12 points); references (4 points); an appendix (if necessary). Reports must follow the rules of scientific writing and have the correct format (6 points). All assignments are submitted (uploaded) in digital format to the Computer Vision course on the Kuliah Online website, following the instructions given in the Assignment section.
  • 5. GRADING. Attendance and participation (attending the course and the Zoom meetings, and participating in the Forum): 10%. Homework and assignments: 20% (4 to 6 assignments). Mid-term exam: 30%. Final exam: 40%. Failing to complete or submit all assignments will reduce your chances of passing the course. Assignment submission deadline: pay close attention to the settings/info on each task.
  • 6. REFERENCES. Szeliski, R. (2022). Computer Vision: Algorithms and Applications (Texts in Computer Science) (2nd ed.), Springer Nature Switzerland AG. Klette, R. (2014). Concise Computer Vision: An Introduction into Theory and Algorithms (1st ed.), Springer-Verlag London. Forsyth, D. A. and Ponce, J. (2012). Computer Vision: A Modern Approach (2nd ed.), Pearson Education, Inc. Fisher, R. B., Breckon, T. P., Dawson-Howe, K., Fitzgibbon, A., Robertson, C., Trucco, E., Williams, C. K. I. (2014). Dictionary of Computer Vision and Image Processing (2nd ed.), John Wiley & Sons Ltd.
  • 7. COURSE ETHOS. “It's your road & yours alone. Others may walk it with you, but no one can walk it for you.” (Jalāl ad-Dīn Muḥammad Rūmī) “If you want to build a boat, don't gather your men and women to give them orders, explain every detail, to tell them where to find everything. If you want to build a boat, give birth in the hearts of your men and women to the desire for the sea.” (Saint Exupéry)
  • 8. HISTORY & MILESTONES. 1959: Most experiments started here, when neurophysiologists showed an array of images to a cat in an attempt to correlate responses in its brain. They found that it reacted first to lines or hard edges, which made it clear that image processing starts with simple shapes, such as straight edges. 1963: Computers were able to interpret the three-dimensionality of a scene from a picture, and AI was already an academic field. 1974: Optical character recognition (OCR) was introduced to help interpret texts printed in any typeface. 1980: Dr. Kunihiko Fukushima, a neuroscientist from Japan, proposed the Neocognitron, a hierarchical multilayered neural network capable of robust visual pattern recognition, including corner, curve, edge, and basic shape detection. 1982: David Marr, a British neuroscientist, published another influential work, “Vision: A computational investigation into the human representation and processing of visual information”. Sources: https://blog.superannotate.com/introduction-to-computer-vision/ and https://hackernoon.com/a-brief-history-of-computer-vision-and-convolutional-neural-networks-8fe8aacc79f3
  • 9. HISTORY & MILESTONES (continued). 1997: Jitendra Malik (along with his student Jianbo Shi) released a paper describing his attempts to tackle perceptual grouping. 1999: David Lowe published “Object Recognition from Local Scale-Invariant Features”. 2000-2001: Studies on object recognition increased, helping in the development of the first real-time face recognition application. 2009: Pedro Felzenszwalb, David McAllester, and Deva Ramanan developed the Deformable Part Model. 2010: ImageNet data were made available, containing millions of tagged images across various object classes, which provided the foundation for the CNNs and other deep learning models used today. 2014: COCO was developed to offer a dataset for object detection and to support future research.
  • 10. COMPUTER VISION. What kind of scene? Where are the cars? How far is the building? Make computers understand images and video.
  • 11. INFORMATION AND KNOWLEDGE. The first meaning of information and/or knowledge conceives of it as a physical objective entity which can be passed from one person to another. Knowledge, expressed as information, is encapsulated within a physical or electronic artefact so that it can be communicated from one person to another; in this sense, we might speak of a textbook, a research paper, a website or a documentary film as containing knowledge which has been articulated by the author and which can be interpreted by many others without any loss of meaning.
  • 12. IS VISION EASY OR HARD? http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf
  • 13. VISION IS REALLY HARD. Vision is an amazing feat of natural intelligence. The visual cortex occupies about 50% of the macaque brain. More of the human brain is devoted to vision than to anything else.
  • 14. WHY COMPUTER VISION MATTERS. Safety, health, security, comfort, access, fun.
  • 15. THE 3 “R”s OF COMPUTER VISION. The classic problems of computational vision: reconstruction, recognition, (re)organization. Jitendra Malik, pioneer of computer vision (student of early AI researchers).
  • 16. LET'S THINK. Please think about this question for 1 minute: Have you ever used computer vision? How? Where? Put your example into one of three categories: reconstruction, recognition, or (re)organization.
  • 17. LIST OF EXISTING APPLICATIONS THAT USE COMPUTER VISION. Laptops: biometric auto-login (face recognition, 3D), OCR. Smartphones: QR codes, computational photography (Android Lens Blur, iPhone Portrait Mode), panorama construction (Google Photo Spheres), face detection, expression detection (smile), Snapchat filters (face tracking), FaceID (iPhone), Night Sight (Pixel), iPhone 12 Pro (LiDAR). Web: image search, Google Photos (face recognition, object recognition, scene recognition, geolocalization from vision), Facebook (image captioning), Google Maps aerial imaging (image stitching), YouTube (content categorization). VR/AR: outside-in tracking (HTC Vive), inside-out tracking (simultaneous localization and mapping, HoloLens), object occlusion (dense depth estimation). Motion: Kinect, full-body skeleton tracking, gesture recognition, virtual try-on. Medical imaging: CAT / MRI reconstruction, assisted diagnosis, automatic pathology, connectomics, endoscopic surgery.
  • 18. LIST OF EXISTING APPLICATIONS THAT USE COMPUTER VISION (continued). Industry: vision-based robotics (marker-based), machine-assisted router (jig), automated post, ANPR (number plates), surveillance, drones, shopping. Transportation: assisted driving (everything), face tracking/iris dilation for drunkenness and drowsiness, automated distribution (all modes). Media: visual effects for film and TV (reconstruction), virtual sports replay (reconstruction), semantics-based auto edits (reconstruction, recognition). Robotics: navigation and control. Remote sensing: land use and environmental monitoring. Psychology, AI: exploring representation and computation in natural vision.
  • 19. OPTICAL CHARACTER RECOGNITION (OCR). Technology to convert images of text into text. If you have a scanner, it probably came with OCR software. You may also meet it in translation apps: Word Lens, a feature in Google Translate, https://en.wikipedia.org/wiki/Word_Lens; TextGrabber, https://www.textgrabber.pro/en/; Microsoft Translator, https://www.microsoft.com/en-us/translator/; Waygo, http://www.waygoapp.com/. Mail digit recognition, AT&T Labs: http://www.research.att.com/~yann/. License plate readers: http://en.wikipedia.org/wiki/Automatic_number_plate_recognition
  • 20. FACE DETECTION. Almost all digital cameras detect faces. Snapchat face filters. Why would this be useful? The main reason is focus. It also enables “smart” cropping. Photos: http://thetechjournal.com/how-to/tutorial-face-swap-snapchat.xhtml and http://www.pleated-jeans.com/2016/03/02/21-snapchat-face-swaps-that-went-horribly-wrong/
  • 22. VISION-BASED BIOMETRICS. “How the Afghan Girl was Identified by Her Iris Patterns”. Read the story: http://www.cl.cam.ac.uk/~jgd1000/afghan.html
  • 24. OBJECT RECOGNITION (IN MOBILE PHONES). Point & Find, Nokia (obsolete); Google Lens, https://lens.google/
  • 25. OBJECT RECOGNITION (IN SUPERMARKETS). Amazon Go, https://www.amazon.com/b?ie=UTF8&node=16008589011
  • 26. LANEHAWK. https://www.datalogic.com/eng/retail/fixed-retail-scanners/lanehawk-lh5000-pd-830.html LaneHawk is a loss-prevention solution that turns bottom-of-basket (BOB) losses into profits in real time. “Like its predecessors, the LH5000 utilizes advanced Visual Pattern Recognition (ViPR) software plus newly added support for 1D bar codes and Digimarc Barcode digital watermarks to eliminate up to 90% of shrink caused by Bottom-Of-Basket (BOB) items … Simply stated, the LaneHawk system is the best to ensure all items on the bottom of shopping carts are paid for. …”
  • 27. 3D FROM IMAGES. Building Rome in a Day. Paper: Agarwal et al. 2009, https://grail.cs.washington.edu/rome/rome_paper.pdf. Website: https://grail.cs.washington.edu/rome/
  • 29. SHAPE CAPTURE. Shape capture refers to techniques for capturing the shape of physical objects. Examples: The Matrix movies, ESC Entertainment, XYZRGB, NRC. http://cinetropolis.net/tarkin-care-of-business-rogue-ones-digital-peter-cushing/
  • 30. MOTION CAPTURE. Motion capture is a method of capturing all or part of an actor's performance so that it can be translated into the action of a computer-generated 3D character on screen. http://www.digitalspy.com/movies/oscars/feature/a584704/why-andy-serkis-deserves-an-oscar-nomination-for-planet-of-the-apes/?zoomable
  • 31. SPORTS: VIRTUAL PITCH MARKINGS. Sportvision first-down line: 1st & Ten is a computer system that augments televised coverage of American football by inserting graphical elements on the field of play as if they were physically present; the inserted element stays fixed within the coordinates of the playing field and obeys the visual rules of foreground objects occluding background objects. A nice explanation is available on www.howstuffworks.com. See also http://www.sportvision.com/video.html
  • 32. COMPUTER VISION IN SPORT. Thomas, G., Gade, R., Moeslund, T. B., Carr, P., & Hilton, A. (2017). Computer vision for sports: Current applications and research topics. Computer Vision and Image Understanding, 159, 3-18. https://www.sportperformanceanalysis.com/s/Computer-vision-for-sports-current-applications-and-research-topics.pdf See also https://www.sportperformanceanalysis.com/article/computer-vision-in-sport
  • 33. INTERACTIVE GAMES. Object recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o. Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg. 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A. Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
  • 34. AUTO CAR. Mobileye: vision systems currently in high-end BMW, GM, and Volvo models. By 2010: 70% of car manufacturers.
  • 35. GOOGLE CAR. Oct 9, 2010: "Google Cars Drive Themselves, in Traffic", The New York Times, John Markoff. June 24, 2011: "Nevada state law paves the way for driverless cars", Financial Post, Christine Dobby. Aug 9, 2011: "Human error blamed after Google's driverless car sparks five-vehicle crash", The Star (Toronto).
  • 36. AUTO CARS. Uber bought a Carnegie Mellon University (CMU) lab (2015): http://www.cmu.edu/news/stories/archives/2015/february/uber-partnership.html, http://www.wsj.com/articles/is-uber-a-friend-or-foe-of-carnegie-mellon-in-robotics-1433084582, http://www.freep.com/story/money/cars/ford/2016/08/21/uber--lyft-gm-pittsburgh-autonomous-vehicles-self-driving-autos-equal-profits/88944036/. Uber then sold it (2020): https://techcrunch.com/2020/12/07/uber-sells-self-driving-unit-uber-atg-in-deal-that-will-push-auroras-valuation-to-10b/
  • 37. COMPUTER VISION IN SPACE. Vision systems (JPL) are used for several tasks: panorama stitching, 3D terrain modeling, obstacle detection, position tracking. For more, read “Computer Vision on Mars” by Matthies et al., https://www.ri.cmu.edu/pub_files/pub4/matthies_larry_2007_1/matthies_larry_2007_1.pdf
  • 38. COMPUTER VISION IN SPACE. NASA Perseverance lander and rover. https://mars.nasa.gov/mars2020/mission/technology/#Terrain-Relative-Navigation
  • 39. COMPUTER VISION ON MARS. The rover has 23 cameras on it: https://mars.nasa.gov/mars2020/spacecraft/rover/cameras/. Its computer (https://mars.nasa.gov/mars2020/spacecraft/rover/brains) has a 200 MHz CPU of the PowerPC architecture, 2 GB of storage, and 256 MB of RAM.
  • 40. INDUSTRIAL ROBOTS. Vision-guided robots position nut runners on wheels.
  • 41. MOBILE ROBOTS. Robotic Grasping of Novel Objects using Vision: http://ai.stanford.edu/~asaxena/learninggrasp/IJRR_saxena_etal_roboticgraspingofnovelobjects.pdf. RoboCup is an international scientific initiative with the goal of advancing the state of the art of intelligent robots. When it was established in 1997, the original mission was to field a team of robots capable of winning against the human soccer World Cup champions by 2050, https://robocup.org/. Mars Spirit Rover: one of two rovers launched in 2003 to explore Mars and search for signs of past life, Spirit far outlasted her planned 90-day mission, lasting over six years. https://www.jpl.nasa.gov/missions/mars-exploration-rover-spirit-mer-spirit
  • 42. MEDICAL IMAGING. 3D Reconstruction, Visualization, and Measurement of MRI Images, https://sci-hub.se/https://doi.org/10.1117/12.341059. Three-Dimensional Medical CT Image Reconstruction, https://sci-hub.se/10.1109/ICMTMA.2009.10. Image Guided Surgery, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.469.8474&rep=rep1&type=pdf
  • 44. AUGMENTED REALITY AND VIRTUAL REALITY. MS HoloLens, Oculus, Magic Leap, ARCore / ARKit
  • 45. AUGMENTED REALITY AND VIRTUAL REALITY. Oculus (Quest), Niantic
  • 46. AI FOR PHYSICAL INTERACTION
  • 47. COMPUTER VISION AND NEARBY FIELDS. Computer graphics: models to images. Image processing: images to images. Computer vision: images to models.
  • 48. COMPUTER VISION AND NEARBY FIELDS. Derogatory summary of computer vision: “Machine learning applied to visual data.” [Diagram labels: real world (images, videos, sensor data…; images, videos, interaction); digital world (model of the visual world, information); Computer Vision; Computer Graphics.]
  • 49. SUPERHUMAN STATE OF THE ART? Deep learning is an enormous disruption to the field. Since 2012: rapid expansion and commercialization. Why? “With enough data, computer vision matches or even outperforms human vision at most recognition tasks.” What.
  • 50. VISION AND SOCIETY. Lots of data means lots of potential bias in the data. This calls for an understanding of possible failures, a responsible approach, and techniques to overcome bias.
  • 51. VISION AND SOCIETY. “Vision, in my view, is the cause of the greatest benefit to us, inasmuch as none of the accounts now given concerning the Universe would ever have been given if men had not seen the stars or the sun or the heavens.” Plato (Timaeus, 360 BC). “Worldview” vs. “World-sense”.
  • 52. VISION AND SOCIETY. Societal categorizations. Prioritization of vision: visual categorization, visual biases. “The reason that the body has so much presence in the West is that the world is primarily perceived by sight. The differentiation of human bodies in terms of sex, skin color, and cranium size is a testament to the powers attributed to "seeing." It is believed that just by looking at it [the body] one can tell a person's beliefs and social position or lack thereof.” Oyeronke Oyewumi (The Invention of Women, 1997)
  • 53. VISION AND SOCIETY. Derogatory summary of computer vision: “Machine learning applied to visual data.” [Diagram labels: real world (images, videos, sensor data…; images, videos, interaction); digital world (models of the visual world, information); Computer Vision; Computer Graphics; culturally defined & technologically constrained; visual categories; vision priority; digital constraints.]
  • 54. VISION AND SOCIETY. https://www.bbc.com/news/technology-51148501 As of 2020/01/22, Google have come out in favour of the ban; Microsoft against.
  • 56. IMAGE PROCESSING: EXAMPLES. Smoothing is used to reduce noise or to produce a less pixelated image. Most smoothing methods are based on low-pass filters, but you can also smooth an image using the average or median value of a group of pixels (a kernel) that moves through the image. Image smoothing is a preprocessing technique intended to remove possible image perturbations (noise) without losing image information. Analogously, sharpening is a preprocessing technique that plays an important role in feature extraction. Contrast stretching (often called normalization) is a simple image enhancement technique that attempts to improve the contrast in an image by “stretching” the range of intensity values it contains to span a desired range of values, such as the full range of pixel values that the image type concerned allows. Noise removal is the process of removing or reducing the noise in an image. Noise removal algorithms reduce the visibility of noise by smoothing the entire image except for areas near contrast boundaries, but these methods can obscure fine, low-contrast details.
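The two operations described on slide 56 can be sketched in a few lines. This is only an illustrative sketch, assuming a grayscale image stored as a list of rows of integers; the function names `median_smooth` and `contrast_stretch`, the square window, and the border handling are assumptions made for this example, not something the slides specify.

```python
from statistics import median


def median_smooth(image, radius=1):
    """Replace each pixel with the median of its (2*radius+1)^2 neighbourhood.

    Border pixels use only the neighbours that fall inside the image
    (an assumption of this sketch; other border policies exist).
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(median(window))
        result.append(row)
    return result


def contrast_stretch(image, low=0, high=255):
    """Linearly map the image's own [min, max] range onto [low, high]."""
    flat = [pixel for row in image for pixel in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch.
        return [[low for _ in row] for row in image]
    scale = (high - low) / (hi - lo)
    return [[round(low + (pixel - lo) * scale) for pixel in row] for row in image]


noisy = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]  # one "salt" pixel
smoothed = median_smooth(noisy)
stretched = contrast_stretch([[50, 100], [150, 200]])
```

The median is used here rather than the average because a single outlier pixel then disappears entirely instead of being smeared over its neighbours.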
  • 57. COMPUTER VISION METHODS: EXAMPLES. Shape recovery from images is a fundamental problem in computer vision. Common methods typically fall into one of two classes: geometric or photometric approaches. Geometric approaches take images of a scene from multiple viewpoints, find point correspondences across the images, and establish their geometric positions to recover the shapes. Photometric approaches, on the other hand, recover per-pixel surface orientation using shading cues. For example, Shape from Shading (SFS) recovers per-pixel surface normal vectors from a single image taken under a single distant light source. Cell segmentation is the task of splitting a microscopic image domain into segments that represent individual instances of cells. It is a fundamental step in many biomedical studies and is regarded as a cornerstone of image-based cellular research.
  • 58. COMPUTER VISION METHODS: EXAMPLES. Shape-from-shading (SFS) is an important method for reconstructing the three-dimensional (3D) shape of a surface in photometry and computer vision. Lambertian surface reflectance and orthographic camera projection are the two fundamental assumptions of SFS; they often lead to poor reconstructions because the imaging model they imply is inaccurate. Figure: 3D surface shape from shading.
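The Lambertian assumption mentioned on slide 58 has a compact forward model: the observed brightness is proportional to the cosine of the angle between the surface normal and the light direction. The sketch below shows only that forward model, which SFS tries to invert; it is not an SFS solver. The function names and the `albedo` parameter are assumptions introduced for illustration.

```python
import math


def normalize(vector):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)


def lambertian_intensity(normal, light_dir, albedo=1.0):
    """Brightness of a Lambertian surface patch: albedo * max(0, n . l).

    `normal` is the surface normal and `light_dir` points from the surface
    towards a single distant light source. Patches facing away from the
    light are clamped to zero.
    """
    n = normalize(normal)
    l = normalize(light_dir)
    cos_angle = sum(a * b for a, b in zip(n, l))
    return albedo * max(0.0, cos_angle)


facing = lambertian_intensity((0, 0, 1), (0, 0, 1))    # light along the normal
oblique = lambertian_intensity((0, 0, 1), (1, 0, 1))   # light at 45 degrees
hidden = lambertian_intensity((0, 0, 1), (0, 0, -1))   # light behind the patch
```

Because many different normals give the same brightness under one light, a single intensity value does not determine the normal by itself, which is one reason SFS needs additional assumptions.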
  • 59. COMPUTER VISION OUTPUT: EXAMPLE. Stereo video or images carry more information per frame or image, which allows a 3D presentation (depth) of the image or video signal to be created. “Stereo image” may also refer to a stereogram, an image intended to give a three-dimensional visual impression (perception of depth). Figure: 3D surface shape from stereo images.
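How a stereo pair yields depth can be illustrated with the standard relation for a rectified pair of identical cameras, Z = f * B / d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity (horizontal shift) of a point between the two images. The slides do not state this formula; it is added here as a minimal sketch, and the function name and the example numbers are hypothetical.

```python
import math


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (in metres) of a point seen by a rectified stereo pair.

    Assumes two identical, parallel cameras: depth = focal * baseline / disparity.
    A disparity of zero corresponds to a point at infinity.
    """
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative for a rectified pair")
    if disparity_px == 0:
        return math.inf
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 700 px focal length, cameras 10 cm apart.
near = depth_from_disparity(700, 0.1, 35)  # large shift between views
far = depth_from_disparity(700, 0.1, 7)    # small shift between views
```

Nearby points shift more between the two views than distant ones, so depth is inversely proportional to disparity.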
  • 60. REMEMBER: THE THREE “R”s. Jitendra Malik, UC Berkeley, on the three “R”s of computer vision: “[Further progress in] the classic problems of computational vision: reconstruction, recognition, (re)organization [requires us to study the interaction among these processes].” Note: organization means building taxonomies of the visual world so that we can move towards reasoning, not just recognition.
  • 61. HUMAN VISION VS COMPUTER VISION. Computer vision: CCD array, compaction of information, RGB device, geometric stereoscopy. Human vision: retina, organization in layers, color vision, vision of depth.
  • 62. COMPUTER VISION SYSTEM. A Computer Vision System (CVS) is expected to reach a level of capability as high as that of the Human Visual System (HVS). Object detection: is an object present in the scene? If so, where are its boundaries? Recognition: putting a label on an object. Description: assigning properties to objects. 3D inference: interpreting a 3D scene from 2D views. Interpreting motion.
  • 63. TOOLS. Image processing (noise removal, edge detection, morphology); feature extraction and clustering; measurement; modelling, fitting the model, and optimization; statistics and classification.
  • 64. THE THREE STAGES OF COMPUTER VISION. Low-level processing, image to image: noise removal, image enhancement. Intermediate processing, image to symbolic: a set of lines/vectors that represent the boundaries of an object in the image. High-level processing, symbolic to symbolic: the symbolic representation of object boundaries produces the object's description.
  • 67. INTERMEDIATE LEVEL. K-means clustering (followed by connected component analysis). [Figure labels: original color image; regions of homogeneous color; data structure.]
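The intermediate-level step on slide 67, K-means clustering followed by connected component analysis, can be sketched as follows. This is a simplified illustration rather than the method used to produce the slide's figure: the image is assumed to be a list of rows of RGB tuples, the initial cluster centres are passed in explicitly to keep the example deterministic, and 4-connectivity is assumed for the components. All function names are hypothetical.

```python
from collections import deque


def kmeans_labels(image, centers, iterations=10):
    """Cluster RGB pixels with K-means; return a grid of cluster indices."""
    centers = [tuple(float(c) for c in center) for center in centers]

    def nearest(pixel):
        # Index of the centre with the smallest squared colour distance.
        return min(
            range(len(centers)),
            key=lambda k: sum((p - c) ** 2 for p, c in zip(pixel, centers[k])),
        )

    labels = []
    for _ in range(iterations):
        # Assignment step: label every pixel with its nearest centre.
        labels = [[nearest(pixel) for pixel in row] for row in image]
        # Update step: move each centre to the mean colour of its members.
        for k in range(len(centers)):
            members = [
                pixel
                for row, label_row in zip(image, labels)
                for pixel, label in zip(row, label_row)
                if label == k
            ]
            if members:
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels


def connected_components(labels):
    """Split each cluster into 4-connected regions; return (region grid, count)."""
    height, width = len(labels), len(labels[0])
    region = [[-1] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if region[y][x] != -1:
                continue
            # Breadth-first flood fill over neighbours with the same label.
            region[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and region[ny][nx] == -1
                        and labels[ny][nx] == labels[cy][cx]
                    ):
                        region[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return region, count


RED, BLUE = (250, 5, 5), (5, 5, 250)
picture = [[RED, BLUE, RED], [RED, BLUE, RED]]
labels = kmeans_labels(picture, centers=[(255, 0, 0), (0, 0, 255)])
regions, region_count = connected_components(labels)
```

Clustering alone groups the two red stripes together because they share a colour; the connected component pass is what separates them into distinct regions, which is why the two steps are paired.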
  • 68. LOW- TO HIGH-LEVEL. Building recognition. [Figure labels: edge image; consistent line clusters; low-level, mid-level, high-level.]
  • 70. CONSIDERATIONS IN COMPUTER VISION DESIGN. What information do you want to get, and how is that information manifested in the images? It is necessary to determine the relationship between physical entities and their intrinsic characteristics. For example, a house can be distinguished from a tree because it has straight lines as an intrinsic property, and the sea can be distinguished from other objects because it has a uniform appearance. An intrinsic property is a property that an object or a thing has of itself, including its context. http://persci.mit.edu/pub_pdfs/adelson_spie_01.pdf https://yandex.com/company/technologies/vision/
  • 71. CONSIDERATIONS IN COMPUTER VISION DESIGN. What knowledge is needed to recover the information? Models are needed to determine the relationship between pixel intensity and image properties. The scene model: types of features, textures, smoothness. The illumination model: the position and characteristics of the light source and the reflectance properties of the object's surface. The sensor model: the position and optical performance of the camera used, and the noise and distortion in the digitization process.
  • 72. CONSIDERATIONS IN COMPUTER VISION DESIGN. Processing speed and knowledge representation. It is necessary to anticipate real-time processing requirements, for example in a go/no-go quality inspection process. Encoding knowledge into a form that is appropriate and easy to understand is another essential consideration in the design of a vision system. In general, go/no-go testing refers to a pass/fail test (or check) principle using two boundary conditions or a binary classification. The test is passed only when the Go condition is met and the No-go condition fails. Encoded knowledge is expressed in terms of an accepted “language” which is understood (or must be learned) by the recipients (professional “jargon”, accepted disciplinary concepts, technical languages such as statistics, or the argot of street language) and which is used to decode the meaning.
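The go/no-go rule stated on slide 72 (pass only when the Go condition is met and the No-go condition fails) reduces to one boolean expression. The sketch below is a hypothetical illustration: the function names, the idea of measuring a hole diameter from an image, and the tolerance limits are all invented for the example and do not come from the slides.

```python
def go_no_go(measurement, go_condition, no_go_condition):
    """Pass only when the Go condition holds and the No-go condition does not."""
    return go_condition(measurement) and not no_go_condition(measurement)


# Hypothetical inspection of a drilled hole, nominal diameter 10.00 mm.
MIN_DIAMETER_MM = 9.95   # the "go" gauge must fit: hole is at least this wide
MAX_DIAMETER_MM = 10.05  # the "no-go" gauge must not fit: hole is not wider


def inspect_hole(diameter_mm):
    """Binary accept/reject decision for one measured diameter."""
    return go_no_go(
        diameter_mm,
        go_condition=lambda d: d >= MIN_DIAMETER_MM,
        no_go_condition=lambda d: d > MAX_DIAMETER_MM,
    )


accepted = inspect_hole(10.00)  # within tolerance
too_small = inspect_hole(9.90)  # go gauge does not fit
too_large = inspect_hole(10.10)  # no-go gauge fits
```

Because the decision is a single comparison per part, the cost of the check itself is negligible; in a real-time inspection line the time budget is spent on obtaining the measurement from the image.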
  • 73. THANK YOU. Astronomy Picture of the Day (6 Feb 2022): Blue Marble Earth. Image credit: NASA, Apollo 17 Crew. https://apod.nasa.gov/apod/astropix.html?