HOW DO MACHINES SEE?
•Computer Vision, commonly referred to as CV, enables systems
to see, observe, and understand. It parallels human vision:
cameras, data, and algorithms play roles analogous to the
retinas, optic nerves, and visual cortex of human vision. CV
derives meaningful information from digital images, videos,
and other visual input and makes recommendations or takes
actions accordingly.
•Computer Vision is a field of artificial intelligence (AI) that
uses sensing devices and deep learning models to help
systems understand and interpret the visual world.
•Computer Vision is sometimes called Machine Vision.
• A fundamental aspect of computer vision lies in
understanding the basics of digital images.
• Basics of digital images
A digital image is a picture stored on a computer as a
sequence of numbers that computers can understand. Digital
images can be created in several ways: with design software
(such as Paint or Photoshop), by taking a photo with a digital
camera, or by scanning a picture with a scanner.
Interpretation of Image in digital form
When a computer processes an image, it perceives it as a
collection of tiny squares known as pixels.
Each pixel, short for "picture element," represents a specific
color value.
During the process of digitization, an image is converted into
a grid of pixels. The resolution of the image is determined by
the number of pixels it contains; the higher the resolution,
the more detailed the image appears and the more closely it
resembles the original scene.
WORKING OF COMPUTER
VISION
In representing images digitally, each
pixel is assigned a numerical value.
For monochrome images, such as
black and white photographs, a
pixel's value typically ranges from 0
to 255. A value of 0 corresponds to
black, while 255 represents white.
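This pixel-to-number mapping can be sketched in plain Python, with no imaging library. The 3×3 grid below is a made-up example of a monochrome image; inverting it shows how an operation on the image is just arithmetic on the numbers:

```python
# A tiny 3x3 grayscale "image": each pixel is an intensity
# from 0 (black) to 255 (white).
image = [
    [0,   128, 255],
    [34,  90,  200],
    [255, 17,  0],
]

# Inverting the image swaps dark and light: new_value = 255 - old_value.
inverted = [[255 - px for px in row] for row in image]

print(inverted[0])  # [255, 127, 0]
```

A real image is the same idea at a much larger scale, e.g. a 1920×1080 grid.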
COMPUTER
VISION –
PROCESS:
Five stages:
1. Image Acquisition
2. Preprocessing
3. Feature Extraction
4. Detection/Segmentation
5. High-Level Processing
Stage 1: Image Acquisition in Computer Vision
• Introduction to Image Acquisition
• Definition: Capturing digital images or videos for
analysis
• Importance: Provides raw data for further processing
• Methods of Image Acquisition
• Digital Cameras
• Scanning Physical Photos/Documents
• Design Software
• Image Quality & Device Capabilities
• Impact on effectiveness of analysis – quality and
characteristics of the acquired images
• Higher vs. lower resolution devices - Higher-resolution
devices can capture finer details and produce clearer
images compared to those with lower resolutions
• Finer details & clarity
Environmental Factors
•Lighting conditions & image quality
•Capture angles affecting perspectives
Capturing images in low-light conditions may result
in poorer image quality, while adjusting the angle
of capture can provide different perspectives of the
scene.
Specialized Imaging Techniques
•MRI (Magnetic Resonance Imaging): High-detail
images of tissues
•CT (Computed Tomography): Internal composition
analysis
•Both produce detailed images of biological tissues or structures
Applications in Science & Medicine
•Diagnosis
•Research
•Treatment Planning
Stage 2: Preprocessing
Enhance the quality of the acquired image
Noise Reduction - Removes unwanted elements like blurriness, random spots, or
distortions. This makes the image clearer and reduces distractions for algorithms.
Image Normalization - Adjusts the pixel values of an image so they fall within a
consistent range
Resizing/Cropping - Changes the size or aspect ratio of the image to make it uniform
Histogram Equalization: Adjusts the brightness and contrast of an image
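Two of these steps can be sketched in plain Python. This is a toy example with a made-up 4×4 grayscale grid: min-max normalization into [0, 1], and a naive every-other-pixel resize. Real pipelines would typically use a library such as OpenCV or Pillow for this:

```python
def normalize(image):
    """Min-max normalization: rescale pixel values into the range [0, 1]."""
    flat = [px for row in image for px in row]
    lo, hi = min(flat), max(flat)
    return [[(px - lo) / (hi - lo) for px in row] for row in image]

def downscale(image, factor=2):
    """Naive nearest-neighbour resize: keep every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]

image = [[0,   50,  100, 150],
         [50,  100, 150, 200],
         [100, 150, 200, 250],
         [150, 200, 250, 250]]

norm = normalize(image)    # all values now between 0.0 and 1.0
small = downscale(image)   # 4x4 grid reduced to 2x2
print(small)               # [[0, 100], [100, 200]]
```

Normalization matters because many algorithms assume inputs in a fixed range; resizing matters because models usually expect a fixed input size.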
Stage 3: Feature extraction –
involves identifying and extracting relevant visual patterns or attributes
from the pre-processed image
• Edge detection - identifies the
boundaries between different
regions in an image where there is
a significant change in intensity
• Corner detection - identifies points
where two or more edges meet, i.e.,
sharp changes in image intensity
• Texture analysis - extracts features
like smoothness, roughness, or
repetition in an image
• Colour-based feature extraction -
quantifies colour distributions
within the image
In deep learning-based approaches, feature extraction
is often performed automatically by convolutional
neural networks (CNNs) during the training process.
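Edge detection, the simplest of these features, can be sketched as a thresholded difference between neighbouring pixels. This is a toy plain-Python version; practical detectors such as Sobel or Canny use 2-D convolution kernels instead of a single difference:

```python
def horizontal_edges(image, threshold=50):
    """Mark pixels where intensity jumps sharply from the pixel to their left."""
    edges = []
    for row in image:
        edge_row = [0]  # first column has no left neighbour
        for x in range(1, len(row)):
            edge_row.append(1 if abs(row[x] - row[x - 1]) > threshold else 0)
        edges.append(edge_row)
    return edges

# A dark region on the left, a bright region on the right:
# the boundary between them is the edge.
image = [[10, 10, 200, 200],
         [10, 10, 200, 200]]

print(horizontal_edges(image))  # [[0, 0, 1, 0], [0, 0, 1, 0]]
```

The 1s mark exactly the column where intensity changes significantly, which is what "boundaries between regions" means in pixel terms.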
Stage 4: Detection/Segmentation
focuses on identifying objects or regions of interest within an image
• This stage is categorized into two
primary tasks:
• 1. Single Object Tasks
• 2. Multiple Object Tasks
• Multiple Object Tasks - aim to identify and distinguish between various objects within the
image
• Object Detection
• Image segmentation
Semantic Segmentation
Instance Segmentation
Object Detection
• Identifies and locates multiple objects in an image by analyzing the
entire image and drawing bounding boxes.
• 📌 Key Difference:
• Classification: Determines the class of the entire image.
• Detection: Identifies and classifies multiple objects within an image.
• 📌 Bounding Boxes: Drawn around detected objects and labeled
according to their class.
• 📌 Algorithms Used:
• R-CNN (Region-Based Convolutional Neural Network)
• R-FCN (Region-based Fully Convolutional Network)
• YOLO (You Only Look Once)
• SSD (Single Shot Detector)
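A quantity these detectors are commonly evaluated with is Intersection over Union (IoU): the overlap between a predicted bounding box and a ground-truth box, as a fraction of their combined area. A minimal sketch, with boxes given as (x1, y1, x2, y2) corner coordinates:

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (may be empty, hence the max(0, ...)).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes overlapping in a 5x5 corner: IoU = 25 / 175.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

An IoU of 1.0 means a perfect match and 0.0 means no overlap; detection benchmarks typically count a prediction as correct above a threshold such as 0.5.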
Image Segmentation
• 📌 Definition:
• Identifies and groups pixels with similar characteristics.
• Creates a pixel-wise mask for each object in the image for better
granularity.
• 📌 Edge Detection:
• Helps segment images by detecting discontinuities in brightness.
• 📌 Types of Image Segmentation:
1️⃣Semantic Segmentation:
• Classifies pixels belonging to a particular class.
• Objects in the same class are not differentiated.
• Example: pixels labeled "animal," without specifying the type of animal.
• 2️⃣Instance Segmentation:
• Classifies pixels belonging to individual instances.
• Differentiates all objects even if they belong to the same class.
• Example: Pixels separately masked for different animals.
• 🔹 Add visuals such as labeled segmented images to enhance
understanding.
🔹 Use contrasting colors to highlight different pixel classes.
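The difference between the two types can be illustrated with toy label grids (made-up 2×6 masks; 0 is background). Semantic masks assign a class per pixel, while instance masks keep separate ids for each object of that class:

```python
# Semantic segmentation: every pixel gets a CLASS label
# (0 = background, 1 = "animal"). The two animals are indistinguishable.
semantic = [[0, 1, 1, 0, 1, 1],
            [0, 1, 1, 0, 1, 1]]

# Instance segmentation: pixels of the SAME class get distinct instance ids,
# so the two animals stay apart (1 = first animal, 2 = second animal).
instance = [[0, 1, 1, 0, 2, 2],
            [0, 1, 1, 0, 2, 2]]

# Count distinct classes vs distinct instances (ignoring background 0).
classes = {px for row in semantic for px in row} - {0}
instances = {px for row in instance for px in row} - {0}
print(len(classes), len(instances))  # 1 2
```

Same scene, same pixels: the semantic mask sees one class, the instance mask sees two objects.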
High-Level Processing in
Computer Vision
• 📌 Definition:
• The final stage of computer vision that interprets and extracts meaningful
information from detected objects or regions in images.
• Enables computers to understand visual content and make informed
decisions.
• 📌 Key Tasks Involved:
1️⃣Object Recognition: Identifies and categorizes objects in an image.
2️⃣Scene Understanding: Analyzes relationships between elements in a
scene.
3️⃣Context Analysis: Derives insights from complex visual data for
intelligent decision-making.
• 📌 Techniques Used:
• Machine Learning
• Deep Learning Algorithms
• 📌 Applications:
✅ Autonomous Driving: Helps vehicles understand surroundings for
navigation.
✅ Medical Diagnostics: Assists in analyzing medical images for disease
detection.
✅ Surveillance & Security: Enhances monitoring by identifying objects
and activities.
• 🔹 Include visuals such as scene analysis diagrams to illustrate concepts.
🔹 Use structured layouts to highlight the different stages of high-level
processing.
Applications of CV
• 📌 Overview:
• Computer vision is integrated into major products we use daily.
• Helps machines interpret and make sense of visual data.
• 📌 Key Applications:
✅ Facial Recognition: Used in social media platforms to detect and tag users.
✅ Healthcare: Assists in evaluating tumors, diagnosing diseases, and analyzing medical images.
✅ Self-Driving Vehicles: Captures surroundings, detects objects, reads traffic signals, and identifies pedestrian paths.
✅ Optical Character Recognition (OCR): Extracts printed or handwritten text from images or documents.
✅ Machine Inspection: Detects defects and irregularities and ensures quality in manufactured products.
✅ 3D Model Building: Constructs 3D computer models for applications like robotics, autonomous driving, and
AR/VR.
✅ Surveillance: CCTV footage aids in identifying suspicious behavior, dangerous objects, and maintaining security.
✅ Fingerprint & Biometrics: Used for identity validation and security systems.
• 🔹 Enhance slide with relevant images (e.g., facial recognition, autonomous cars, medical imaging).
🔹 Use clear visuals like icons or diagrams for each application to improve understanding.
Challenges of Computer Vision
• 📌 Overview:
• Computer vision faces several hurdles in accurately interpreting visual data.
• These challenges impact its effectiveness across various applications.
• 📌 Key Challenges:
1️⃣Reasoning & Analytical Issues:
• Requires more than image identification; must accurately interpret content.
• Strong reasoning and analytical skills are crucial for extracting insights.
• 2️⃣Difficulty in Image Acquisition:
• Lighting variations, perspectives, and occlusions complicate data collection.
• High-quality image acquisition is necessary for reliable analysis.
• 3️⃣Privacy & Security Concerns:
• Vision-powered surveillance raises ethical concerns and privacy risks.
• Regulatory scrutiny over facial recognition and personal data usage.
• 4️⃣Duplicate & False Content:
• Risk of misinformation through manipulated or duplicate media.
• Data breaches expose vulnerabilities in image processing algorithms.
• 🔹 Add relevant visuals (e.g., security concerns, data breaches, occlusion examples).
🔹 Use structured layouts for a clear and visually appealing presentation.
Future of CV
• 📌 Overview:
• Evolution from basic image processing to complex visual understanding.
• Advanced deep learning and labeled training data enhance accuracy.
• 📌 Breakthroughs Driving Progress:
✅ Deep Learning & AI: Enables human-like interpretation of visual data.
✅ Big Data & Training Sets: Vast labeled datasets improve machine learning
capabilities.
• 📌 Future Applications:
🚀 Personalized Healthcare Diagnostics: AI-driven medical imaging for disease
detection.
🚀 Immersive AR & VR Experiences: Enhancing augmented and virtual reality
technologies.
🚀 Smart Surveillance & Security: Advanced monitoring systems ensuring public
safety.
🚀 Autonomous Systems: Smarter self-driving vehicles and robotics for various
industries.
• 📌 The Path Forward:
🔹 Innovation & Collaboration: Advancing computer vision through teamwork
and research.
🔹 Ethical Considerations: Balancing AI advancements with responsible AI
practices.
• 🔹 Include futuristic visuals like AI-driven medical imaging and AR applications.
🔹 Use bold headers and engaging visuals to maintain audience attention.
