Video processing
Introduction
• Video processing is the manipulation and analysis of digital video
sequences.
• Basic video processing techniques include trimming, image resizing,
brightness and contrast adjustment, fade-in and fade-out effects, and content analysis.
• These tasks can be performed using a variety of ML techniques, including
deep learning, computer vision, and natural language processing.
Formats
• MP4 - a widely used container format with efficient compression.
• AVI - a container format developed by Microsoft.
• MOV - a container format developed by Apple.
• AVCHD - commonly used for video recorded by digital camcorders
and DSLR cameras.
• FLV - used for streaming video over the internet; historically common
for videos on websites such as YouTube and Vimeo.
Key Concepts
• Compression - Compression is the process of reducing the size of a video file
while maintaining its quality. Video compression algorithms remove redundant
information.
• Frame - a single still image; a video is a sequence of frames that, when
played back in rapid succession, creates the illusion of motion.
• Frame rate - the number of frames displayed per second. It determines the
smoothness of the video.
• Resolution - the number of pixels in a video frame. A higher resolution
means more pixels and better quality.
• Aspect ratio - the ratio of the width of a video frame to its height.
Common aspect ratios include 4:3 and 16:9.
Video Processing Techniques
• Compression: to reduce the size of a video file while maintaining its
quality.
• Enhancement: to improve the visual quality of a video, such as noise
reduction, color correction, and sharpening.
• Restoration: to repair or improve the quality of a video that has been
degraded by noise, blur, or other factors.
• Analysis: to extract information from video sequences, such as object
tracking, facial recognition, and scene analysis.
Video Compression
• Inter-frame compression is a technique that reduces the amount of data
needed to represent a video by only storing the differences between
consecutive frames, instead of storing each frame in its entirety.
• This is done by comparing each frame to the preceding one and only
storing the changes, rather than the entire frame.
• The most commonly used codecs are H.264, VP9, and HEVC.
Video Compression
• Intra-frame compression, also known as intra-coded compression,
works by compressing each frame individually.
• It uses techniques such as Discrete Cosine Transform (DCT) and
Color Quantization to compress the data of each frame.
• Discrete Cosine Transform (DCT) - a transform applied to image pixels in
the spatial domain to convert them into the frequency domain, where
redundancy can be identified.
• Quantization - the process of mapping a continuous, infinite range of
values to a smaller set of discrete, finite values.
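To make the two steps concrete, here is a minimal sketch of intra-frame compression on a single 8x8 block using OpenCV's DCT; the file name and the quantization step are illustrative assumptions, not values from any particular codec.

```python
# Minimal intra-frame compression sketch on one 8x8 block (illustrative only).
import numpy as np
import cv2

# "frame.png" is a placeholder grayscale frame.
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
block = frame[:8, :8]                    # one 8x8 block of pixels

coeffs = cv2.dct(block)                  # spatial domain -> frequency domain
step = 16                                # illustrative quantization step
quantized = np.round(coeffs / step)      # many high-frequency coeffs become 0

# Decoder side: dequantize and invert the DCT to approximate the block.
restored = cv2.idct(quantized * step)
```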
Video Compression
• Lossy compression is a technique that removes some of the data from
the video in order to reduce its file size.
• This can be done in various ways, such as by removing redundant data
or by removing information that the human eye is less likely to notice.
• The result is a lower quality video but with a smaller file size.
• It's worth noting that most existing video compression standards use
inter-frame compression, as it is more efficient than intra-frame
compression alone.
Video Compression
• Fractal Compression: This technique uses fractal mathematics to
compress an image. The image is broken down into smaller fractal
patterns, which can be used to recreate the original image with a
smaller file size.
• Vector Quantization: This technique groups similar image features
together and replaces them with a single symbol. This reduces the
amount of data needed to represent the image and can be applied on
both grayscale and color images.
• Run-Length Encoding: This technique is used for images with large
areas of uniform color. It replaces repeating pixels with a single
symbol, reducing the amount of data needed to represent the image.
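Run-length encoding is simple enough to show end to end; the following is a minimal sketch in plain Python (not tied to any codec) of how runs of identical pixels compress to (value, count) pairs.

```python
# Minimal run-length encoder/decoder for a row of pixel values.
def rle_encode(pixels):
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return runs

def rle_decode(runs):
    return [value for value, count in runs for _ in range(count)]

row = [255] * 6 + [0] * 3 + [255]
print(rle_encode(row))                      # [[255, 6], [0, 3], [255, 1]]
assert rle_decode(rle_encode(row)) == row   # lossless round trip
```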
Video Enhancement Techniques
• Super-resolution: increases the resolution of a video by estimating and
reconstructing high-resolution details from low-resolution frames.
• Denoising: reduces noise in a video by removing or reducing random
variations in pixel intensity.
• Color correction: improves the color accuracy of a video by adjusting the color
balance, saturation, and brightness.
• Deblurring: removes blur from a video caused by camera shake or fast-moving
objects.
• Stabilization: removes jitter or shake from a video caused by camera
movement.
Video Super Resolution
• Video super-resolution (VSR) is a technique used to increase the resolution of a video by
estimating and reconstructing high-resolution details from low-resolution frames. Some
common video super-resolution techniques include:
• Interpolation-based methods: These methods use interpolation algorithms such as bicubic
or Lanczos to estimate missing pixels in the high-resolution version of the video.
• Reconstruction-based methods: These methods use image or video processing techniques
to reconstruct the high-resolution version of the video. Examples include
optical flow-based and spatial-temporal super-resolution methods.
• Deep learning-based methods: These methods use deep neural networks (DNNs) to learn
the mapping between low-resolution and high-resolution images. Examples of these
methods include deep convolutional neural networks (CNNs) and generative adversarial
networks (GANs).
• Hybrid methods: These methods combine multiple techniques to achieve the best results,
for example combining deep learning with interpolation- or reconstruction-based
methods.
Interpolation
• Interpolation-based methods for video super-resolution (VSR) use interpolation algorithms to
estimate missing pixels in the high-resolution version of the video. Some common
interpolation-based VSR methods include:
• Nearest-neighbor interpolation: This method replicates the value of the nearest pixel to fill in
missing pixels. It is simple to implement but can introduce "blocky" artifacts in the output.
• Bilinear interpolation: This method uses the weighted average of the four closest pixels to estimate
the value of missing pixels. It is a more sophisticated method than nearest-neighbor interpolation
but can still introduce some artifacts in the output.
• Bicubic interpolation: This method uses the weighted average of the 16 closest pixels to estimate
the value of missing pixels. It is more sophisticated than bilinear interpolation and typically
produces better results, but it is also more computationally expensive.
• Lanczos interpolation: This method uses a windowed sinc function to estimate the value of
missing pixels. It is a highly sophisticated method and is known for producing the best results
among interpolation-based methods, but it is also the most computationally expensive.
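As a concrete illustration, all four kernels are available as interpolation flags in OpenCV's resize; here is a minimal sketch upscaling one frame 2x (the input path is a placeholder).

```python
# Upscale one frame 2x with each interpolation kernel discussed above.
import cv2

lowres = cv2.imread("frame.png")          # placeholder low-resolution frame
h, w = lowres.shape[:2]

methods = {
    "nearest": cv2.INTER_NEAREST,
    "bilinear": cv2.INTER_LINEAR,
    "bicubic": cv2.INTER_CUBIC,
    "lanczos": cv2.INTER_LANCZOS4,
}
for name, flag in methods.items():
    upscaled = cv2.resize(lowres, (2 * w, 2 * h), interpolation=flag)
    cv2.imwrite(f"upscaled_{name}.png", upscaled)
```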
Reconstruction-based Methods
• Reconstruction-based methods for video super-resolution (VSR) use
image or video processing techniques to reconstruct the
high-resolution version of the video.
• These methods typically involve building a model of the image or
video and using that model to generate the high-resolution version.
• Some common reconstruction-based VSR methods include:
• Optical flow-based methods: These methods use the motion information
between the frames to estimate the high-resolution frame.
• Spatial-Temporal Super-Resolution methods: These methods use a
combination of spatial and temporal information in order to increase the
resolution of the video.
Optical Flow-based Methods
• Optical flow-based methods for video super-resolution (VSR) use the motion information
between frames to estimate the high-resolution frame.
• These methods work by estimating the motion vectors between low-resolution frames and
using these vectors to warp the pixels of one frame to match the position of the pixels in
another frame.
• The high-resolution frame can then be reconstructed by combining the warped frames.
• Optical flow-based VSR methods typically involve the following steps:
• Estimating the optical flow: This step involves estimating the motion vectors between
low-resolution frames using techniques such as Lucas-Kanade, Horn-Schunck, or deep
learning-based optical flow estimation.
• Warping frames: This step involves using the motion vectors to warp the pixels of one frame to
match the position of the pixels in another frame.
• Combining frames: This step involves combining the warped frames to form the high-resolution
frame. This can be done by averaging, weighted averaging, or median filtering the warped frames.
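A minimal sketch of these three steps, assuming two consecutive grayscale frames on disk and using Farneback dense optical flow as one interchangeable choice for the estimation step:

```python
# Estimate flow, warp the previous frame, and fuse with the current frame.
import cv2
import numpy as np

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # placeholder frames
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# 1. Estimate per-pixel motion vectors between the two frames.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

# 2. Warp the previous frame toward the current one using the flow field.
h, w = prev.shape
grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
map_x = (grid_x + flow[..., 0]).astype(np.float32)
map_y = (grid_y + flow[..., 1]).astype(np.float32)
warped = cv2.remap(prev, map_x, map_y, cv2.INTER_LINEAR)

# 3. Combine aligned frames (simple averaging here).
fused = cv2.addWeighted(warped, 0.5, curr, 0.5, 0)
```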
Spatial-Temporal Super-Resolution
• Spatial-Temporal Super-Resolution (STSR) is a method that combines spatial and
temporal information in order to increase the resolution of a video.
• These methods utilize both the spatial information of the individual frames and the
temporal information between frames to generate high-resolution video.
• STSR methods typically involve the following steps:
• Spatial resolution enhancement: This step involves enhancing the resolution of each individual
frame of the video using interpolation-based methods, SISR or DNN-based methods.
• Temporal information extraction: This step involves extracting temporal information from the
video, such as motion vectors or optical flow, that can be used to align the frames and improve the
resolution of the video.
• Temporal resolution enhancement: This step involves using the extracted temporal information to
align and fuse the frames to generate the high-resolution video.
• STSR methods can be effective for VSR, especially when applied to videos with complex
temporal dynamics such as fast-moving objects or complex backgrounds. These methods
can also be robust to occlusions and motion discontinuities. However, they can
be computationally expensive, especially when extracting temporal information.
Denoising in Video Enhancement
• Denoising in video enhancement is the process of removing noise from a video in
order to improve its visual quality.
• Noise in videos can be caused by various factors such as low-light conditions,
electronic noise in the camera sensor, or compression artifacts.
• There are several methods for denoising videos, including:
• Spatial filtering: This method involves applying a filter to each frame of the video to reduce
noise. Examples of spatial filters include median filters and Gaussian filters.
• Temporal filtering: This method involves using information from multiple frames of the video
to reduce noise. Examples of temporal filters include Kalman filters and recursive filters.
• Non-local Means filter: This method is a spatial-temporal filter that uses information from
similar pixels in other frames to remove noise.
• Deep learning-based methods: These methods use deep neural networks (DNNs) to learn the
mapping between noisy and denoised videos. Examples of DNN-based denoising methods
include autoencoder-based and UNet-based methods.
• Hybrid methods: These methods combine spatial, temporal, and deep learning-based methods
to denoise videos.
Spatial Filtering
• Spatial filtering is a method for denoising videos that involves applying a filter to each
frame of the video to reduce noise.
• These filters operate on the spatial domain, meaning that they process the pixels in each
frame independently of the pixels in other frames.
• Spatial filtering methods are fast and easy to implement, and can be useful for removing
noise such as sensor noise, impulse noise, or salt-and-pepper noise.
• Some examples of spatial filters include:
• Median filter: This filter replaces the value of a pixel with the median value of the pixels in a
neighborhood around it. It is effective at removing salt-and-pepper noise but can blur fine details.
• Gaussian filter: This filter replaces the value of a pixel with a weighted average of the pixels in a
neighborhood around it, where the weighting is determined by a Gaussian function. It is effective at
removing Gaussian noise but can blur fine details.
• Mean filter: This filter replaces the value of a pixel with the mean value of the pixels in a
neighborhood around it. It is effective at reducing random noise but can blur fine details.
• Bilateral filter: This filter combines a spatial Gaussian weighting with a weighting on
intensity differences. It smooths the image while preserving edges (see the sketch below).
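All four spatial filters above are one-liners in OpenCV; the kernel sizes and parameters below are illustrative choices, not recommendations.

```python
# Classic spatial denoising filters applied to a single frame.
import cv2

noisy = cv2.imread("noisy_frame.png")             # placeholder noisy frame

median    = cv2.medianBlur(noisy, 5)              # salt-and-pepper noise
gaussian  = cv2.GaussianBlur(noisy, (5, 5), 1.5)  # Gaussian noise
mean      = cv2.blur(noisy, (5, 5))               # simple box average
bilateral = cv2.bilateralFilter(noisy, 9, 75, 75) # edge-preserving smoothing
```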
Temporal Filtering
• Temporal filtering is a method for denoising videos that involves using information from multiple
frames of the video to reduce noise.
• These filters operate in the temporal domain, meaning that they process the pixels in each frame in
relation to the pixels in other frames.
• Temporal filtering methods can be more effective at removing noise such as temporal noise (noise
that changes over time), camera shake, or compression artifacts.
• Some examples of temporal filters include:
• Kalman filter: This filter uses a mathematical model to estimate the state of a system over time, and is used to
predict the current frame based on the previous frames. It can be effective at removing temporal noise, but can
be computationally expensive.
• Recursive filter: This filter uses recursive algorithms to estimate the current frame based on the previous
frames. It is similar to the Kalman filter but more computationally efficient.
• Optical flow-based filter: This filter uses optical flow to align frames, and then uses spatial filtering to remove
noise. It can be effective at removing noise caused by camera shake but can be sensitive to occlusions and
motion discontinuities.
• Recurrent neural networks (RNN): This filter uses a recurrent neural network to estimate the current frame
based on the previous frames. It is similar to the Kalman filter but uses a deep learning approach, can be more
powerful in removing noise and can be computationally expensive.
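As an illustration of the recursive-filter idea, here is a minimal exponential moving-average sketch over a video stream; the source path and the value of alpha are assumptions, not part of any standard.

```python
# Recursive temporal filter: blend each frame with the running estimate.
import cv2

cap = cv2.VideoCapture("input.mp4")   # placeholder video source
alpha = 0.2                           # smaller alpha = stronger smoothing
estimate = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = frame.astype("float32")
    if estimate is None:
        estimate = frame              # initialize with the first frame
    else:
        estimate = alpha * frame + (1 - alpha) * estimate
    denoised = estimate.astype("uint8")   # current denoised output frame
cap.release()
```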
The Non-local Means Filter
• The Non-local Means (NLM) filter is a method for denoising videos that uses
information from similar pixels in other frames to remove noise.
• It is a spatial-temporal filter, meaning that it processes the pixels in each frame in
relation to the pixels in other frames and in a neighborhood around them.
• The NLM filter operates in the following steps:
• For each pixel in the current frame, it searches for similar pixels in other frames.
• It computes a weighted average of the similar pixels, where the weighting is determined by a
similarity metric such as the Euclidean distance.
• It replaces the value of the current pixel with the computed weighted average.
• The NLM filter is effective at removing noise such as temporal noise, camera
shake, and compression artifacts. It can also preserve edges and fine details better
than spatial filters. However, it can be computationally expensive, as it requires
searching for similar pixels in other frames.
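OpenCV ships a multi-frame non-local means implementation; a minimal sketch denoising the middle frame of a five-frame window (the video path and filter strengths are placeholders):

```python
# Multi-frame non-local means denoising with OpenCV.
import cv2

cap = cv2.VideoCapture("input.mp4")               # placeholder source
frames = [cap.read()[1] for _ in range(5)]        # five consecutive frames
cap.release()

# Denoise the middle frame (index 2) using a temporal window of 5 frames.
denoised = cv2.fastNlMeansDenoisingColoredMulti(
    frames, imgToDenoiseIndex=2, temporalWindowSize=5,
    h=4, hColor=4, templateWindowSize=7, searchWindowSize=21)
```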
Color correction
• Color correction is the process of adjusting the colors of a video to improve its visual
quality. It can be used to correct color imbalances, fix exposure issues, and improve the
overall color and tone of the video. There are several techniques that can be used for
color correction:
1. White balance: This technique is used to correct the color cast of a video caused by
different lighting conditions. It can be done by adjusting the color temperature of the video
to make it appear more neutral.
2. Color grading: This technique is used to adjust the overall color and tone of a video. It can
be done by adjusting the brightness, contrast, saturation, and hue of the video.
3. Curves: This technique allows for fine-grained color correction by adjusting the brightness
levels of individual color channels.
4. LUTs (Lookup tables): A LUT is a predefined table that maps input colors to output
colors. Using a LUT allows for fast and consistent color correction across multiple shots.
5. Color matching: This technique is used to match the colors of different shots or scenes. It
can be done by adjusting the colors of one shot to match the colors of another shot.
6. Machine Learning-based methods: These methods use machine learning algorithms to
learn the underlying structure of the video, and then use this knowledge to correct the
color.
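As one concrete example of automatic white balance, here is a minimal sketch of the gray-world method (my choice of algorithm, since the slide does not name one): each channel is scaled so the average color of the frame becomes neutral gray.

```python
# Gray-world white balance: scale channels so the mean color is neutral.
import cv2
import numpy as np

frame = cv2.imread("shot.png").astype(np.float32)   # placeholder frame
means = frame.reshape(-1, 3).mean(axis=0)           # per-channel B, G, R means
gray = means.mean()                                 # target neutral level
balanced = np.clip(frame * (gray / means), 0, 255).astype(np.uint8)
```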
Deblurring
• Deblurring, a form of image restoration, is the process of removing blur from an image or video
caused by factors such as camera shake, fast motion, or a small aperture. There are several
techniques that can be used for deblurring:
1. Inverse Filtering: This technique uses a known blur function to reverse the blurring effect. This
method is highly sensitive to noise and is usually not used in practice.
2. Wiener Filtering: This technique uses a statistical model of the image and the blur function to
estimate the original image. This method is less sensitive to noise but can still produce poor results.
3. Blind Deconvolution: This technique is used when the blur function is not known. It attempts to
estimate both the blur function and the original image simultaneously. This method can produce
good results but is highly sensitive to noise and initialization.
4. Regularization-based methods: These methods add a regularization term to the objective function
to prevent overfitting. Examples include Tikhonov regularization, Total Variation regularization, and
Sparse Representation based methods.
5. Machine Learning-based methods: These methods use machine learning algorithms such as Deep
Learning to learn the underlying structure of the image and then use this knowledge to deblur the
image.
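A frequency-domain Wiener deconvolution sketch in NumPy, under the assumptions that the blur kernel (PSF) is known and that the constant K approximates the noise-to-signal power ratio:

```python
# Wiener deconvolution: F_hat = conj(H) / (|H|^2 + K) * G in the Fourier domain.
import numpy as np

def wiener_deblur(blurred, psf, K=0.01):
    # Zero-pad the PSF to the image size so the spectra align.
    psf_padded = np.zeros_like(blurred, dtype=np.float64)
    kh, kw = psf.shape
    psf_padded[:kh, :kw] = psf
    H = np.fft.fft2(psf_padded)     # transfer function of the blur
    G = np.fft.fft2(blurred)        # spectrum of the blurred image
    F_hat = np.conj(H) / (np.abs(H) ** 2 + K) * G
    # The estimate is circularly shifted by the kernel offset; roll it back.
    est = np.real(np.fft.ifft2(F_hat))
    return np.roll(est, (-(kh // 2), -(kw // 2)), axis=(0, 1))
```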
Stabilization
• Video stabilization is the process of removing the unwanted camera shake or jitter from a
video. It is used to make the video appear smoother and more stable. There are several
techniques that can be used for video stabilization:
1. Optical Flow: This technique uses the motion of the pixels between consecutive frames to
estimate the camera motion. The video is then compensated for this motion by aligning
the frames.
2. Feature-based: This technique uses features such as points, edges, or corners in the
video to estimate the camera motion. These features are tracked between consecutive
frames to estimate the motion (see the sketch below).
3. Hybrid methods: These methods combine the above techniques. They first use feature-based
methods to estimate the motion, then use optical flow to refine the motion estimate.
4. Gyroscopic stabilization: This technique uses a gyroscopic sensor to measure the rotation
of the camera. The video is then compensated for this rotation by aligning the frames.
5. Machine Learning-based methods: These methods use machine learning algorithms such
as Deep Learning to learn the underlying structure of the video and then use this
knowledge to stabilize the video.
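A minimal feature-based sketch for one pair of frames, using corner tracking and a rigid transform; the frame paths and parameter values are illustrative.

```python
# Feature-based stabilization: track corners, estimate motion, compensate.
import cv2

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # placeholder frames
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)

# Detect corners in the previous frame and track them into the current one.
pts_prev = cv2.goodFeaturesToTrack(prev, maxCorners=200,
                                   qualityLevel=0.01, minDistance=30)
pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, pts_prev, None)
good_prev = pts_prev[status.flatten() == 1]
good_curr = pts_curr[status.flatten() == 1]

# Estimate camera motion and warp the current frame back to cancel it.
m, _ = cv2.estimateAffinePartial2D(good_curr, good_prev)
h, w = curr.shape
stabilized = cv2.warpAffine(curr, m, (w, h))
```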
Segmentation
• The two most widely used segmentation techniques are:
• Semantic segmentation: This involves dividing a video into segments
based on semantic content, such as by identifying and separating
different objects or regions in a video, and then classifying them into
semantic categories.
• Motion segmentation: This involves dividing a video into segments
based on motion, such as by identifying and separating different
moving objects or regions in a video.
R-CNN
• Generate an initial sub-segmentation to produce many candidate regions.
• Use a greedy algorithm to recursively combine similar regions into larger ones.
• Use the generated regions to produce the final candidate region proposals.
Fast R-CNN
• The author of the previous paper (R-CNN) solved some of the drawbacks of R-CNN to build a faster object detection algorithm, called Fast R-CNN.
• The approach is similar to the R-CNN algorithm.
• But instead of feeding the region proposals to the CNN, we feed the input image to the CNN to generate a convolutional feature map.
Faster R-CNN
• R-CNN and Fast R-CNN use selective search to find the region proposals.
• Faster R-CNN instead lets the network learn the region proposals, via a Region Proposal Network (RPN).
Background Subtraction
• Frame differencing: Compares each frame to the
previous frame and detects changes.
• Running average: Keeps a running average of the
background and detects changes that deviate
from the average.
• Gaussian mixture model: Uses a statistical model
to represent the background and detect changes.
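The Gaussian mixture approach is built into OpenCV; a minimal sketch that extracts a foreground mask per frame (the video path and parameters are placeholders):

```python
# Background subtraction with a Gaussian mixture model (MOG2).
import cv2

cap = cv2.VideoCapture("input.mp4")               # placeholder source
subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                varThreshold=16,
                                                detectShadows=True)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)   # 255 = foreground, 127 = shadow, 0 = background
cap.release()
```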
Optical flow and Clustering-based methods
• Optical flow algorithms compute the motion of each pixel in the
image by analyzing the changes in the pixel's position and color
from one frame to the next.
• Clustering-based methods group pixels or regions by using a set of
features, such as color, texture, or motion information, to represent
them, and then applying a clustering algorithm to group similar
features together.
• Popular clustering algorithms used for motion segmentation
include k-means, mean-shift, and Gaussian mixture models; a k-means sketch follows below.
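A minimal sketch of motion segmentation by clustering dense optical-flow vectors with k-means; the choice of two clusters and the Farneback parameters are illustrative assumptions.

```python
# Cluster per-pixel flow vectors to separate differently moving regions.
import cv2
import numpy as np

prev = cv2.imread("frame0.png", cv2.IMREAD_GRAYSCALE)   # placeholder frames
curr = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)

samples = flow.reshape(-1, 2).astype(np.float32)   # one (dx, dy) per pixel
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, _ = cv2.kmeans(samples, 2, None, criteria, 5,
                          cv2.KMEANS_RANDOM_CENTERS)
segmentation = labels.reshape(prev.shape)          # per-pixel motion cluster
```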
Video content Analysis
• Video content analysis deals with the extraction of metadata
from raw video to be used as components for further processing
in applications such as search, summarization, classification or
event detection.
• The main goal of video analytics is to automatically recognize
temporal and spatial events in videos.
• This technical capability is used in a wide range of domains including
entertainment, video retrieval and video browsing, healthcare, retail,
automotive, transport, home automation, flame and smoke
detection, safety, and security.
How does video analytics work?
• Video content analysis can be done in two different ways:
i. In real time, by configuring the system to trigger alerts for specific events
and incidents that unfold in the moment.
ii. In post processing, by performing advanced searches to facilitate forensic
analysis tasks.
• Feeding the system: The data being analyzed can come from various
streaming video sources. The most common are CCTV cameras, traffic
cameras, and online video feeds.
• A key goal is coverage: we need a clear view of the entire area, from
various angles.
Central processing vs. edge processing
• Video analysis software can be run centrally on servers that are generally
located in the monitoring station, which is known as central processing.
• Or, it can be embedded in the cameras themselves, a strategy known as edge
processing.
• With a hybrid approach, the processing performed by the cameras reduces
the data being processed by the central servers.
Classification of Video Analysis
Tools and Techniques for Video Analysis
Facial Recognition in Video Analysis
• Facial recognition systems that can identify or
verify a person from a digital image or video
find application in a variety of contexts.
• Facial recognition works in two parts: face
detection and face identification.
i. In the first stage, the system detects faces
in the input data using methods like
background subtraction.
ii. Next, it measures the facial features to
define facial landmarks and tries to match
them with a known dataset. Based on the
percentage of accuracy of match, the faces
can be recognized or classified as unknown.
• Dlib’s face landmark predictor can be used to detect a face
and extract features such as the eyes, mouth,
brows, nose, and jawline.
• The image was standardized by cropping to
include just these features and aligning it
based on the location of eyes and the bottom
lip.
• The preprocessed image was then mapped to a
numerical vector representation. An algorithmic
comparison of these vectors made facial
recognition possible, as in the sketch below.
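A minimal sketch of this detect-align-embed-compare pipeline using the face_recognition library (a wrapper around dlib's detector, landmark predictor, and embedding network); the image paths are placeholders.

```python
# Face recognition: embed two faces as vectors and compare them.
import face_recognition

known = face_recognition.load_image_file("known_person.jpg")
query = face_recognition.load_image_file("query.jpg")

# Detect, align, and map each face to a 128-dimensional vector.
known_enc = face_recognition.face_encodings(known)[0]
query_enc = face_recognition.face_encodings(query)[0]

# Compare vectors; below the distance threshold counts as a match.
match = face_recognition.compare_faces([known_enc], query_enc)[0]
dist = face_recognition.face_distance([known_enc], query_enc)[0]
print("match" if match else "unknown", dist)
```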
Detecting Motion
• We compare each frame of a video stream to the previous one
and detect all spots that have changed.
• We convert the image to grayscale and smooth it out a bit by blurring
it. Converting to grayscale maps every RGB pixel to a single value
between 0 and 255, where 0 is black and 255 is white.
• We’ll compare the previous frame with the current one by
examining the pixel values. Remember that since we’ve
converted the image to grayscale, all pixels are represented by a
single value between 0 and 255.
• We use the threshold function cv2.threshold to convert each
pixel to either 0 (black) or 255 (white). The threshold for this is 20.
• Finding areas and contouring: We want to find the area that
has changed since the last frame, not each pixel. In order to do
so, we first need to find an area.
• cv2.findContours retrieves the contours, or outer boundaries, of each
white region from the step above.
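Putting the steps together, a minimal sketch of the whole pipeline; the video path and the minimum contour area are my illustrative choices, while the threshold of 20 comes from the slide.

```python
# Motion detection: difference consecutive frames, threshold, find contours.
import cv2

cap = cv2.VideoCapture("input.mp4")                    # placeholder source
prev_gray = None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)         # smooth sensor noise

    if prev_gray is None:
        prev_gray = gray
        continue

    delta = cv2.absdiff(prev_gray, gray)               # per-pixel difference
    _, thresh = cv2.threshold(delta, 20, 255, cv2.THRESH_BINARY)
    thresh = cv2.dilate(thresh, None, iterations=2)    # fill small holes

    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:                   # ignore tiny changes
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    prev_gray = gray
cap.release()
```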
Application of Video Analysis