Unit 3 Image Compression and Segmentation.pptx

Unit 3
Image Compression and Segmentation

3.1 Lossless vs. Lossy Compression
3.2 Basics of Image Compression
3.2.1 Run-Length Encoding
3.2.2 Huffman Coding
3.2.3 Transform Coding and JPEG Compression
3.3Evaluation of Compression Techniques
3.4 Importance of Image Segmentation
3.5Thresholding Techniques
3.6Region-based Segmentation
3.6.1 Region Growing
3.6.2 Split and Merge
3.8 Edge Detection and Boundary Extraction

Introduction
Compression techniques are essential for efficient data storage and transmission. There are two forms of compression:
lossless and lossy. Understanding the differences between these strategies is critical for selecting the best solution
depending on the unique requirements of various applications. In this article, we will discuss the differences between
lossy and lossless compression.

What is Data Compression?
Data compression is a technique used to reduce the size of data files. This process involves encoding information using
fewer bits than the original representation. The main goal of data compression is to save storage space or reduce the time
required to transmit data over networks.

What is Lossy Compression?
Lossy Compression reduces file size by
permanently removing some of the original
data. It’s commonly used when a file can afford
to lose some data or if storage space needs to
be significantly freed up.
A lossy compression is a data compression
method which discards (loses) some of the
data, in order to achieve a high compression
rate, with the result that decompressing the data
yields content that is different from the original
one.

Advantages of Lossy Compression
● Smaller File Sizes: Lossy compression significantly reduces file sizes, making it ideal for
web use and faster loading times.
● Widely Supported: Many tools and software support lossy formats (e.g., JPEG for images,
MP3 for audio).
● Efficient for Multimedia: Effective for compressing multimedia files without noticeable
quality loss.
Disadvantages of Lossy Compression
● Quality Degradation: Due to data removal, lossy files may exhibit reduced quality.
● Not Suitable for Critical Data: Inappropriate for situations where data integrity is crucial.

What is Lossless Compression?
Lossless compression reduces file size by removing unnecessary metadata without any discernible loss in picture
quality. The original data can be perfectly reconstructed after decompression.
Advantages of Lossless Compression
● No Quality Loss: Lossless compression maintains original quality during compression and
decompression.
● Suitable for Text and Archives: Ideal for text-based files, software installations, and backups.
● Minor File Size Reduction: Reduces file size without compromising quality significantly.
Disadvantages of Lossless Compression
● Larger Compressed Files: Compared to lossy formats they compressed larger files.
● Less Efficient for Multimedia: Not as effective for multimedia files

3.1 Lossless vs. Lossy Compression
Lossy Compression Lossless Compression
Lossy compression is the method which eliminate the data which
is not noticeable.
While Lossless Compression does not eliminate the data which is
not noticeable.
In Lossy compression, A file does not restore or rebuilt in its
original form.
While in Lossless Compression, A file can be restored in its
original form.
In Lossy compression, Data’s quality is compromised.
But Lossless Compression does not compromise the data’s
quality.
Lossy compression reduces the size of data. But Lossless Compression does not reduce the size of data.

Algorithms used in Lossy compression are: Transform coding,
Discrete Cosine Transform, Discrete Wavelet Transform, fractal
compression etc.
Algorithms used in Lossless compression are:
Run Length Encoding, Lempel-Ziv-Welch, Huffman Coding,
Arithmetic encoding etc.
Lossy compression is used in Images, audio, video. Lossless Compression is used in Text, images, sound.
Lossy compression has more data-holding capacity.
Lossless Compression has less data-holding capacity than Lossy
compression technique.
Lossy compression is also termed as irreversible compression. Lossless Compression is also termed as reversible compression.
Algorithms used in Lossy compression are: Transform coding,
Discrete Cosine Transform, Discrete Wavelet Transform, fractal
compression etc.
Algorithms used in Lossless compression are:
Run Length Encoding, Lempel-Ziv-Welch, Huffman Coding,
Arithmetic encoding etc.

3.2 Basics of Image Compression
In the field of Image processing, the compression of images is an important step before we start the
processing of larger images or videos. The compression of images is carried out by an encoder and
output a compressed form of an image. In the processes of compression, the mathematical transforms
play a vital role. A flow chart of the process of the compression of the image can be represented as:
In this article, we try to explain the overview of the concepts involved in the image compression techniques. The general
representation of the image in a computer is like a vector of pixels. Each pixel is represented by a fixed number of bits.
These bits determine the intensity of the color (on grayscale if a black and white image and has three channels of RGB if
colored images.)

Why Do We Need Image Compression?
Consider a black and white image that has a resolution of 1000*1000 and each pixel uses 8 bits to
represent the intensity. So the total no of bits required = 1000*1000*8 = 80,00,000 bits per image.
And consider if it is a video with 30 frames per second of the above-mentioned type images then
the total bits for a video of 3 secs is: 3*(30*(8, 000, 000))=720, 000, 000 bits
As we see just to store a 3-sec video we need so many bits which is very huge. So, we need a way to
have proper representation as well to store the information about the image in a minimum no of bits
without losing the character of the image. Thus, image compression plays an important role.

Basic steps in image compression:
● Applying the image transform
● Quantization of the levels
● Encoding the sequences.
Transforming The Image
What is a transformation(Mathematically):
It is a function that maps from one domain(vector space) to another domain(other vector space).
Assume, T is a transform, f(t):X->X’ is a function then, T(f(t)) is called the transform of the
function.
In a simple sense, we can say that T changes the shape(representation) of the function as it is a mapping from
one vector space to another (without changing basic function f(t) i.e. the relationship between the domain and
co-domain).

3.2.1 Run-Length Encoding
Run-length encoding (RLE) is a form of lossless data compression in which
runs of data (sequences in which the same data value occurs in many
consecutive data elements) are stored as a single data value and count, rather
than as the original run. This is most useful on data that contains many such
runs. The general idea behind this method is to replace consecutive repeating
occurrences of a symbol by one occurrence of the symbol followed by the
number of occurrences.
For example, simple graphic images such as icons, line drawings etc.
Example 1: Suppose string is AAAAAAA then Run-length encoding is A7
(A is character and 7 is number of times appear that string)
Example 2: If input string is “WWWWAAADEXXXXXX”, then the Run-
length encoding is W4A3D1E1X6.

3.2.2 Huffman Coding
Huffman coding is a popular algorithm used for lossless data compression. It assigns variable-length codes to input characters, with
shorter codes assigned to more frequent characters. Here’s a brief overview of how it works, particularly in the context of data
compression:
Steps of Huffman Coding:
1. Frequency Count: Count the frequency of each character in the input data.
2. Build a Priority Queue: Create a priority queue (often implemented with a min-heap) where each node contains a character
and its frequency.
Construct the Huffman Tree:
● While there is more than one node in the queue:
○ Remove the two nodes of the lowest frequency.
○ Create a new internal node with these two nodes as children and a frequency equal to the sum of their frequencies.
○ Insert this new node back into the priority queue.

Generate Codes: Traverse the Huffman tree to assign codes:
● Move left adds a "0" to the code.
● Move right adds a "1" to the code.
● The leaf nodes of the tree correspond to the characters, and the path to each leaf represents the Huffman code for
that character.
Encode Data: Replace each character in the original data with its corresponding Huffman code to produce the
compressed output.
Decode Data: To decode, traverse the Huffman tree using the bits of the encoded data, starting from the root and moving
left or right based on the bits until reaching a leaf node, then output the character.
Advantages of Huffman Coding:
● Efficient for compressing data with varying character frequencies.
● Simple to implement and understand.
Limitations:
● It may not be as effective for data with uniform character frequency.
● Requires additional storage for the Huffman tree to decode the data.
This algorithm is widely used in formats like ZIP and JPEG due to its efficiency in reducing file sizes without losing any
information. If you need more specific details or applications, feel free to ask!

3.2.3 Transform Coding and JPEG Compression
Transform Coding
Transform coding is a technique that converts spatial domain data (like pixel values) into a frequency domain
representation. This is typically done using transforms such as:
● Discrete Cosine Transform (DCT): Commonly used in JPEG compression, it transforms an image into a
sum of cosine functions, separating the image into different frequency components.
● Discrete Wavelet Transform (DWT): Used in some other compression techniques, it provides both
frequency and temporal resolution.
The idea is to represent the image in a way that allows for more effective compression, focusing on the most
important frequency components and discarding less significant information.

JPEG Compression
JPEG (Joint Photographic Experts Group) is a widely used method of lossy compression for digital images. The
JPEG compression process involves several steps:
1. Color Space Conversion: Images are often converted from RGB to YCbCr, separating luminance (Y) from
chrominance (Cb and Cr). This allows for more efficient compression, as the human eye is more sensitive to
brightness than color.
2. Blocking: The image is divided into small blocks (usually 8x8 pixels).
3. Transform: Each block undergoes a Discrete Cosine Transform (DCT), converting pixel values into
frequency components.
4. Quantization: The frequency coefficients are quantized using a quantization matrix, which reduces precision
for higher frequencies (less perceptible to the human eye), resulting in data reduction.
5. Entropy Coding: Finally, the quantized values are encoded using techniques like Huffman coding to further
compress the data.

3.3Evaluation of Compression Techniques
Evaluating compression techniques involves assessing several key criteria to determine their effectiveness, efficiency, and suitability for specific
applications. Here are the main factors to consider:
1. Compression Ratio
● Definition: The ratio of the original file size to the compressed file size.
● Importance: A higher compression ratio indicates better space-saving capabilities.
2. Lossiness vs. Losslessness
● Lossy Compression: Reduces file size by permanently eliminating some data (e.g., JPEG). It’s often used for images and audio where
some loss of quality is acceptable.
● Lossless Compression: Reduces file size without any loss of data (e.g., PNG, ZIP). This is crucial for applications requiring exact
reproduction of the original data.
3. Speed
● Compression Speed: The time taken to compress data.
● Decompression Speed: The time taken to restore data to its original form.
● Importance: Fast compression and decompression are vital for real-time applications (e.g., streaming).

4. Quality of Output
● For lossy compression, the perceptual quality of the output is essential. This can be evaluated through:
○ Visual Quality Assessment: Subjective evaluation by human observers.
○ Objective Metrics: Metrics like PSNR (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity
Index) quantify the quality of compressed images.
5. Algorithm Complexity
● Computational Requirements: The complexity of the algorithm affects its speed and resource consumption.
● Implementation Ease: Simpler algorithms might be easier to implement and integrate into existing systems.
6. Adaptability
● The technique's ability to handle different types of data (e.g., text, images, audio) and various data sizes.
7. Error Resilience
● How well the compression technique handles data loss or corruption during transmission. This is crucial for
applications like streaming media.
8. Standardization and Compatibility
● Whether the technique is widely accepted and supported across different platforms and devices.

3.4 Importance of Image Segmentation
Image segmentation is the process of dividing an image into different parts or segments to identify the objects and boundaries within it.
It is one of the most essential tasks in computer vision, as it enables us to extract valuable information from images. Image segmentation
has many applications in various fields, including medical imaging, autonomous vehicles, robotics, agriculture, and gaming. In this blog
post, we will explore the importance of image segmentation, its applications, and its future.

Why do we need Image Segmentation?
Image segmentation is crucial in computer vision tasks because it breaks down complex images into manageable
pieces. It's like separating ingredients in a dish. By isolating objects (things) and backgrounds (stuff), image
analysis becomes more efficient and accurate. This is essential for tasks like self-driving cars identifying objects or
medical imaging analyzing tumours. Understanding the image's content at this granular level unlocks a wider range
of applications in computer vision.

What are the Benefits of Image Segmentation?
Image segmentation is a powerful tool that enables the identification and isolation of specific
objects or areas of interest within an image. One of the primary benefits of image segmentation
is the ability to extract relevant information from an image accurately. Dividing an image into
smaller, more manageable segments, it makes easier to analyze each segment individually,
helping to identify the objects or features within the image. This information is used to make
informed decisions or predictions based on the image’s content.
Another significant advantage of image segmentation is the ability to enhance image quality. By
isolating the areas of interest, it becomes easier to apply various image processing techniques
such as contrast enhancement, noise reduction, and edge detection to effectively remove
unwanted elements from an image, such as noise, background clutter, or other unwanted
objects.
Additionally, image segmentation plays a vital role in object recognition and tracking. By
dividing an image into segments and analyzing each segment’s properties, computer vision
algorithms can identify and track specific objects within an image or video stream. This
technology has numerous applications in fields such as robotics, where machines can utilize
image segmentation to identify and track objects in real time.

What are Some Applications of Image Segmentation?
Image segmentation is one of the assertive techniques of computer vision that
has many applications in various fields, including medical imaging,
autonomous driving, and video surveillance. Here are a few most important
applications that are currently in practice or the stage of advanced research.
Medical imaging
Medical imaging technologies have revolutionized the way doctors diagnose
diseases and injuries. Radiography, magnetic resonance imaging (MRI),
ultrasound, and computed tomography (CT) are some of the most common
medical imaging techniques. However, making sense of the images produced
by these machines requires a lot of work. This is where image segmentation
comes in.

● Tumor detection
Tumor detection is one of the most important applications of image segmentation in medical imaging. With
the help of image segmentation, doctors can identify the exact location and size of a tumor. This information
is critical for planning the treatment of cancer. Image segmentation works by separating the tumor from the
surrounding healthy tissues. This allows doctors to see the tumor more clearly and accurately measure its size.
The accuracy of tumor detection using image segmentation is higher than traditional methods.

Autonomous vehicles
Image segmentation is a crucial strategy for the proper functioning of autonomous vehicles. Object detection, lane
segmentation, and semantic segmentation are three important types of image segmentation that enable autonomous
vehicles to interpret and understand their environment. With accurate image segmentation, autonomous vehicles can
navigate safely and efficiently on the road, making them an essential technology for the future of transportation.

● Object detection
Object detection is a type of image segmentation that involves identifying and locating objects within an
image. This is an essential process for autonomous vehicles, as it enables them to detect and avoid obstacles
on the road. These algorithms can recognize a wide range of objects, from pedestrians and cyclists to other
vehicles and road signs. With accurate image segmentation, autonomous vehicles can navigate safely and
efficiently on the road.
● Lane segmentation
Lane segmentation is another important type of image segmentation for autonomous vehicles. This process
involves identifying and separating different lanes on the road. Lane segmentation algorithms use a variety of
features, including color, texture, and shape, to differentiate between different lanes. With accurate lane
segmentation, autonomous vehicles can stay within their designated lanes while driving, which is essential for
safety and efficiency. Lane segmentation is also important for navigation, as it enables autonomous vehicles to
follow a specific route.
● Semantic segmentation
Semantic segmentation is a more advanced type of image segmentation that involves labeling different areas
of an image with semantic information. This process enables autonomous vehicles to understand the meaning
of different objects and areas in their surroundings. With semantic segmentation, autonomous vehicles can
identify and differentiate between different types of objects and areas, including roads, sidewalks, buildings,
and vegetation. This information is crucial for autonomous vehicles to navigate effectively and make
informed driving decisions.

3.5 Thresholding Techniques
What is Image Thresholding?
Image thresholding works on grayscale images, where each pixel has an intensity value between 0 (black) and 255 (white). The
thresholding process involves converting this grayscale image into a binary image, where pixels are classified as either foreground
(object of interest) or background based on their intensity values and a predetermined threshold. Pixels with intensities above the
threshold are assigned to the foreground, while those below are assigned to the background.
Key Points:
● Process: Compare each pixel's intensity to a threshold value.
● Result: Pixels above the threshold are set to white (255), and those below are set to black (0).
● Purpose: Simplifies the image, making it easier to identify and analyze regions of interest.

Thresholding Techniques in Computer Vision
1. Simple Thresholding
Simple thresholding uses a single threshold value to classify pixel intensities. If a pixel's intensity is greater than the
threshold, it is set to 255 (white); otherwise, it is set to 0 (black).
[Tex]begin{equation} T(x, y) = begin{cases} 0 & text{if } I(x, y) leq T 255 & text{if } I(x, y) > T end{cases}
end{equation} [/Tex]
In this formula:
● I(x,y) is the intensity of the pixel at coordinates (x, y).
● T is the threshold value.
● If the pixel intensity I(x,y) is less than or equal to the threshold T, the output pixel value is set to 0 (black).
● If the pixel intensity I(x,y) is greater than the threshold T, the output pixel value is set to 255 (white).
Pros of Simple Thresholding
● Simple and easy to implement.
● Computationally efficient.
Cons of Simple Thresholding
● Ineffective for images with varying lighting conditions.
● Requires manual selection of the threshold value.

2. Adaptive Thresholding
Adaptive thresholding is used for images with non-uniform illumination. Instead of a single global threshold value,
it calculates the threshold for small regions of the image, which allows for better handling of varying lighting
conditions.
Types of Adaptive Thresholding
● Mean Thresholding: The threshold value is the mean of the neighborhood area.
○ N is the neighborhood of (x,y)
○ |N| is the number of pixels in the neighborhood
● Gaussian Thresholding: The threshold value is a weighted sum (Gaussian window) of the neighborhood
area.
○ w(i,j) are the weights given by the Gaussian window
Pros of Adaptive Thresholding
● Handles varying illumination well.
● More accurate for complex images.
Cons of Adaptive Thresholding
● More computationally intensive.
● Requires careful selection of neighborhood size and method parameters.

3. Otsu's Thresholding
Otsu's method is an automatic thresholding technique that calculates the optimal threshold value by
minimizing the intra-class variance (the variance within the foreground and background classes).
Steps to perform Otsu's Thresholding
1. Compute the histogram and probabilities of each intensity level.
2. Compute the cumulative sums, means, and variances for all threshold values.
3. Select the threshold that minimizes the within-class variance.
Pros of Otsu's Thresholding
● Automatic selection of the threshold value.
● Effective for bimodal histograms.
Cons of Otsu's Thresholding
● Assumes a bimodal histogram, which may not be suitable for all images.
● Computationally more intensive than simple thresholding.

4. Multilevel Thresholding
Multilevel thresholding extends simple thresholding by using multiple threshold values to segment the image into
more than two regions. This is useful for images with complex structures and varying intensities.
Approaches of Multilevel Thresholding
● Otsu's Method Extension: Extending Otsu's method to multiple levels.
● Optimization Techniques: Using optimization algorithms to determine multiple thresholds.
Pros of Multilevel Thresholding
● Can segment images into multiple regions.
● Useful for images with complex intensity distributions.
Cons of Multilevel Thresholding
● More computationally intensive.
● Requires careful selection of the number of thresholds.

5. Color Thresholding
In color images, thresholding can be applied to each color channel (e.g., RGB, HSV) separately. This method
leverages color information to segment objects.
Approaches of Color Thresholding
● Manual Thresholding: Setting thresholds for each color channel manually.
● Automatic Thresholding: Using methods like Otsu's method for each channel.
Pros of Color Thresholding
● Effective for segmenting objects based on color.
● Can handle images with rich color information.
Cons of Color Thresholding
● More complex than grayscale thresholding.
● Requires careful selection of thresholds for each channel.

6. Local Thresholding
Local thresholding calculates a different threshold for each pixel based on its local neighborhood. This
method is effective for images with non-uniform illumination or varying textures.
Techniques of Local Thresholding
1. Niblack's Method
● The threshold is calculated as the mean of the local neighborhood minus a constant times the
standard deviation.
● T(x,y)=μ(x,y)+kσ(x,y)
● T(x,y)=μ(x,y)+kσ(x,y)
● Here,
○ μ(x,y) is the mean and σ(x,y) is the standard deviation of the local neighborhood
○ k is a constant.

7. Global Thresholding
Global thresholding uses a single threshold value for the entire image. This technique is suitable for
images with uniform lighting and clear contrast between the foreground and background.
Pros of Global Thresholding
● Simple and easy to implement.
● Computationally efficient.
Cons of Global Thresholding
● Not suitable for images with varying illumination.
● Requires manual selection of the threshold value

8. Iterative Thresholding
Iterative thresholding starts with an initial guess for the threshold value and iteratively refines it based
on the mean intensity of the pixels above and below the threshold. The process continues until the
threshold value converges.
Pros of Iterative Thresholding
● Provides an automatic way to determine the threshold.
● Suitable for images with a clear distinction between foreground and background.
Cons of Iterative Thresholding
● May require several iterations to converge.
● Not effective for images with complex intensity distributions.

Applications of Thresholding
Thresholding techniques are used in various applications, including:
1. Document Image Analysis: Thresholding is widely used to binarize text in scanned documents, making it
easier for Optical Character Recognition (OCR) systems to process the text.
2. Medical Imaging: In medical imaging, thresholding is used to segment anatomical structures in MRI or
CT scans, aiding in diagnosis and treatment planning.
3. Industrial Inspection: Thresholding is employed in industrial inspection systems to detect defects in
manufactured products, ensuring quality control.
4. Object Detection: In surveillance footage or robotic vision systems, thresholding is used to identify and
track objects, enhancing security and automation.

3.6 Region-based Segmentation
This process involves dividing the image into smaller
segments that have a certain set of rules. This
technique employs an algorithm that divides the image
into several components with common pixel
characteristics. The process looks out for chunks of
segments within the image. Small segments can
include similar pixes from neighboring pixels and
subsequently grow in size. The algorithm can pick up
the gray level from surrounding pixels.

3.6.1 Region Growing
Region growing approach is the opposite of the split and merge
approach:
• An initial set of small areas is iteratively merged according to
similarity constraints.
• Start by choosing an arbitrary seed pixel and compare it with
neighboring pixels (see Fig).
• Region is grown from the seed pixel by adding in neighboring
pixels that are similar, increasing the size of the region.
• When the growth of one region stops we simply choose
another seed pixel which does not yet belong to any region and
start again.
• This whole process is continued until all pixels belong to some
region.
• A bottom up method. Region growing methods often give
very good segmentations that correspond well to the observed
edges. However starting with a particular seed pixel and letting
this region grow completely before trying other seeds biases the
segmentation in favour of the regions which are segmented first.

3.6.2 Split and Merge
● Split-and-merge segmentation is based on a quadtree partition of an image. It is sometimes called
quadtree segmentation.
● A combination of splitting and merging may result in a method with the advantages of both approaches.
● Split-and-merge approaches work using pyramid image representations; regions are square-shaped and
correspond to elements of the appropriate pyramid level.

Splitting and Merging Algorithm
1. Define an initial segmentation into regions, a
homogeneity criterion, and a pyramid data structure.
2. If any region R in the pyramid data structure is not
homogeneous (H (R) = FALSE), split it into 4 child-
regions; if any 4 regions with the same parent can
be merged into a single homogenous region, merge
them. If no region can be split or merged, go to (step
3).
3. If there are any two adjacent regions, Ri, Rj (even if
they are in different pyramid levels or do not have
the same parent) that can be merged into a
homogeneous region, merge them.
4. Merge small regions with the most similar adjacent
region if it is necessary to remove small-size
regions.

Example:
1. Original grey image
2. Split of a into 4 regions
3. Split b grey regions; one is still grey
4. Spit last c grey region = final quad tree

3.8 Edge Detection and Boundary Extraction
What is image edge detection?
Edge detection is a technique used to identify the
boundaries of objects within images. It helps in
simplifying the image data by reducing the amount of
information to be processed while preserving the
structural properties of the image. This simplification is
essential for various image analysis tasks, including
object recognition, segmentation, and image
enhancement.

Boundary Extraction
The boundary of the image is different from the edges in the image. Edges represent the abrupt change in pixel
intensity values while the boundary of the image is the contour. As the name boundary suggests that something
whose ownership changes, in the image when pixel ownership changes from one surface to another, the boundary
comes into the picture. Edge is basically the boundary line but the boundary is the line or location dividing the two
surfaces.

Types of boundary extraction techniques
There are two types of boundaries in binary images.
● Inner boundary: It is the difference
between the original image and the eroded
image. The eroded image is the shrunk
image when erosion is applied to the
original image. On taking the difference
between the original image and the eroded
version, we get the inner boundary of the
image. The inner boundary is the part of
the main surface separating the other
surface. Erosion shrinks the white portion
thus the boundary is the part of the white
surface itself.

● Outer boundary: It is the difference
between dilated image and an original
image. The dilated image is the
expanded image when dilation is
applied to the original image. Dilation
increases the white portion of the
image. On taking the difference
between dilated and original versions of
the image we get the boundary which is
the lost art of the black surface of the
original image.

Unit 3 Image Compression and Segmentation.pptx

Unit 3 Image Compression and Segmentation.pptx

More Related Content

Similar to Unit 3 Image Compression and Segmentation.pptx

Recently uploaded

Unit 3 Image Compression and Segmentation.pptx