DATA COMPRESSION ALGORITHMS
- MOHNISH REDDY
(16CS01034)
What is ‘Data Compression’?
As the name suggests, it is a technique by which data is compressed. Digital
data such as text files, images, video files, game files, etc. are made smaller
in size while preserving the original information as much as possible.
Why do we need ‘Data Compression’?
There are two main reasons why we need data compression:
● Large files take up a lot of space. Files that are not used frequently
can be compressed and stored, which takes up much less space, and
decompressed easily when required.
● The second use is far more common and widely applicable: downloading
large files from the internet is slow and consumes bandwidth, so by
downloading compressed files we save both time and internet data.
Lossless and Lossy Data Compression
Lossless : On decompressing the
compressed file there is absolutely no
loss of data/quality.
Lossy : On decompression there is
some loss of data/quality, which may
be small or large.
Different algorithms for data compression
● Run-Length Encoding
● Huffman Coding
● Lempel-Ziv-Welch Encoding
● Arithmetic coding
● Delta Encoding
● Adaptive Huffman coding
● Wavelet compression
● Discrete Cosine Transform
Run Length Encoding
This is the most basic method for data compression: as the name suggests, while
iterating over the input we look for runs of repeated characters and encode each
run in a shorter/compressed form.
Complexity : O(n)
Example:
➢ Take a line with ‘a’ as the repeating character.
➢ Each run of ‘a’ is replaced by two characters in the compressed line: a
count followed by the character.
➢ For the first 8 repeated a’s in the original file, the first encoded pair in
the compressed line shows that ‘a’ was repeated 8 times.
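The run-length scheme described above can be sketched in a few lines of Python. This is a minimal illustration in my own helper names, not code from the slides; it assumes the input contains no digit characters (counts and data would otherwise be ambiguous):

```python
# Minimal run-length encoder/decoder sketch (names are illustrative).
# Each run of a repeated character becomes a count + character pair,
# e.g. the 8 repeated a's below encode as "8a".
# Assumption: the input text itself contains no digits.

def rle_encode(text):
    if not text:
        return ""
    out = []
    count = 1
    for prev, ch in zip(text, text[1:]):
        if ch == prev:
            count += 1          # still inside the same run
        else:
            out.append(f"{count}{prev}")  # close the finished run
            count = 1
    out.append(f"{count}{text[-1]}")      # close the final run
    return "".join(out)

def rle_decode(encoded):
    out = []
    count = ""
    for ch in encoded:
        if ch.isdigit():
            count += ch         # accumulate multi-digit counts
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)

print(rle_encode("aaaaaaaabbb"))  # 8a3b
```

Note that RLE only helps when the data actually contains long runs; on text with few repeats (e.g. "abc" → "1a1b1c") it can make the output larger.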
Huffman Coding
The characters in a data file are converted to binary codes. The most common
characters in the input file (those with higher probability) are assigned short
binary codes, and the least common characters (those with lower probability) are
assigned longer binary codes. Codes can therefore be of different lengths.
Complexity : O(n log n)
Example : MISSISSIPPI_RIVER
= 17 characters = (17 × 8) bits (as 1 ASCII char = 8 bits)
= 136 bits (original number of bits/space consumed)
A Huffman code built from the character frequencies (I: 5, S: 4, P: 2, R: 2,
M: 1, _: 1, V: 1, E: 1) encodes the string in 46 bits.
Compression Rate = (136 − 46) / 136 = 66.18%
Drawbacks
Though there are not many drawbacks of data compression some small
disadvantages are listed below..
● Data Compression is mostly time consuming.
● There are not many optimal data compression techniques, some which exists
are not always efficient.
● Due to UV radiation or Magnetic field there is some loss of data, on non-
compressed files its not very destructive but in compressed loss of small data
can lead to loss of large chunks of data.
Editor's Notes

  • #13 Suppose we are watching a 1920x1080 video: that gives about 2 million pixels per frame. At a typical rate of 30 frames/sec, that is about 62 million pixels/sec. If each pixel takes 24 bits of information, we get a whopping 178 MB/sec, which comes to roughly 52 GB for a single 5-minute video.