SILICON
MAIN TOPIC
NAME: Rashmi Kanta Mohapatra
ROLL No.: 052
CONTENTS
• Introduction
• What, when
• Some questions
• Uses
• Major steps
• Types of data compression
• Disadvantages
• Conclusion
INTRODUCTION
Data Compression: What
• As the name implies, it makes your data smaller, saving space.
• It looks for repetitive sequences or patterns in the data, e.g. the repeated "the" in "the quick the brown fox the".
• We are more repetitive than we think: text often compresses by over 50%.
• Lossless vs. lossy.
Data Compression - WHY
• Most data from nature has redundancy.
• There is more data than the actual information contained in the data.
• Squeezing out the excess data amounts to compression.
• However, unsqueezing is necessary to be able to figure out what the data means.
• Is it always possible to compress?
• Consider a two-bit sequence.
• Can you always compress it to one bit?
• Such questions expose the limits of compression and give clues on how to compress well.
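The two-bit question can be checked by brute force. The sketch below is an illustration added here, not part of the original slides, and the helper names `all_bit_strings` and `is_lossless` are hypothetical. It tries every way of assigning a one-bit code to each of the four two-bit inputs and counts the assignments in which no two inputs share an output.

```python
from itertools import product

def all_bit_strings(n):
    """Return every bit string of length n, e.g. n=2 -> ['00', '01', '10', '11']."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def is_lossless(mapping):
    """A code can be undone only if no two inputs share the same output."""
    return len(set(mapping.values())) == len(mapping)

two_bit = all_bit_strings(2)   # 4 possible inputs
one_bit = all_bit_strings(1)   # 2 possible outputs

# Try every possible assignment of a one-bit output to each two-bit input.
lossless_codes = [
    dict(zip(two_bit, outputs))
    for outputs in product(one_bit, repeat=len(two_bit))
    if is_lossless(dict(zip(two_bit, outputs)))
]

print(len(lossless_codes))  # no assignment survives the check
```

Four inputs cannot fit into two outputs without a collision, so no lossless scheme can shrink every input.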
Question: Why do we want to make files smaller?
Answer:
  - To use less storage, i.e., saving costs.
  - To transmit these files faster, decreasing access time, or to keep the same access time with a lower and cheaper bandwidth.
  - To process the file sequentially faster.
MAJOR STEPS
Uncompressed data → Preparation → Quantization → Entropy encoding → Compressed data
• Preparation: Includes analog-to-digital conversion and generating an appropriate digital representation of the information. An image is divided into blocks of 8×8 pixels and represented by a fixed number of bits per pixel.
• Processing: The first stage of the compression process, which makes use of sophisticated algorithms.
• Quantization: Processes the results of the previous step. It specifies the granularity of the mapping of real numbers into integers. This process results in a reduction of precision.
• Entropy encoding: The last step. It compresses a sequential digital data stream without loss. For example, a sequence of zeroes can be compressed by specifying the number of occurrences.
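A minimal sketch of the last two steps, assuming a uniform quantizer with a hypothetical `step` parameter and a simple `(0, run_length)` token for runs of zeroes. The slides do not specify either format, so all names below are illustrative.

```python
import math

def quantize(samples, step):
    """Map real-valued samples to integer levels; a larger step is coarser."""
    return [math.floor(x / step + 0.5) for x in samples]

def dequantize(levels, step):
    """Approximate reconstruction; the precision lost in quantize() stays lost."""
    return [level * step for level in levels]

def encode_zero_runs(values):
    """Replace each run of zeroes with a (0, run_length) pair. Lossless."""
    out, i = [], 0
    while i < len(values):
        if values[i] == 0:
            j = i
            while j < len(values) and values[j] == 0:
                j += 1
            out.append((0, j - i))
            i = j
        else:
            out.append(values[i])
            i += 1
    return out

def decode_zero_runs(tokens):
    """Expand (0, run_length) pairs back into zeroes."""
    out = []
    for token in tokens:
        if isinstance(token, tuple):
            out.extend([0] * token[1])
        else:
            out.append(token)
    return out

samples = [0.1, 0.26, 0.49, 1.0, 0.02, 0.05, -0.03]
levels = quantize(samples, 0.25)       # lossy step: precision is reduced
tokens = encode_zero_runs(levels)      # lossless step: zero runs are counted
print(levels)
print(tokens)
```

Quantization is the lossy part: `dequantize` cannot recover the original samples. The zero-run step is fully reversible with `decode_zero_runs`.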
USES OF DATA COMPRESSION
• More and more data is being stored electronically. Digital video libraries, for example, contain vast amounts of data, and compression allows cost-effective storage of the data.
• New technology has allowed the possibility of interactive digital television, and the demand is for high-quality transmissions, a wide selection of programs to choose from, and inexpensive hardware. But for digital television to be a success, it must use data compression [Saxton, 1996]. Data compression reduces the number of bits required to represent or transmit information.
TYPES OF DATA COMPRESSION
• Entropy encoding - lossless. Data is considered a simple digital sequence, and the semantics of the data are ignored.
• Source encoding - lossy. Takes the semantics of the data into account. The amount of compression depends on the data contents.
• Hybrid encoding - a combination of entropy and source encoding. Most multimedia systems use these.
TYPES OF DATA COMPRESSION
• Entropy encoding - lossless.
  - Data in the data stream is considered a simple digital sequence, and the semantics of the data are ignored.
  - Short code words for frequently occurring symbols; longer code words for infrequently occurring symbols.
    * For example: E occurs frequently in English, so we should give it a shorter code than Q.
  - Examples of entropy encoding:
    * Lossless data compression
    * Huffman coding
    * Arithmetic coding
LOSSLESS DATA COMPRESSION
• Run-length coding
  - Runs (sequences) of data are stored as a single value and count, rather than as the individual run.
  - Example:
    * This:
      WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW
    * Becomes:
      12WB12W3B24WB14W
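The run-length example can be reproduced with a few lines of Python. This is a minimal sketch that follows the slide's convention of omitting the count for runs of length one. It assumes the input never contains digits, and the function names are illustrative.

```python
import re
from itertools import groupby

def rle_encode(text):
    """Store each run as count + symbol; a count of 1 is left implicit."""
    parts = []
    for symbol, run in groupby(text):
        count = len(list(run))
        parts.append(symbol if count == 1 else f"{count}{symbol}")
    return "".join(parts)

def rle_decode(encoded):
    """Inverse of rle_encode; a missing count means a run of length 1."""
    return "".join(
        symbol * int(count or 1)
        for count, symbol in re.findall(r"(\d*)(\D)", encoded)
    )

# 12 W, 1 B, 12 W, 3 B, 24 W, 1 B, 14 W: 67 characters in total.
original = "W" * 12 + "B" + "W" * 12 + "B" * 3 + "W" * 24 + "B" + "W" * 14
encoded = rle_encode(original)
print(encoded)                      # 12WB12W3B24WB14W
print(len(original), len(encoded))  # 67 16
```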
īŽ Data is not lost - the original is really needed.Data is not lost - the original is really needed.
īŽ text compression.text compression.
īŽ compression of computer binaries to fit on acompression of computer binaries to fit on a
floppy.floppy.
īŽ Compression ratio typically 2:1 to 8:1Compression ratio typically 2:1 to 8:1..
lossless compression on many kinds of files.lossless compression on many kinds of files.
īŽ Statistical Techniques:Statistical Techniques:
īŽ Huffman coding.Huffman coding.
īŽ Arithmetic coding.Arithmetic coding.
īŽ Dictionary techniques:Dictionary techniques:
īŽ LZW, LZ77.LZW, LZ77.
Standards - Morse code, Braille, Unix compress,Standards - Morse code, Braille, Unix compress,
gzip,gzip,
īŽ zip, bzip, GIF, PNG, JBIG, Lossless JPEG.zip, bzip, GIF, PNG, JBIG, Lossless JPEG.
SHANNON-FANO CODING
• Shannon's lossless source coding theorem is based on the concept of block coding. To illustrate this concept, we introduce a special information source in which the alphabet consists of only two letters: A = {a, b}.
1. First-Order Block Code
B1    P(B1)   Codeword
a     0.5     0
b     0.5     1
R = 1 bit/character
An example:
Note that 24 bits are used to represent 24 characters, an average of 1 bit/character.
īŽ Second-Order Block Code :-Second-Order Block Code :- Pairs ofPairs of
characters are mapped to either one, two, or threecharacters are mapped to either one, two, or three
bits.bits.
B2    P(B2)   Codeword
aa    0.45    0
bb    0.45    10
ab    0.05    110
ba    0.05    111
R = 0.825 bits/character
An example:
Note that 20 bits are used to represent 24 characters, an average of 0.83 bits/character.
īŽ Third-Order Block Code: -Third-Order Block Code: -Triplets ofTriplets of
characters are mapped to bit sequence of lengths onecharacters are mapped to bit sequence of lengths one
through six.through six.
B3    P(B3)   Codeword
aaa   0.405   0
bbb   0.405   10
aab   0.045   1100
abb   0.045   1101
bba   0.045   1110
baa   0.045   11110
aba   0.005   111110
bab   0.005   111111
R = 0.68 bits/character
īŽ An example:An example:
Note that 17 bits are used to represent 24Note that 17 bits are used to represent 24
characters --- an average of 0.71characters --- an average of 0.71
bits/character.bits/character.
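The rates R quoted for the three block codes can be recomputed from the tables. The sketch below assumes that the third-order probabilities are 0.405, 0.045 and 0.005, that the third-order code includes the eighth block bab with codeword 111111, and that bbb has codeword 10. These are the values implied by the second-order table and by R = 0.68. The function name `rate_bits_per_char` is illustrative.

```python
def rate_bits_per_char(code, block_size):
    """Average codeword length per source character.

    code maps each block to a (probability, codeword) pair.
    """
    bits_per_block = sum(p * len(word) for p, word in code.values())
    return bits_per_block / block_size

first_order = {"a": (0.5, "0"), "b": (0.5, "1")}

second_order = {
    "aa": (0.45, "0"), "bb": (0.45, "10"),
    "ab": (0.05, "110"), "ba": (0.05, "111"),
}

# Assumed reconstruction of the third-order table (see the lead-in).
third_order = {
    "aaa": (0.405, "0"), "bbb": (0.405, "10"),
    "aab": (0.045, "1100"), "abb": (0.045, "1101"),
    "bba": (0.045, "1110"), "baa": (0.045, "11110"),
    "aba": (0.005, "111110"), "bab": (0.005, "111111"),
}

for name, code, size in [("first", first_order, 1),
                         ("second", second_order, 2),
                         ("third", third_order, 3)]:
    print(name, round(rate_bits_per_char(code, size), 3))
```

Larger blocks let the code exploit the dependence between neighbouring characters, which is why the rate falls from 1 to 0.825 to 0.68 bits/character.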
HUFFMAN CODING
• Suppose messages are made of letters a, b, c, d, and e, which appear with probabilities .12, .4, .15, .08, and .25, respectively.
• We wish to encode each character into a sequence of 0’s and 1’s so that no code for a character is the prefix for another.
• Answer (using Huffman’s algorithm given on the next slide): a=1111, b=0, c=110, d=1110, e=10.
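Huffman's algorithm can be sketched with a priority queue: repeatedly merge the two lightest subtrees, prefixing the codes in one with 0 and in the other with 1. The sketch below is one possible implementation, not necessarily the one the slides had in mind. The choice of which subtree receives 0 is an assumption, and other choices give different but equally short codes.

```python
import heapq
from itertools import count

def huffman_codes(weights):
    """Build a prefix code from a {symbol: weight} mapping."""
    tiebreak = count()  # keeps heap entries comparable when weights are equal
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, codes0 = heapq.heappop(heap)   # lightest subtree gets prefix 0
        w1, _, codes1 = heapq.heappop(heap)   # next lightest gets prefix 1
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]

probabilities = {"a": 0.12, "b": 0.4, "c": 0.15, "d": 0.08, "e": 0.25}
codes = huffman_codes(probabilities)
average_length = sum(probabilities[s] * len(c) for s, c in codes.items())

for symbol in sorted(codes):
    print(symbol, codes[symbol])
print(round(average_length, 2))  # 2.15 bits per letter
```

With these probabilities every merge is unambiguous, and the sketch yields code lengths 4, 1, 3, 4, 2 for a, b, c, d, e, the same lengths as the answer above.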
HUFFMAN CODING
• Example
• n = 5, w[0:4] = [2, 5, 4, 7, 9].
[Figure: five single-node trees with weights 2, 5, 4, 7, 9]
HUFFMAN CODING
• Example
• n = 5, w[0:4] = [2, 5, 4, 7, 9].
[Figure: the two smallest weights, 2 and 4, are merged into a subtree of weight 6; the remaining trees have weights 5, 7, 9]
HUFFMAN CODING
• Example
• n = 5, w[0:4] = [2, 5, 4, 7, 9].
[Figure: 5 and the subtree of weight 6 (leaves 2 and 4) are merged into a subtree of weight 11; 7 and 9 are merged into a subtree of weight 16]
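HUFFMAN CODING
• Example
• n = 5, w[0:4] = [2, 5, 4, 7, 9].
• 2 = 010
• 5 = 00
• 4 = 011
• 7 = 10
• 9 = 11
[Figure: final tree with root 27; the left subtree of weight 11 holds leaf 5 and the subtree of weight 6 (leaves 2 and 4); the right subtree of weight 16 holds leaves 7 and 9; left edges are labelled 0 and right edges 1]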
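Because no code word in the table above is a prefix of another, a bit stream can be decoded from left to right without separators. The sketch below uses the code table from this example. The helper names and the sample message are illustrative assumptions.

```python
CODE = {2: "010", 5: "00", 4: "011", 7: "10", 9: "11"}

def is_prefix_free(code):
    """True if no code word is a prefix of a different code word."""
    words = list(code.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)

def encode(symbols, code):
    """Concatenate the code words of the symbols."""
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    """Read bits left to right, emitting a symbol whenever a code word completes."""
    lookup = {word: sym for sym, word in code.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            symbols.append(lookup[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a code word")
    return symbols

message = [9, 2, 5, 4, 7]
bits = encode(message, CODE)
print(bits)                 # 110100001110
print(decode(bits, CODE))   # [9, 2, 5, 4, 7]

# Weighted path length: sum of weight * code length over all leaves.
print(sum(w * len(word) for w, word in CODE.items()))  # 60
```

The weighted path length of 60 equals the sum of the internal node weights in the tree, 6 + 11 + 16 + 27.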
LZ-77 ENCODING
• Good as they are, Huffman and arithmetic coding are not perfect for encoding text because they don't capture the higher-order relationships between words and phrases. There is a simple, clever, and effective approach to compressing text known as "LZ-77", which uses the redundant nature of text to provide compression.
For an example, consider the phrase:
the_rain_in_Spain_falls_mainly_in_the_plain
where the underscores ("_") indicate spaces. This uncompressed message is 43 bytes, or 344 bits, long.
the_rain_in_Spain_falls_mainly_in_the_plain
At first, LZ-77 simply outputs uncompressed characters, since there are no previous occurrences of any strings to refer back to. In our example, these characters will not be compressed:
1- the_rain_
The next chunk of the message, "in_", has occurred earlier in the message, and can be represented as a pointer back to that earlier text, along with a length field. This gives:
2- the_rain_<3,3>
Here <3,3> means: go back 3 characters and copy 3 characters.
the_rain_in_Spain_falls_mainly_in_the_plain
The next characters, "Sp", have not occurred before, so they have to be output uncompressed:
3- the_rain_<3,3>Sp
However, the characters "ain_" have already been sent, so they are encoded with a pointer:
4- the_rain_<3,3>Sp<9,4>
The characters "falls_m" are output uncompressed, but "ain" has been used before in "rain" and "Spain", so once again it is encoded with a pointer:
5- the_rain_<3,3>Sp<9,4>falls_m<11,3>
the_rain_in_Spain_falls_mainly_in_the_plain
6- the_rain_<3,3>Sp<9,4>falls_m<11,3>ly_<16,3><34,4>
FINAL STEP
7- the_rain_<3,3>Sp<9,4>falls_m<11,3>ly_<16,3><34,4>pl<15,3>
The compressed text contains 23 literal characters plus 6 pointers; the original text is 43 bytes long.
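The pointers in the walkthrough can be checked with a small decoder. This is a minimal sketch that represents each `<distance,length>` pointer as a Python tuple and each uncompressed chunk as a string. That token format is an assumption made here for illustration, not a real LZ-77 file format.

```python
def lz77_decode(tokens):
    """Rebuild the text from literal strings and (distance, length) pointers."""
    out = []
    for token in tokens:
        if isinstance(token, str):
            out.extend(token)                 # literal characters
        else:
            distance, length = token
            start = len(out) - distance       # go back `distance` characters
            for i in range(length):           # copy `length` characters
                out.append(out[start + i])
    return "".join(out)

# Final token stream from the walkthrough above.
TOKENS = ["the_rain_", (3, 3), "Sp", (9, 4), "falls_m", (11, 3),
          "ly_", (16, 3), (34, 4), "pl", (15, 3)]

decoded = lz77_decode(TOKENS)
literal_count = sum(len(t) for t in TOKENS if isinstance(t, str))
pointer_count = sum(1 for t in TOKENS if isinstance(t, tuple))

print(decoded)                        # the_rain_in_Spain_falls_mainly_in_the_plain
print(literal_count, pointer_count)   # 23 6
```

The real size of the compressed stream depends on how many bits each pointer takes, which the slides do not state.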
ARITHMETIC CODING
• Huffman coding looks pretty slick, and it is, but there's a way to improve on it, known as "arithmetic coding". The idea is subtle and best explained by example.
• Suppose we have a message that only contains the characters A, B, and C, with the following frequencies, expressed as fractions:
• A: 0.5   B: 0.2   C: 0.3
letter   probability   interval    binary fraction
______   ___________   _________   ___________________
C:       0.3           0.0 : 0.3   0
B:       0.2           0.3 : 0.5   0.011 = 3/8 = 0.375
A:       0.5           0.5 : 1.0   0.1 = 1/2 = 0.5
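The interval table can be turned into a short sketch of the core step of arithmetic coding: each symbol narrows the current interval to the sub-interval assigned to that symbol. The code uses exact fractions and omits the bit-level output a real coder would produce. The function names and the sample message "AB" are assumptions for illustration.

```python
from fractions import Fraction

# Sub-intervals [low, high) from the table above.
INTERVALS = {
    "C": (Fraction(0), Fraction(3, 10)),
    "B": (Fraction(3, 10), Fraction(1, 2)),
    "A": (Fraction(1, 2), Fraction(1)),
}

def encode_interval(message):
    """Narrow [0, 1) once per symbol; any number in the result identifies the message."""
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        sym_low, sym_high = INTERVALS[symbol]
        width = high - low
        low, high = low + width * sym_low, low + width * sym_high
    return low, high

def decode_value(value, length):
    """Recover `length` symbols from a number inside the final interval."""
    symbols = []
    for _ in range(length):
        for symbol, (sym_low, sym_high) in INTERVALS.items():
            if sym_low <= value < sym_high:
                symbols.append(symbol)
                # Rescale so the chosen sub-interval becomes [0, 1) again.
                value = (value - sym_low) / (sym_high - sym_low)
                break
    return "".join(symbols)

low, high = encode_interval("AB")
print(low, high)                            # 13/20 3/4
print(decode_value((low + high) / 2, 2))    # AB
```

A single-letter message "B" maps to the interval 0.3 : 0.5, which contains the binary fraction 0.011 = 3/8 listed in the table.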
IRREVERSIBLE COMPRESSION
• Irreversible compression is based on the assumption that some information can be sacrificed. [Irreversible compression is also called entropy reduction.]
• Example: Shrinking a raster image from 400-by-400 pixels to 100-by-100 pixels. The new image contains 1 pixel for every 16 pixels in the original image.
• There is usually no way to determine what the original pixels were from the one new pixel.
• In data files, irreversible compression is seldom used. However, it is used in image and speech processing.
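The raster example can be sketched as block averaging: each 4-by-4 block of the original becomes one pixel. Averaging is an assumption made here, since the slides do not say how the new pixel is computed. The sketch represents the image as a list of rows of grey values whose dimensions are multiples of the factor.

```python
def downsample(image, factor):
    """Shrink a 2-D list of grey values by averaging each factor x factor block."""
    height, width = len(image), len(image[0])
    block_area = factor * factor
    return [
        [
            sum(image[r + dr][c + dc]
                for dr in range(factor)
                for dc in range(factor)) // block_area
            for c in range(0, width, factor)
        ]
        for r in range(0, height, factor)
    ]

# A 400-by-400 image shrinks to 100-by-100: 1 pixel per 16 original pixels.
big = [[0] * 400 for _ in range(400)]
small = downsample(big, 4)
print(len(small), len(small[0]))   # 100 100

# Two different 4-by-4 blocks that collapse to the same single pixel.
flat = [[8, 8, 8, 8] for _ in range(4)]
striped = [[0, 16, 0, 16] for _ in range(4)]
print(downsample(flat, 4), downsample(striped, 4))   # [[8]] [[8]]
```

Since the flat and the striped block give the same result, the original pixels cannot be determined from the new pixel.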
LOSSY COMPRESSION
• Data is lost, but not too much:
  - Audio.
  - Video.
  - Still images, medical images, photographs.
• Compression ratios of 10:1 often yield quite ...
• Major techniques include:
  - Vector quantization.
  - Block transforms.
• Standards: JPEG, JPEG 2000, MPEG (1, 2, 4, 7).
IMAGE COMPRESSION
a) 24-bit true colour bitmap (253,014 bytes)
b) 60% image quality (5,599 bytes)
DISADVANTAGES
Some techniques compress data very efficiently, but with them there is a chance of losing data.
CONCLUSION
From the above description, no single algorithm has been developed that is applicable to every data file. This difficulty can be handled by using hybrid data compression. In this IT era, data compression is essential, even though some data will be lost.
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 

Recently uploaded (20)

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
đŸŦ The future of MySQL is Postgres 🐘
đŸŦ  The future of MySQL is Postgres   🐘đŸŦ  The future of MySQL is Postgres   🐘
đŸŦ The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 

Data compression

  • 2. Contents
      - Introduction
      - What, when
      - Some questions
      - Uses
      - Major steps
      - Types of data compression
      - Disadvantages
      - Conclusion
  • 3. INTRODUCTION: Data Compression - What
      - As the name implies, it makes your data smaller, saving space.
      - It looks for repetitive sequences or patterns in data, e.g. the repeated "the" in "the quick the brown fox the".
      - We are more repetitive than we think: text often compresses by over 50%.
      - Lossless vs. lossy.
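The effect of repetition on compressed size can be tried with a short sketch. It assumes Python's standard `zlib` module (a DEFLATE implementation, which the slides do not name) and an artificially repetitive sample text, so the ratio it prints is far better than what ordinary prose would give.

```python
import zlib

# Hypothetical sample: the slide's phrase repeated many times (highly redundant).
text = ("the quick the brown fox the " * 50).encode("ascii")

packed = zlib.compress(text)        # lossless compression
restored = zlib.decompress(packed)  # exact reconstruction of the input

ratio = len(packed) / len(text)
print(len(text), len(packed), f"{ratio:.1%}")
```

Because the compression is lossless, `restored` is byte-for-byte identical to `text`.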
  • 4. Data Compression - WHY
      - Most data from nature has redundancy: there is more data than the actual information contained in the data.
      - Squeezing out the excess data amounts to compression.
      - However, unsqueezing is necessary to be able to figure out what the data means.
      - Is it always possible to compress?
          - Consider a two-bit sequence.
          - Can you always compress it to one bit?
      - Information theory describes the limits of compression and gives clues on how to compress well.
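The two-bit question can be answered by brute force. The sketch below (the names are illustrative, not from the slides) enumerates every possible mapping from the four two-bit sequences to one-bit codes and checks whether any of them is reversible.

```python
from itertools import product

inputs = ["00", "01", "10", "11"]   # every two-bit sequence
one_bit_codes = ["0", "1"]          # every one-bit codeword

def is_lossless(mapping):
    """A mapping can be undone only if no two inputs share a codeword."""
    return len(set(mapping.values())) == len(mapping)

# All 2**4 ways of assigning a one-bit code to each two-bit input.
all_mappings = [dict(zip(inputs, codes))
                for codes in product(one_bit_codes, repeat=len(inputs))]
lossless = [m for m in all_mappings if is_lossless(m)]

print(len(all_mappings), len(lossless))  # prints: 16 0
```

No mapping survives: four inputs cannot fit into two codewords without a collision, so a lossless compressor can shorten some inputs only by lengthening others.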
  • 5. Question: Why do we want to make files smaller?
    Answer:
      - To use less storage, i.e., saving costs.
      - To transmit these files faster, decreasing access time, or using the same access time but with a lower and cheaper bandwidth.
      - To process the file sequentially faster.
  • 6. MAJOR STEPS
      Uncompressed data → Preparation → Quantization → Entropy encoding → Compressed data
  • 7.
      - Preparation: It includes analog-to-digital conversion and generating an appropriate digital representation of the information. An image is divided into blocks of 8×8 pixels and represented by a fixed number of bits per pixel.
      - Processing: It is the first stage of the compression process, which makes use of sophisticated algorithms.
      - Quantization: It processes the result of the previous step. It specifies the granularity of the mapping of real numbers into integers. This process results in a reduction of precision.
      - Entropy encoding: It is the last step. It compresses a sequential digital data stream without loss, for example by compressing a sequence of zeroes by specifying the number of occurrences.
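The last two steps can be sketched in a few lines. This is only an illustration under stated assumptions: the step size of 0.25, the sample values, and the `("Z", count)` token for a run of zeroes are all hypothetical choices, not something the slides specify.

```python
def quantize(values, step):
    """Map real numbers to integers; a coarser step loses more precision."""
    return [round(v / step) for v in values]

def encode_zero_runs(symbols):
    """Replace each run of zeroes by a ("Z", count) token (hypothetical format)."""
    out, zeros = [], 0
    for s in symbols:
        if s == 0:
            zeros += 1
            continue
        if zeros:
            out.append(("Z", zeros))
            zeros = 0
        out.append(s)
    if zeros:
        out.append(("Z", zeros))
    return out

samples = [0.12, -0.03, 0.04, 0.0, 0.51, 0.02]  # hypothetical real-valued input
levels = quantize(samples, step=0.25)           # [0, 0, 0, 0, 2, 0]
packed = encode_zero_runs(levels)               # [("Z", 4), 2, ("Z", 1)]
print(levels, packed)
```

Quantization is the lossy part (0.12 and 0.04 both become 0), while the zero-run encoding that follows it loses nothing.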
  • 8. USES OF DATA COMPRESSION
      - More and more data is being stored electronically. Digital video libraries, for example, contain vast amounts of data, and compression allows cost-effective storage of the data.
      - New technology has made interactive digital television possible, and the demand is for high-quality transmissions, a wide selection of programs to choose from, and inexpensive hardware. But for digital television to be a success, it must use data compression [Saxton, 1996].
      - Data compression reduces the number of bits required to represent or transmit information.
  • 9. TYPES OF DATA COMPRESSION
      - Entropy encoding: lossless. Data is considered a simple digital sequence, and the semantics of the data are ignored.
      - Source encoding: lossy. Takes the semantics of the data into account. The amount of compression depends on the data contents.
      - Hybrid encoding: a combination of entropy and source encoding. Most multimedia systems use these.
  • 10. TYPES OF DATA COMPRESSION
      - Entropy encoding: lossless.
          - Data in the data stream is considered a simple digital sequence, and the semantics of the data are ignored.
          - Short code words for frequently occurring symbols; longer code words for infrequently occurring symbols.
              - For example: E occurs frequently in English, so we should give it a shorter code than Q.
          - Examples of entropy encoding:
              - Lossless data compression
              - Huffman coding
              - Arithmetic coding
  • 11. LOSSLESS DATA COMPRESSION
      - Run-length coding
          - Runs (sequences) of data are stored as a single value and count, rather than as the individual run.
          - Example:
              - This: WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW
              - Becomes: 12WB12W3B24WB14W
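A minimal run-length coder in the slide's notation might look like the sketch below. It assumes, as the slide's output does, that a run of length 1 is written without a count, and that the symbols themselves are never digits; the function names are illustrative.

```python
import re
from itertools import groupby

def rle_encode(text):
    """Write each run as <count><symbol>, omitting the count for single symbols."""
    parts = []
    for symbol, run in groupby(text):
        n = len(list(run))
        parts.append(symbol if n == 1 else f"{n}{symbol}")
    return "".join(parts)

def rle_decode(encoded):
    """Inverse of rle_encode; assumes symbols are non-digit characters."""
    return "".join(symbol * int(count or 1)
                   for count, symbol in re.findall(r"(\d*)(\D)", encoded))

# The run lengths behind the slide's encoded form: 12W B 12W 3B 24W B 14W.
original = "W" * 12 + "B" + "W" * 12 + "BBB" + "W" * 24 + "B" + "W" * 14
encoded = rle_encode(original)
print(encoded)  # prints: 12WB12W3B24WB14W
```

The 67-character input shrinks to 16 characters, and `rle_decode(encoded)` restores it exactly.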
  • 12. Data is not lost: the original is really needed.
      - Text compression.
      - Compression of computer binaries to fit on a floppy.
      - Compression ratio is typically 2:1 to 8:1 for lossless compression on many kinds of files.
      - Statistical techniques:
          - Huffman coding.
          - Arithmetic coding.
      - Dictionary techniques:
          - LZW, LZ77.
      - Standards: Morse code, Braille, Unix compress, gzip, zip, bzip, GIF, PNG, JBIG, Lossless JPEG.
  • 13. SHANNON-FANO CODING
      - Shannon's lossless source coding theorem is based on the concept of block coding. To illustrate this concept, we introduce a special information source in which the alphabet consists of only two letters: A = {a, b}.
      1. First-order block code
  • 14.
        B1    P(B1)    Codeword
        a     0.5      0
        b     0.5      1
        R = 1 bit/character
  • 15. An example: note that 24 bits are used to represent 24 characters, an average of 1 bit/character.
  • 16. Second-order block code: pairs of characters are mapped to either one, two, or three bits.
  • 17.
        B2    P(B2)    Codeword
        aa    0.45     0
        bb    0.45     10
        ab    0.05     110
        ba    0.05     111
        R = 0.825 bits/character
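The rate R in these tables is the expected codeword length divided by the number of characters per block. A small sketch, using the probabilities and codewords from the first-order and second-order tables (the `rate` helper is an illustrative name), reproduces both figures.

```python
def rate(code, block_len):
    """Expected bits per character: sum of P(block) * len(codeword), per character."""
    avg_bits_per_block = sum(p * len(word) for p, word in code.values())
    return avg_bits_per_block / block_len

first_order = {"a": (0.5, "0"), "b": (0.5, "1")}
second_order = {
    "aa": (0.45, "0"),
    "bb": (0.45, "10"),
    "ab": (0.05, "110"),
    "ba": (0.05, "111"),
}

print(round(rate(first_order, 1), 3))   # prints: 1.0
print(round(rate(second_order, 2), 3))  # prints: 0.825
```

For the second-order code the sum is 0.45·1 + 0.45·2 + 0.05·3 + 0.05·3 = 1.65 bits per pair, i.e. 0.825 bits per character.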
  • 18. An example: note that 20 bits are used to represent 24 characters, an average of 0.83 bits/character.
  • 19. Third-order block code: triplets of characters are mapped to bit sequences of lengths one through six.
  • 20.
        B3     P(B3)    Codeword
        aaa    0.405    0
        bbb    0.405    10
        aab    0.045    1100
        abb    0.045    1101
        bba    0.045    1110
        baa    0.045    11110
        aba    0.005    111110
        bab    0.005    111111
        R = 0.68 bits/character
  • 21. An example: note that 17 bits are used to represent 24 characters, an average of 0.71 bits/character.
  • 22. HUFFMAN CODING
      - Suppose messages are made of letters a, b, c, d, and e, which appear with probabilities .12, .4, .15, .08, and .25, respectively.
      - We wish to encode each character into a sequence of 0's and 1's so that no code for a character is the prefix of another.
      - Answer (using Huffman's algorithm given on the next slide): a=1111, b=0, c=110, d=1110, e=10.
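One common way to build such a code is to repeatedly merge the two least probable subtrees with a priority queue. The sketch below is an assumption about how that could be written in Python with `heapq`, not the deck's own algorithm text; the choice of which merged branch gets 0 and which gets 1 is arbitrary, and here the rarer branch gets 0.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Return {symbol: bitstring} built by merging the two rarest subtrees."""
    tiebreak = count()  # unique index so the heap never has to compare dicts
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)  # rarest subtree gets prefix 0
        p1, _, codes1 = heapq.heappop(heap)  # next rarest gets prefix 1
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

probs = {"a": 0.12, "b": 0.40, "c": 0.15, "d": 0.08, "e": 0.25}
codes = huffman_code(probs)
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

With this labelling the result has the same code lengths as the slide's answer (1 bit for b, 2 for e, 3 for c, 4 for a and d). Other 0/1 labellings are equally valid Huffman codes.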
  • 23. HUFFMAN CODING
      - Example: n = 5, w[0:4] = [2, 5, 4, 7, 9].
      [Figure: five leaf nodes with weights 2, 5, 4, 7, 9]
  • 24. HUFFMAN CODING
      - Example: n = 5, w[0:4] = [2, 5, 4, 7, 9].
      [Figure: leaves 2 and 4 merged under a node of weight 6; leaves 5, 7, 9 remain]
  • 25. HUFFMAN CODING
      - Example: n = 5, w[0:4] = [2, 5, 4, 7, 9].
      [Figure: nodes 5 and 6 merged under a node of weight 11; leaves 7 and 9 merged under a node of weight 16]
  • 26. HUFFMAN CODING
      - Example: n = 5, w[0:4] = [2, 5, 4, 7, 9].
      - 2 = 010
      - 5 = 00
      - 4 = 011
      - 7 = 10
      - 9 = 11
      [Figure: complete tree with root 27 joining nodes 11 and 16; left edges labeled 0, right edges labeled 1]
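The sequence of merges in this example can be traced with a min-heap. The sketch below (the function name is illustrative) records each merge as a triple of the two weights taken and the weight of the new node.

```python
import heapq

def huffman_merges(weights):
    """Return the (smallest, second smallest, sum) triples in merge order."""
    heap = list(weights)
    heapq.heapify(heap)
    merges = []
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        merges.append((a, b, a + b))
        heapq.heappush(heap, a + b)
    return merges

merges = huffman_merges([2, 5, 4, 7, 9])
print(merges)  # prints: [(2, 4, 6), (5, 6, 11), (7, 9, 16), (11, 16, 27)]

# The internal node weights add up to the total weighted code length.
total = sum(s for _, _, s in merges)
print(total)   # prints: 60
```

The total of 60 agrees with the codes listed for this example: 2·3 + 5·2 + 4·3 + 7·2 + 9·2 = 60.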
  • 27. LZ-77 ENCODING
      - Good as they are, Huffman and arithmetic coding are not perfect for encoding text because they don't capture the higher-order relationships between words and phrases. There is a simple, clever, and effective approach to compressing text known as "LZ-77", which uses the redundant nature of text to provide compression.
  • 28. For an example, consider the phrase:
        the_rain_in_Spain_falls_mainly_in_the_plain
      where the underscores ("_") indicate spaces. This uncompressed message is 43 bytes, or 344 bits, long.
  • 29. the_rain_in_Spain_falls_mainly_in_the_plain
      - At first, LZ-77 simply outputs uncompressed characters, since there are no previous occurrences of any strings to refer back to. In our example, these characters will not be compressed:
          1. the_rain_
      - The next chunk of the message, "in_", has occurred earlier in the message, and can be represented as a pointer back to that earlier text, along with a length field. Here <3,3> means: go back 3 characters and copy 3 characters. This gives:
          2. the_rain_<3,3>
  • 30. the_rain_in_Spain_falls_mainly_in_the_plain
      - The next characters, "Sp", have not occurred before, so they have to be output uncompressed:
          3. the_rain_<3,3>Sp
      - However, the characters "ain_" have already been sent, so they are encoded with a pointer:
          4. the_rain_<3,3>Sp<9,4>
      - The characters "falls_m" are output uncompressed, but "ain" has been used before in "rain" and "Spain", so once again it is encoded with a pointer:
          5. the_rain_<3,3>Sp<9,4>falls_m<11,3>
  • 31. the_rain_in_Spain_falls_mainly_in_the_plain
          6. the_rain_<3,3>Sp<9,4>falls_m<11,3>ly_<16,3><34,4>
      - Final step:
          7. the_rain_<3,3>Sp<9,4>falls_m<11,3>ly_<16,3><34,4>pl<15,3>
      - The compressed form contains 23 literal characters plus six pointers; the original message is 43 characters long.
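The final token stream can be checked by decoding it. The sketch below assumes a simple in-memory representation that the slides do not prescribe: a literal is a Python string, and a pointer <distance,length> is a tuple meaning "go back `distance` characters and copy `length` characters".

```python
def lz77_decode(tokens):
    """Rebuild the message from literals (str) and (distance, length) pointers."""
    out = []
    for token in tokens:
        if isinstance(token, str):
            out.extend(token)               # literal characters, copied as-is
        else:
            distance, length = token
            start = len(out) - distance     # where the earlier occurrence begins
            for i in range(length):
                out.append(out[start + i])  # copy one character at a time
    return "".join(out)

# the_rain_<3,3>Sp<9,4>falls_m<11,3>ly_<16,3><34,4>pl<15,3>
TOKENS = ["the_rain_", (3, 3), "Sp", (9, 4), "falls_m", (11, 3),
          "ly_", (16, 3), (34, 4), "pl", (15, 3)]

message = lz77_decode(TOKENS)
literal_count = sum(len(t) for t in TOKENS if isinstance(t, str))
print(message)                      # prints: the_rain_in_Spain_falls_mainly_in_the_plain
print(literal_count, len(message))  # prints: 23 43
```

Each pointer still has to be stored in some number of bits, so the real saving depends on how compactly the distance and length fields are encoded.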
  • 32. ARITHMETIC CODING
      - Huffman coding looks pretty slick, and it is, but there's a way to improve on it, known as "arithmetic coding". The idea is subtle and best explained by example.
      - Suppose we have a message that only contains the characters A, B, and C, with the following frequencies, expressed as fractions:
          A: 0.5    B: 0.2    C: 0.3
  • 33.
        letter    probability    interval     binary fraction
        C         0.3            0.0 : 0.3    0
        B         0.2            0.3 : 0.5    0.011 = 3/8 = 0.375
        A         0.5            0.5 : 1.0    0.1 = 1/2 = 0.5
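For messages longer than one letter, arithmetic coding keeps narrowing the current interval by the sub-interval of each successive letter. The sketch below illustrates that narrowing with the intervals from the table. It uses exact fractions to avoid rounding and leaves out the final step of picking a short binary fraction inside the interval, so it is an illustration of the idea rather than a complete coder.

```python
from fractions import Fraction as F

# Sub-intervals from the table: C = [0, 0.3), B = [0.3, 0.5), A = [0.5, 1.0).
INTERVALS = {
    "C": (F(0), F(3, 10)),
    "B": (F(3, 10), F(1, 2)),
    "A": (F(1, 2), F(1)),
}

def narrow(message):
    """Return the [low, high) interval that identifies the whole message."""
    low, high = F(0), F(1)
    for letter in message:
        width = high - low
        sub_low, sub_high = INTERVALS[letter]
        low, high = low + width * sub_low, low + width * sub_high
    return low, high

print(narrow("B"))   # prints: (Fraction(3, 10), Fraction(1, 2))
print(narrow("AB"))  # prints: (Fraction(13, 20), Fraction(3, 4))
```

The table's binary fraction for B, 0.011 = 3/8, lies inside [3/10, 1/2), which is why it can stand for the one-letter message "B".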
  • 34. Irreversible Compression
      - Irreversible compression is based on the assumption that some information can be sacrificed. [Irreversible compression is also called entropy reduction.]
      - Example: shrinking a raster image from 400-by-400 pixels to 100-by-100 pixels. The new image contains 1 pixel for every 16 pixels in the original image.
      - There is usually no way to determine what the original pixels were from the one new pixel.
      - In data files, irreversible compression is seldom used. However, it is used in image and speech processing.
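The shrinking example can be sketched as follows. The slide does not say how the 16 original pixels are combined, so averaging each 4-by-4 block is an assumption made here, as are the two grayscale test images.

```python
def downsample(image, factor=4):
    """Replace each factor-by-factor block by its average (assumed method).

    Assumes the image is a list of rows whose dimensions divide evenly.
    """
    out = []
    for r in range(0, len(image), factor):
        row = []
        for c in range(0, len(image[0]), factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out

# Two different hypothetical 400-by-400 images: flat gray and a checkerboard.
flat = [[8] * 400 for _ in range(400)]
checker = [[16 * ((r + c) % 2) for c in range(400)] for r in range(400)]

small_flat = downsample(flat)
small_checker = downsample(checker)
print(len(small_flat), len(small_flat[0]))  # prints: 100 100
print(small_flat == small_checker)          # prints: True
```

Two distinct originals produce the same 100-by-100 result, which is exactly why the original pixels cannot be determined from the new ones.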
  • 35. LOSSY COMPRESSION
      - Data is lost, but not too much:
          - Audio.
          - Video.
          - Still images, medical images, photographs.
      - Compression ratios of 10:1 often yield quite ...
      - Major techniques include:
          - Vector quantization.
          - Block transforms.
      - Standards: JPEG, JPEG 2000, MPEG (1, 2, 4, 7).
  • 36. IMAGE COMPRESSION
      a) 24-bit true colour bitmap (253,014 bytes)
      b) 60% image quality (5,599 bytes)
  • 37. DISADVANTAGES
      - There are techniques by which data can be compressed efficiently, but with them there is a chance of losing data.
  • 38. CONCLUSION
      - From the above description, no single algorithm has been developed that is applicable to every data file. But this difficulty can be handled by using hybrid data compression. In this IT era, data compression is essential, even though some data will be lost.