Arithmetic Coding
Gabriele Monfardini - Corso di Basi di Dati Multimediali (Multimedia Databases course), a.a. 2005-2006
How can we do better than Huffman? - I
• As we have seen, the main drawback of the Huffman scheme is that it has problems when there is a symbol with a very high probability
• Remember the static Huffman redundancy bound
  redundancy ≤ p₁ + 0.086
  where p₁ is the probability of the most likely symbol
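• For example, a symbol with probability 0.9 carries only -log₂ 0.9 ≈ 0.15 bits of information, yet Huffman coding must still spend at least one whole bit on its codeword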
How can we do better than Huffman? - II
• The only way to overcome this limitation is to use, as symbols, “blocks” of several characters.
  In this way the per-symbol inefficiency is spread over the whole block
• However, the use of blocks is difficult to implement, as there must be a block for every possible combination of symbols, so the number of blocks increases exponentially with their length (see the sketch below)
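As a rough illustration of that growth (my own sketch, not part of the slides; a 26-letter alphabet is assumed):

```python
# Number of distinct blocks (= Huffman codewords needed) when coding
# fixed-length blocks of symbols instead of single symbols.
alphabet_size = 26                      # assumed here: lowercase English letters
for block_length in range(1, 6):
    print(block_length, alphabet_size ** block_length)
# 1 26, 2 676, 3 17576, 4 456976, 5 11881376
```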
How can we do better than Huffman? - III
• Huffman coding is optimal in its framework:
  • static model
  • one symbol, one word
• Relaxing these assumptions leads to the alternatives: adaptive Huffman coding, blocking, and arithmetic coding
The key idea
• Arithmetic coding completely bypasses the idea of replacing an input symbol with a specific code.
• Instead, it takes a stream of input symbols and replaces it with a single floating point number in [0,1)
• The longer and more complex the message, the more bits are needed to represent the output number
The key idea - II
• The output of an arithmetic coder is, as usual, a stream of bits
• However, we can think that there is a prefix “0.”, so the stream represents a fractional binary number between 0 and 1
  e.g. the bit stream 0110 1010 is read as the binary number 0.0110 1010
• In order to explain the algorithm, numbers will be shown as decimal, but obviously they are always binary
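A one-line sketch of this interpretation (my own illustration, not from the slides):

```python
def bits_to_fraction(bits: str) -> float:
    """Interpret a bit string as the binary fraction 0.<bits>."""
    return int(bits, 2) / 2 ** len(bits)

print(bits_to_fraction("01101010"))   # 0.4140625, i.e. binary 0.01101010
```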
An example - I
• String bccb from the alphabet {a,b,c}
• The zero-frequency problem is solved by initializing all character counters to 1
• When the first b is to be coded, all symbols have a 33% probability (why?)
• The arithmetic coder maintains two numbers, low and high, which represent a subinterval [low,high) of the range [0,1)
• Initially low=0 and high=1
An example - II
• The range between low and high is divided between the symbols of the alphabet, according to their probabilities:
  a → [0, 0.3333)       (P[a]=1/3)
  b → [0.3333, 0.6667)  (P[b]=1/3)
  c → [0.6667, 1)       (P[c]=1/3)
An example - III
• Coding the first b selects the subinterval of b:
  low = 0.3333
  high = 0.6667
• New probabilities: P[a]=1/4, P[b]=2/4, P[c]=1/4
An example - IV
• The interval [0.3333, 0.6667) is divided according to the current probabilities (P[a]=1/4, P[b]=2/4, P[c]=1/4):
  a → [0.3333, 0.4167)
  b → [0.4167, 0.5834)
  c → [0.5834, 0.6667)
• Coding c gives low = 0.5834, high = 0.6667
• New probabilities: P[a]=1/5, P[b]=2/5, P[c]=2/5
An example - V
• The interval [0.5834, 0.6667) is divided according to the current probabilities (P[a]=1/5, P[b]=2/5, P[c]=2/5):
  a → [0.5834, 0.6001)
  b → [0.6001, 0.6334)
  c → [0.6334, 0.6667)
• Coding the second c gives low = 0.6334, high = 0.6667
• New probabilities: P[a]=1/6, P[b]=2/6, P[c]=3/6
An example - VI
• The interval [0.6334, 0.6667) is divided according to the current probabilities (P[a]=1/6, P[b]=2/6, P[c]=3/6):
  a → [0.6334, 0.6390)
  b → [0.6390, 0.6501)
  c → [0.6501, 0.6667)
• Coding the final b gives low = 0.6390, high = 0.6501
• Final interval [0.6390, 0.6501): we can send 0.64
An example - summary
• Starting from the range [0,1), we restrict ourselves each time to the subinterval that encodes the given symbol
• At the end, the whole sequence can be encoded by any of the numbers in the final range (but mind the brackets...)
An example - summary

symbol    probabilities used              interval after coding
(start)   -                               [0, 1)
b         P[a]=1/3, P[b]=1/3, P[c]=1/3    [0.3333, 0.6667)
c         P[a]=1/4, P[b]=2/4, P[c]=1/4    [0.5834, 0.6667)
c         P[a]=1/5, P[b]=2/5, P[c]=2/5    [0.6334, 0.6667)
b         P[a]=1/6, P[b]=2/6, P[c]=3/6    [0.6390, 0.6501)

[0.6390, 0.6501) → 0.64
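A minimal Python sketch of this adaptive encoding loop (my own illustration, not the author's code): every counter starts at 1, the interval is narrowed to the chosen symbol's slice, then that symbol's counter is incremented. Exact floats are used here, so the printed values can differ in the last decimal from the table above, which rounds low and high to four digits at each step.

```python
def encode_adaptive(message, alphabet):
    """Arithmetic-encode `message` with an adaptive model (all counts start at 1)."""
    counts = {s: 1 for s in alphabet}
    low, high = 0.0, 1.0
    for sym in message:
        total = sum(counts.values())
        # cumulative count of the symbols that precede `sym` in the alphabet order
        cum_before = sum(counts[s] for s in alphabet[:alphabet.index(sym)])
        rng = high - low
        new_low = low + rng * cum_before / total
        new_high = low + rng * (cum_before + counts[sym]) / total
        low, high = new_low, new_high
        counts[sym] += 1                      # adapt the model after coding the symbol
        print(f"{sym}: [{low:.4f}, {high:.4f})")
    return low, high

encode_adaptive("bccb", "abc")
# b: [0.3333, 0.6667)
# c: [0.5833, 0.6667)
# c: [0.6333, 0.6667)
# b: [0.6389, 0.6500)
```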
Another example - I
• Consider encoding the name BILL GATES
  Again, we need the frequency of all the characters in the text.
chr freq.
space 0.1
A 0.1
B 0.1
E 0.1
G 0.1
I 0.1
L 0.2
S 0.1
T 0.1
Another example - II
character probability range
space 0.1 [0.00, 0.10)
A 0.1 [0.10, 0.20)
B 0.1 [0.20, 0.30)
E 0.1 [0.30, 0.40)
G 0.1 [0.40, 0.50)
I 0.1 [0.50, 0.60)
L 0.2 [0.60, 0.80)
S 0.1 [0.80, 0.90)
T 0.1 [0.90, 1.00)
Another example - III
chr low high
0.0 1.0
B 0.2 0.3
I 0.25 0.26
L 0.256 0.258
L 0.2572 0.2576
Space 0.25720 0.25724
G 0.257216 0.257220
A 0.2572164 0.2572168
T 0.25721676 0.2572168
E 0.257216772 0.257216776
S 0.2572167752 0.2572167756
The final low value, 0.2572167752, will uniquely encode the name BILL GATES
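A small sketch (mine, not from the slides) that reproduces the table above using the fixed ranges of "Another example - II"; plain floats are enough at this length, while a real coder would use the incremental-output trick described later.

```python
# Fixed model from "Another example - II": symbol -> [low_bound, high_bound)
ranges = {' ': (0.0, 0.1), 'A': (0.1, 0.2), 'B': (0.2, 0.3), 'E': (0.3, 0.4),
          'G': (0.4, 0.5), 'I': (0.5, 0.6), 'L': (0.6, 0.8), 'S': (0.8, 0.9),
          'T': (0.9, 1.0)}

low, high = 0.0, 1.0
for ch in "BILL GATES":
    lo_b, hi_b = ranges[ch]
    rng = high - low
    high = low + rng * hi_b          # update high first: both use the old low
    low = low + rng * lo_b
    print(f"{ch!r}: low={low:.10f} high={high:.10f}")
# ends with low ≈ 0.2572167752 and high ≈ 0.2572167756, as in the table
```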
Decoding - I
• Suppose we have to decode 0.64
• The decoder needs the symbol probabilities, as it simulates what the encoder must have been doing
• It starts with low=0 and high=1 and divides the interval exactly in the same manner as the encoder (a in [0, 1/3), b in [1/3, 2/3), c in [2/3, 1))
Decoding - II
• The transmitted number falls in the interval corresponding to b, so b must have been the first symbol encoded
• Then the decoder evaluates the new values for low (0.3333) and high (0.6667), updates the symbol probabilities and divides the range from low to high according to these new probabilities
• Decoding proceeds until the full string has been reconstructed
Decoding - III
• 0.64 in [0.3333, 0.6667) → b
• 0.64 in [0.5834, 0.6667) → c
  ... and so on
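For completeness, a matching decoder sketch (again my own illustration): it mirrors the adaptive encoder shown earlier, and it is told how many symbols to decode; a real coder would instead transmit the message length or reserve an end-of-message symbol.

```python
def decode_adaptive(value, n_symbols, alphabet):
    """Decode `n_symbols` symbols from `value`, mirroring the adaptive encoder."""
    counts = {s: 1 for s in alphabet}
    low, high = 0.0, 1.0
    out = []
    for _ in range(n_symbols):
        total = sum(counts.values())
        target = (value - low) / (high - low)     # position of value inside [low, high)
        cum = 0
        for sym in alphabet:                      # find the subinterval containing value
            if cum / total <= target < (cum + counts[sym]) / total:
                break
            cum += counts[sym]
        rng = high - low
        high = low + rng * (cum + counts[sym]) / total
        low = low + rng * cum / total
        counts[sym] += 1                          # same model update as the encoder
        out.append(sym)
    return "".join(out)

print(decode_adaptive(0.64, 4, "abc"))            # bccb
```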
Why does it work?
• More bits are necessary to express a number in a smaller interval
• High-probability events do not shrink the interval very much, while low-probability events result in a much smaller next interval
• The number of digits needed is proportional to the negative logarithm of the size of the interval
Why does it work?
• The size of the final interval is the product of the probabilities of the symbols coded, so the logarithm of this product is the sum of the logarithms of its terms
• So a symbol s with probability Pr[s] contributes
  -log₂ Pr[s]
  bits to the output, which is exactly the symbol's information content (uncertainty)!!
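• As a check on the bccb example: the probabilities actually used were 1/3, 1/4, 2/5 and 2/6, whose product is 1/90 ≈ 0.0111, exactly the width of the final interval [0.6390, 0.6501); so about -log₂(1/90) ≈ 6.5 bits are needed to pin down a number inside it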
Why does it work?
• For this reason arithmetic coding is nearly optimal in the number of output bits, and it can code very high probability events in just a fraction of a bit
• In practice, the algorithm is not exactly optimal because of the use of limited-precision arithmetic, and because transmission requires sending a whole number of bits
A trick - I
• As the algorithm has been described so far, the whole output is available only when the encoding is finished
• In practice, it is possible to output bits during the encoding, which avoids the need for higher and higher arithmetic precision in the encoder
• The trick is to observe that when low and high are close they may share a common prefix
A trick - II
• This prefix will remain forever in the two values, so we can transmit it and remove it from low and high
• For example, during the encoding of “bccb”, after the encoding of the third character the range is low=0.6334, high=0.6667
• We can remove the common prefix, sending 6 to the output and transforming low and high into 0.334 and 0.667
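A tiny sketch of this step using decimal digits, as in the example above (my own illustration; real coders do the same with bits):

```python
def emit_common_prefix(low, high):
    """Send the leading decimal digits that low and high already share, then rescale."""
    sent = ""
    while int(low * 10) == int(high * 10):   # do low and high agree on the next digit?
        digit = int(low * 10)
        sent += str(digit)
        low = low * 10 - digit               # drop the digit and rescale
        high = high * 10 - digit
    return sent, low, high

print(emit_common_prefix(0.6334, 0.6667))    # ('6', 0.334..., 0.667...)
```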
The encoding step
• To code symbol s, where symbols are numbered from 1 to n and symbol i has probability Pr[i]:
• low_bound = Pr[1] + ... + Pr[s-1]
• high_bound = Pr[1] + ... + Pr[s]
• range = high - low
• high = low + range * high_bound
• low = low + range * low_bound
  (high is updated first, so that both assignments use the old value of low)
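The same step written as a small Python function (a sketch, under the assumption that the probabilities are given as a list `prob` with `prob[i-1] = Pr[i]`):

```python
def encode_step(s, prob, low, high):
    """Narrow [low, high) to the subinterval of symbol s (symbols numbered 1..n)."""
    low_bound = sum(prob[:s - 1])            # Pr[1] + ... + Pr[s-1]
    high_bound = sum(prob[:s])               # Pr[1] + ... + Pr[s]
    rng = high - low
    high = low + rng * high_bound            # high first, so both use the old low
    low = low + rng * low_bound
    return low, high

# coding the first 'b' (s=2) of the bccb example, with Pr = [1/3, 1/3, 1/3]:
print(encode_step(2, [1/3, 1/3, 1/3], 0.0, 1.0))   # (0.333..., 0.666...)
```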
The decoding step
• The symbols are numbered from 1 to n, and value is the arithmetic code to be processed
• Find s such that
  Pr[1] + ... + Pr[s-1] ≤ (value - low) / (high - low) < Pr[1] + ... + Pr[s]
• Return symbol s
• Perform the same range-narrowing step as in the encoding step
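And the corresponding decoding step as a sketch (same `prob` convention as above); it returns the decoded symbol together with the narrowed interval:

```python
def decode_step(value, prob, low, high):
    """Return the symbol s whose subinterval contains value, plus the narrowed interval."""
    target = (value - low) / (high - low)
    cum = 0.0
    for s, p in enumerate(prob, start=1):
        if cum <= target < cum + p:          # value falls in the subinterval of symbol s
            rng = high - low
            return s, low + rng * cum, low + rng * (cum + p)
        cum += p

print(decode_step(0.64, [1/3, 1/3, 1/3], 0.0, 1.0))   # (2, 0.333..., 0.666...)
```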
Implementing arithmetic coding
• As mentioned earlier, arithmetic coding uses binary fractional numbers with unlimited arithmetic precision
• Working with finite precision (16 or 32 bits) makes compression slightly worse than the entropy bound
• It is also possible to build coders based on integer arithmetic, with another small degradation of compression
Arithmetic coding vs. Huffman coding
• In typical English text, the space character is the most common, with a probability of about 18%, so the Huffman redundancy is quite small. Moreover, this is an upper bound
• On the contrary, in black and white images arithmetic coding is much better than Huffman coding, unless a blocking technique is used
• Arithmetic coding requires less memory, as the symbol representation is calculated on the fly
• Arithmetic coding is more suitable for high-performance models, where there are confident predictions
Arithmetic coding vs. Huffman coding
• Huffman decoding is generally faster than arithmetic decoding
• In arithmetic coding it is not easy to start decoding in the middle of the stream, while in Huffman coding we can use “starting points”
• In large collections of text and images, Huffman coding is likely to be used for the text, and arithmetic coding for the images