Present v0.2

Improvement of Lossless Compression for JPEG Files

Irina Bocharova, Kirill Yurkov,
Mikhail Bogdanov, Roman Bolshakov, Alexander Buslaev,
Yuri Konoplev, Andrew Tereskin, Oleg Finkelshteyn
ITMO

autumn 2010 - spring 2011

-: big team from ITMO :- () Compression of JPEG autumn 2010 - spring 2011 1 / 27

Agenda

Purpose
Schemes of encoder and decoder
encoding DC
encoding runs and AC
Levenstein encoder
Arithmetic encoder
Results
Problems


Purpose

Implement a JPEG recoder that reduces the size of the bit stream
Requirement: bit-for-bit correspondence between the original and the restored file


Scheme of encoder


Scheme of decoder


encoding DC (DC Prediction)

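Neighbours of the current block X:

B C
A X

\[
P = \begin{cases}
DC_C, & |DC_B - DC_A| < |DC_B - DC_C| \\
DC_A, & \text{otherwise}
\end{cases}
\]

The difference X − P is encoded by the arithmetic encoder.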
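
The prediction rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and reading A as the left neighbour, B as the upper-left and C as the upper one is an assumption taken from the layout of the diagram.

```python
def predict_dc(dc_a: int, dc_b: int, dc_c: int) -> int:
    """Return the prediction P for the DC value of block X.

    Assumed layout (from the diagram): B = upper-left, C = upper, A = left.
    """
    if abs(dc_b - dc_a) < abs(dc_b - dc_c):
        return dc_c
    return dc_a


def dc_residual(dc_x: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Return X - P, the value that is passed to the arithmetic encoder."""
    return dc_x - predict_dc(dc_a, dc_b, dc_c)
```

For example, with A = 10, B = 10, C = 50 the predictor picks C, so a block with DC value 52 produces the residual 2.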


encoding DC (zero map, encoding the number of nonzeros)

Neighbours of the current element x:

y0 y1 y2
y3 x

Context for encoding x:

\[ y_0 + \lambda_1 y_1 + \lambda_2 y_2 + \lambda_3 y_3 \]
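
A small sketch of how such a context value might be computed. The slides do not give the weights λ1, λ2, λ3 or the range of the y values, so the defaults (2, 4, 8) and the 0/1 zero-map flags used in the usage note are purely hypothetical choices for illustration.

```python
def context_value(y0: int, y1: int, y2: int, y3: int,
                  lambdas: tuple = (2, 4, 8)) -> int:
    """Combine the four neighbour values into a single context number.

    The weights in `lambdas` are an assumption; the slides do not state them.
    """
    l1, l2, l3 = lambdas
    return y0 + l1 * y1 + l2 * y2 + l3 * y3
```

With 0/1 flags and these assumed weights, every combination of neighbours maps to a distinct context in the range 0..15, for instance `context_value(1, 0, 1, 1)` gives 13.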


AC blocks encoding


Runs and levels encoding

We need to encode the pairs $(l_0, r_0), (l_1, r_1), \dots, (l_n, r_n)$.
The value n is known to the encoder. To encode the pair $(l_i, r_i)$ we construct
a two-dimensional context:

\[ (n,\; n - i) \]


Arithmetic coding

Arithmetic coding + adaptive model


Levenstein code

A universal code for encoding the non-negative integers.
It works as follows:
the code of 0 is "0"; to encode a positive number we do the
following:
1 Initialize the step count variable C to 1.
2 Write the binary representation of the number without the leading "1" to
the beginning of the code.
3 Let M be the number of bits written in step 2.
4 If M is not 0, increment C and repeat from step 2 with M as the new
number.
5 Write C "1" bits and a "0" to the beginning of the code.
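
The five steps above can be sketched in Python as follows. The encoder follows the steps literally and returns the code as a string of "0"/"1" characters for readability. The decoder is not described on the slide; it is added here only as an assumption-free inverse so the round trip can be checked. Both function names are illustrative.

```python
def levenstein_encode(n: int) -> str:
    """Encode a non-negative integer as a string of '0'/'1' characters."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    if n == 0:
        return "0"
    c = 1                       # step 1: step counter
    code = ""
    while True:
        bits = bin(n)[3:]       # step 2: binary form without '0b' and the leading 1
        code = bits + code      # written to the beginning of the code
        m = len(bits)           # step 3
        if m == 0:              # step 4: stop when nothing was written
            break
        c += 1
        n = m
    return "1" * c + "0" + code  # step 5


def levenstein_decode(code: str) -> int:
    """Inverse of levenstein_encode for a single code word."""
    c = 0
    pos = 0
    while code[pos] == "1":     # count the leading ones
        c += 1
        pos += 1
    pos += 1                    # skip the terminating zero
    if c == 0:
        return 0
    n = 1
    for _ in range(c - 1):      # each round reads n bits and restores the leading 1
        chunk = code[pos:pos + n]
        pos += n
        n = int("1" + chunk, 2)
    return n
```

For instance, 1 is encoded as "10", 2 as "1100" and 4 as "1110000". A real coder would emit bits into the output stream rather than build strings.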


Some samples


Some information about samples


Results and Comparison

Picture     Size      PackJpg    PCAR
A10         842 KB    19.2 %     11.5 %
Afisha      213 KB    28.6 %     20.0 %
Bird         82 KB    17.7 %      9.4 %
Document    103 KB    29.7 %     25.4 %
Flower        5 KB    18.5 %      6.0 %
Monkey       30 KB    30.6 %     24.7 %
Portrait     63 KB    25.5 %     25.0 %


Problems (bit-to-bit)

We need to read and write JFIF (JPEG) files while maintaining bitwise
identity.
Two possible implementation paths:
Full parser: file → internal structures → file
  Pros: very flexible, easy to process once we have the structure
  Cons: implementing a writer that adheres to the bitwise identity
  requirement is difficult. High serialization overhead.
Stream encoder: leaves most of the uninteresting metadata as is
(compressing it with general-purpose stream methods)
  Pros: faster, no serialization code (the decoder reuses the JPEG header
  parser from the encoder), guarantees exactness of metadata
  Cons: we lose flexibility and store some redundant information (e.g.
  standard Huffman tables)
After several attempts, we settled on the latter solution, which works
for an estimated 95% of JPEG files in the wild (for those we are
unable to process, a diagnostic is provided).
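
A minimal sketch of the stream-encoder idea for metadata: the bytes are passed through a general-purpose compressor untouched and the round trip is checked for bitwise identity. The slides do not name the general-purpose method that was used, so `zlib` is only a stand-in here, and the function names are hypothetical.

```python
import zlib


def pack_metadata(metadata: bytes) -> bytes:
    """Compress metadata bytes as an opaque stream (zlib is an assumed stand-in)."""
    return zlib.compress(metadata, 9)


def unpack_metadata(packed: bytes) -> bytes:
    """Restore the metadata bytes exactly as they were."""
    return zlib.decompress(packed)


def is_bit_exact(original: bytes, restored: bytes) -> bool:
    """Bitwise-identity check between the original and the restored bytes."""
    return original == restored
```

Because the metadata is never parsed into structures and re-serialized, exactness follows directly from the lossless compressor, at the cost of keeping redundant content such as standard tables.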

Problems (Unknown alphabet size)

Start from an alphabet that contains a single symbol, Ω = {ζ},
where ζ is the escape symbol.
For each new input symbol $a = a_{t+1}$:
1 If a ∈ Ω:
encode a with probability $p(a) = \frac{\tau(a)}{t+1}$
2 If a ∉ Ω:
encode the escape symbol with probability $p(\zeta) = \frac{\tau(\zeta)}{t+1}$
encode a with the Levenstein code
Ω = Ω ∪ {a}
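
The procedure above can be sketched as a model that only reports the probabilities handed to the arithmetic coder. The slide does not define τ, so this sketch assumes that τ(a) is the number of times a has occurred among the first t symbols and that τ(ζ) stays equal to 1, which makes the counts sum to t + 1. The function name and the returned tuple layout are illustrative.

```python
from fractions import Fraction

ESC = object()  # stand-in for the escape symbol ζ


def model_probabilities(symbols):
    """Return a list of (symbol, probability, is_new) for each input symbol.

    For a known symbol the probability is tau(a) / (t + 1).
    For a new symbol it is the escape probability tau(ESC) / (t + 1);
    the symbol itself would then be sent with the Levenstein code.
    Assumption: tau counts past occurrences and tau(ESC) is fixed at 1.
    """
    counts = {ESC: 1}
    events = []
    for t, a in enumerate(symbols):
        total = t + 1
        if a in counts:
            events.append((a, Fraction(counts[a], total), False))
            counts[a] += 1
        else:
            events.append((a, Fraction(counts[ESC], total), True))
            counts[a] = 1       # extend the alphabet with the new symbol
    return events
```

For the input `[5, 5, 7]` the model reports an escape with probability 1, then the symbol 5 with probability 1/2, then an escape with probability 1/3.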


Thanks

Questions?


References

[Rissanen, J.J.; Langdon, G.G., 1979]
Arithmetic coding
IBM Journal of Research and Development, p: 149-162.

[Levenstein V.I., 1968]
About redundancy and slowdown of difference coding of natural
numbers
Problems of cybernetics, Moscow, Science, p: 173-179.

[Krichevsky, R.E.; Trofimov V.K., 1981]
The Performance of Universal Encoding
IEEE Trans. Information Theory, Vol. IT-27, No. 2, pp. 199–207.


other information

