Present v0.2

Improvement of Lossless Compression for JPEG Files

Irina Bocharova, Kirill Yurkov,
Mikhail Bogdanov, Roman Bolshakov, Alexander Buslaev,
Yuri Konoplev, Anrew Tereskin, Oleg Finkelshteyn
ITMO

autumn 2010 - spring 2011

-: big team from ITMO :- () Compression of JPEG autumn 2010 - spring 2011 1 / 27

Agenda

Purpose
Schemes of encoder and decoder
encoding DC
encoding RUN’s and AC
Levenstein encoder
Arithmetic encoder
Results
Problems


Purpose

Implement a JPEG recoder that reduces the size of the bit stream
Requirement: bit-for-bit correspondence with the original file after decoding


Scheme of encoder


Scheme of decoder


encoding DC (DC Prediction)

B C
A X

$$
P = \begin{cases}
DC_C, & |DC_B - DC_A| < |DC_B - DC_C| \\
DC_A, & \text{otherwise}
\end{cases}
$$

The residual $DC_X - P$ is encoded by the arithmetic encoder.
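The prediction rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes the layout on the slide means A is the left neighbour, B the upper-left neighbour and C the upper neighbour of the current block X.

```python
def predict_dc(dc_a: int, dc_b: int, dc_c: int) -> int:
    """Return the predictor P for the DC value of the current block X.

    Assumed layout (from the slide): B is upper-left, C is above, A is left.
    """
    if abs(dc_b - dc_a) < abs(dc_b - dc_c):
        return dc_c
    return dc_a


def dc_residual(dc_x: int, dc_a: int, dc_b: int, dc_c: int) -> int:
    """Residual DC_X - P that would be passed to the arithmetic encoder."""
    return dc_x - predict_dc(dc_a, dc_b, dc_c)
```

When A and B are close, the rule predicts from C; otherwise it falls back to A.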


encoding DC (zero map, numbers of nonzero encoding)

y0 y1 y2
y3 x

Context for encoding x:

$$y_0 + \lambda_1 y_1 + \lambda_2 y_2 + \lambda_3 y_3$$
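A small sketch of the context computation follows. The slides do not give the weights λ1, λ2, λ3, so the values in the usage note are purely hypothetical, and `context_value` is an illustrative name.

```python
def context_value(y0, y1, y2, y3, lam1, lam2, lam3):
    """Weighted sum of the four already-coded neighbours of x.

    y0, y1, y2 are the neighbours in the row above, y3 is the left neighbour.
    The weights are parameters because the slides do not specify them.
    """
    return y0 + lam1 * y1 + lam2 * y2 + lam3 * y3
```

For example, if the neighbours are zero-map bits and the (assumed) weights are 2, 4 and 8, every neighbour pattern maps to a distinct context index: `context_value(1, 0, 1, 1, 2, 4, 8)` gives 13.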


AC blocks encoding


Runs and levels encoding

We need to encode the pairs $(l_0, r_0), (l_1, r_1), \ldots, (l_n, r_n)$.
The value $n$ is known to the encoder. For encoding the pair $(l_i, r_i)$ we construct
a two-dimensional context:

$$(n,\; n - i)$$
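As an illustration, the context of every pair in a block can be listed as below. This is a sketch under the assumption that the pairs are indexed 0..n, as on the slide; `pair_contexts` is a hypothetical helper name.

```python
def pair_contexts(pairs):
    """Attach the two-dimensional context (n, n - i) to each (level, run) pair.

    `pairs` holds (l_0, r_0) ... (l_n, r_n), so n is len(pairs) - 1.
    """
    n = len(pairs) - 1
    return [((n, n - i), pair) for i, pair in enumerate(pairs)]
```

The second component counts how many pairs remain after the current one, so the last pair of a block always gets context (n, 0).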


Arithmetic coding

Arithmetic + Adaptive model
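The slide does not describe the adaptive model itself. The sketch below shows one common choice, a simple frequency-count model, only to illustrate what "adaptive" means here; the class, the initial counts of 1 and the ideal-code-length helper are all assumptions, not the authors' design.

```python
import math


class AdaptiveModel:
    """Frequency-count model; every symbol starts with count 1 (assumption)."""

    def __init__(self, alphabet_size: int):
        self.counts = [1] * alphabet_size
        self.total = alphabet_size

    def interval(self, symbol: int):
        """Cumulative-count interval (low, high, total) an arithmetic coder would use."""
        low = sum(self.counts[:symbol])
        return low, low + self.counts[symbol], self.total

    def update(self, symbol: int):
        """Adapt the model after the symbol has been coded."""
        self.counts[symbol] += 1
        self.total += 1


def ideal_code_length(symbols, alphabet_size: int) -> float:
    """Sum of -log2(p) over the sequence: what an ideal arithmetic coder would spend."""
    model = AdaptiveModel(alphabet_size)
    bits = 0.0
    for s in symbols:
        low, high, total = model.interval(s)
        bits += -math.log2((high - low) / total)
        model.update(s)
    return bits
```

Because the decoder performs the same updates after each decoded symbol, both sides stay in sync without transmitting the statistics.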


Levenstein code

A universal code for encoding non-negative integers.
It works as follows:
the code of 0 is "0"; to encode a positive number:
1 Initialize the step count variable C to 1.
2 Write the binary representation of the number without the leading "1" to
the beginning of the code.
3 Let M be the number of bits written in step 2.
4 If M is not 0, increment C and repeat from step 2 with M as the new
number.
5 Write C "1" bits and a "0" to the beginning of the code.
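The steps above translate almost directly into code. The following is a minimal sketch that works on bit strings for readability (a real coder would emit packed bits); the function names are illustrative, and the decoder is included only to show that the code is uniquely decodable.

```python
def levenstein_encode(n: int) -> str:
    """Encode a non-negative integer as a string of '0'/'1' characters."""
    if n < 0:
        raise ValueError("only non-negative integers can be encoded")
    if n == 0:
        return "0"
    code = ""
    c = 1                      # step 1: step count
    while True:
        tail = bin(n)[3:]      # step 2: binary digits without '0b' and the leading '1'
        code = tail + code     # prepend to the code built so far
        m = len(tail)          # step 3
        if m == 0:             # step 4: stop once nothing was written
            break
        c += 1
        n = m
    return "1" * c + "0" + code  # step 5


def levenstein_decode(bits: str) -> int:
    """Decode one codeword produced by levenstein_encode."""
    c = 0
    pos = 0
    while bits[pos] == "1":    # count the leading '1' bits
        c += 1
        pos += 1
    pos += 1                   # skip the terminating '0'
    if c == 0:
        return 0
    n = 1
    for _ in range(c - 1):     # each round reads n bits and restores the leading '1'
        chunk = bits[pos:pos + n]
        pos += n
        n = int("1" + chunk, 2)
    return n
```

For instance, `levenstein_encode(4)` yields `"1110000"`: the prefix `1110` announces three rounds, followed by the bits `0` and `00`.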


Some samples


Some information about samples


Results and Comparison

Picture    Size     PackJpg   PCAR
A10        842 KB   19.2 %    11.5 %
Afisha     213 KB   28.6 %    20.0 %
Bird       82 KB    17.7 %    9.4 %
Document   103 KB   29.7 %    25.4 %
Flower     5 KB     18.5 %    6.0 %
Monkey     30 KB    30.6 %    24.7 %
Portrait   63 KB    25.5 %    25.0 %


Problems (bit-to-bit)

We need to read and write JFIF (JPEG) files while maintaining bitwise
identity.
Two possible implementation paths:
Full parser: file → internal structures → file
  Pros: very flexible; easy to process once we have the structure
  Cons: implementing a writer that adheres to the bitwise identity
  requirement is difficult; high serialization overhead
Stream encoder: leaves most of the uninteresting metadata as is
(compressed using general-purpose stream methods)
  Pros: faster; no serialization code (the decoder reuses the JPEG header
  parser from the encoder); guarantees exactness of the metadata
  Cons: we lose flexibility and store some redundant information (e.g.
  standard Huffman tables)
After several attempts, we settled on the latter solution, which works
for an estimated 95% of JPEG files in the wild (for those we are
unable to process, a diagnostic is provided).
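To make the stream-encoder idea concrete, here is a rough sketch of how the header bytes could be kept verbatim and passed through a general-purpose compressor. It is an assumption-laden illustration: `split_at_first_scan` is a hypothetical helper, zlib stands in for whatever stream method the team actually used, and the parser ignores fill bytes, multi-scan files and other cases a real tool must handle.

```python
import struct
import zlib

SOS = 0xDA  # start-of-scan marker code


def split_at_first_scan(data: bytes):
    """Split a JPEG stream into (header bytes through the SOS segment, the rest)."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        pos += 2 + length
        if pos > len(data):
            raise ValueError("segment runs past the end of the data")
        if marker == SOS:
            return data[:pos], data[pos:]
    raise ValueError("no SOS marker found")


def pack_header(data: bytes):
    """Compress the untouched header bytes; the scan data is returned as is."""
    header, scan = split_at_first_scan(data)
    return zlib.compress(header), scan
```

Since the header is never re-serialized from parsed structures, decompressing it reproduces the original bytes exactly, which is the exactness guarantee listed under the pros.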

Problems (Unknown alphabet size)

Start from an alphabet that contains a single symbol, Ω = {ζ},
where ζ is the escape symbol.
For each new input symbol a = a_{t+1}:
1 If a ∈ Ω:
  encode a with probability p(a) = τ(a) / (t + 1)
2 If a ∉ Ω:
  encode the escape symbol with probability p(ζ) = τ(ζ) / (t + 1)
  encode a with the Levenstein code
  Ω = Ω ∪ {a}
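The procedure can be traced with a short sketch. The slide does not define τ, so this code assumes τ(a) is the number of times a has occurred so far and τ(ζ) is fixed at 1, which makes the probabilities sum to one with the denominator t + 1. The function name and the event tuples are illustrative; the Levenstein-coded symbol itself is not emitted here.

```python
from fractions import Fraction


def escape_model_events(symbols):
    """List what would be sent to the arithmetic coder for each input symbol.

    Each event is (kind, symbol, probability). For an "escape" event the new
    symbol would additionally be transmitted with the Levenstein code.
    """
    counts = {}    # tau(a) for symbols already in the alphabet (assumed: occurrence counts)
    events = []
    for t, a in enumerate(symbols):   # t symbols have been processed before a
        if a in counts:
            events.append(("symbol", a, Fraction(counts[a], t + 1)))
            counts[a] += 1
        else:
            # Assumed: tau(escape) stays at 1.
            events.append(("escape", a, Fraction(1, t + 1)))
            counts[a] = 1             # the alphabet grows by the new symbol
    return events
```

For the input `"abab"`, the first two symbols trigger escapes with probabilities 1 and 1/2, and the repeats are coded with probabilities 1/3 and 1/4.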


Thanks

Questions?


References

[Rissanen, J.J.; Langdon, G.G., 1979]
Arithmetic coding
IBM Journal of Research and Development, p: 149-162.

[Levenstein V.I., 1968]
About redundancy and slowdown of difference coding of natural numbers
Problems of cybernetics, Moscow, Science, p: 173-179.

[Krichevsky, R.E.; Trofimov V.K., 1981]
The Performance of Universal Encoding
IEEE Trans. Information Theory, Vol. IT-27, No. 2, pp. 199–207.


other information

