3. About the author
- reverse engineering since 1989
- author of Corkami
- File Formats For Ever at PoC or GTFO
- malware analysis
- infosec engineer
my license plate is a CPU,
my phone case is a PDF doc,
my PDF resume is a SNES/MD rom.
My own views and opinions.
4. TL;DR
A lot of confusion regarding Zlib/Gzip/Zip/Deflate.
Is Deflate “Zip’s algorithm” ?
This deck is not about explaining compression algorithms.
A CORKAMI ORIGINAL PRODUCTION
THE CURRENT SLIDE IS AN
HONEST TALK TRAILER
zlib — Compression compatible with gzip
5. Standards timeline
1989-2020 Zip file format (AppNote)
1996/05 - RFC 1950 - Zlib Compressed Data Format Specification
1996/05 - RFC 1951 - Deflate Compressed Data Format Specification
1996/05 - RFC 1952 - Gzip file format
Zip is much older.
All related RFCs were submitted together, which is confusing.
7. Zip supports a lot more than Deflate
Since 1992,
Deflate is ZIP’s standard ‘generic’ compression.
Some tools only support Deflate (and reject other methods):
-> using an older compression method is an easy security bypass.
8. OK, we know that Deflate is
one of Zip’s algorithms:
the standard one.
10. The minimal Deflate stream
Deflate stream of an empty stream:
Tiny, but already complex for empty data!
Compressed form:
Offset: 00 01
Bytes:  03 00
Deflate data:
- Last/Type: True/Fixed Huffman
- Length: 0
Raw form:
Offset: 00 01 02 03 04
Bytes:  01 00 00 FF FF
Deflate data:
- Last/Type: True/No Compression
- Length: 0
- !Length: -1 (0xFFFF)
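Both forms can be checked with Python's standard `zlib` module, which inflates raw Deflate data when given a negative `wbits` value. This is only a sketch: the helper name `block_header` is made up for the example, and the block type numbering (0 = no compression, 1 = fixed Huffman, 2 = dynamic Huffman) follows RFC 1951.

```python
import zlib

# Block types as numbered by RFC 1951 (BTYPE field).
BLOCK_TYPES = {0: "no compression", 1: "fixed Huffman",
               2: "dynamic Huffman", 3: "reserved"}

def block_header(stream: bytes):
    """Hypothetical helper: decode Last/Type from the first Deflate byte."""
    first = stream[0]
    is_last = bool(first & 1)          # bit 0: last-block flag
    block_type = (first >> 1) & 0b11   # bits 1-2: block type
    return is_last, BLOCK_TYPES[block_type]

compressed_form = bytes.fromhex("0300")
raw_form = bytes.fromhex("010000FFFF")

for name, stream in (("compressed", compressed_form), ("raw", raw_form)):
    # wbits=-15 asks zlib for raw Deflate: no Zlib or Gzip wrapper expected.
    data = zlib.decompress(stream, wbits=-15)
    print(name, block_header(stream), "->", data)
```

Both streams inflate to the same empty byte string, even though they share no bytes.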
11. Zip Store method
The other standard ZIP method: “No Compression”.
Pure raw data - the original file as-is.
(useful to keep payloads still usable)
Zip Storing is not the same as
Deflate with no compression.
Zip-Stored empty string: “” (zero bytes)
Deflate-stored empty string: 01 00 00 FF FF
- Last/Type: True/NC
- Length: 0
- !Length: 0xFFFF
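The difference can be seen with Python's standard `zipfile` and `zlib` modules. A rough sketch, with a made-up file name and payload: a Stored Zip entry holds the payload unchanged, while Deflate at level 0 still wraps it in a block header.

```python
import io
import zipfile
import zlib

payload = b"Hello World!"

# Zip "Store": the archive contains the payload bytes as-is.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("hello.txt", payload)   # hypothetical file name
    info = zf.getinfo("hello.txt")
zip_bytes = buf.getvalue()

# Deflate with no compression: raw stream (wbits=-15), level 0.
co = zlib.compressobj(level=0, wbits=-15)
deflate_stored = co.compress(payload) + co.flush()

print("stored size in Zip:", info.compress_size)
print("Deflate level 0 size:", len(deflate_stored))
print("Deflate level 0 bytes:", deflate_stored.hex(" "))
```

The Deflate output is longer than the payload because of the Last/Type, Length and !Length fields.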
13. A minimal Zlib stream (simplified)
Offset: 00 01 02 03 04 05 06 07
Bytes:  78 DA 03 00 00 00 00 01
Layout: [4 bits] Method | [1 byte] | Deflate data | [4 bytes]
Simplified contents:
- Some parameters, including the Compression Method
- Deflate data
- a footer
Always 2 bytes before, 4 bytes after.
14. A minimal Zlib stream
Offset: 00 01 02 03 04 05 06 07
Bytes:  78 DA 03 00 00 00 00 01
- Window Size: 7 = 32 KB
- Method: 8 = Deflate
- Flags: No Dictionary Extra
- Checksum: 0x78DA % 31 = 0
- Deflate data:
  - Last/Type: True/Fixed Huffman
  - Length: 0
- Adler32: 0x00000001
CM (Compression method)
This identifies the compression method used in the file. CM = 8
denotes the "deflate" compression method with a window size up
to 32K. This is the method used by gzip and PNG (see
references [1] and [2] in Chapter 3, below, for the reference
documents). CM = 15 is reserved. It might be used in a future
version of this specification to indicate the presence of an
extra field before the compressed data.
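The fields of this 8-byte stream can be decoded by hand. A minimal sketch using only Python's standard `zlib` module; the variable names are chosen for this example, and the bit positions follow RFC 1950.

```python
import zlib

stream = bytes.fromhex("78DA030000000001")

cmf, flg = stream[0], stream[1]
method = cmf & 0x0F                  # low nibble: compression method
window_size = 1 << ((cmf >> 4) + 8)  # high nibble: log2(window size) - 8
has_dictionary = bool(flg & 0x20)    # preset dictionary flag
header_ok = ((cmf << 8) | flg) % 31 == 0

deflate_data = stream[2:-4]          # 2 bytes before, 4 bytes after
adler32 = int.from_bytes(stream[-4:], "big")
payload = zlib.decompress(stream)    # default wbits expects a Zlib wrapper

print(method, window_size, has_dictionary, header_ok)
print(deflate_data.hex(" "), hex(adler32), payload)
```

The Adler32 footer is stored big-endian and covers the uncompressed data, so it can be compared with `zlib.adler32(payload)`.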
15. A minimal Gzip archive
    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00
1x 00 00 00 00
Layout: [2 bytes] | Compression Method | [variable] | Deflate data | [8 bytes]
- [2 bytes]: 1F 8B
- Compression Method: 8 = Deflate
Compression method is always 08 (Deflate).
CM (Compression Method)
This identifies the compression method used in the file. CM
= 0-7 are reserved. CM = 8 denotes the "deflate"
compression method, which is the one customarily used by
gzip and which is documented elsewhere.
16. In detail…
    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00
1x 00 00 00 00
- Magic: 1F 8B
- Method: 8 = Deflate
- Flags: None
- ModTime: 0 (no timestamp available)
- Extra Flags: Max compression
- OS: Unknown
- Deflate data:
  - Last/Type: True/Fixed Huffman
  - Length: 0
- CRC32: 0x00000000
- lenUncomp: 0
Some fixed-length information is required before and after the Deflate data.
FileName, Comments, Extra Field are variable and optional (not used here).
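The same 20 bytes can be unpacked with Python's standard `struct` module. A small sketch that assumes no optional fields are present (Flags = 0), so the header is exactly 10 bytes; the variable names are chosen for this example.

```python
import gzip
import struct
import zlib

member = bytes.fromhex("1F8B08000000000002FF0300" + "00" * 8)

# Fixed header: magic, method, flags, modification time, extra flags, OS.
magic, method, flags, mtime, extra_flags, os_id = struct.unpack(
    "<2sBBIBB", member[:10])

deflate_data = member[10:-8]
# Footer: CRC32 and length of the uncompressed data, both little-endian.
crc32, len_uncomp = struct.unpack("<II", member[-8:])

payload = zlib.decompress(deflate_data, wbits=-15)   # raw Deflate
print(magic.hex(" "), method, flags, mtime, extra_flags, os_id)
print(deflate_data.hex(" "), hex(crc32), len_uncomp, payload)
print(gzip.decompress(member))   # the standard library accepts it too
```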
17. Zlib <-> Gzip
2 different ways to store a Deflate data stream.
Both with data before and after.
The compressed data can be transferred,
but the two formats aren’t compatible.
18. Zlib data stream:
    0  1  2  3  4  5  6  7
   78 DA 03 00 00 00 00 01
Layout: [4 bits] Method (8 = Deflate) | [1 byte] | Deflate data | [4 bytes]
GZip “member”:
    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00
1x 00 00 00 00
Layout: [2 bytes] | Method (8 = Deflate) | [variable] | Deflate data | [8 bytes]
Deflate data (in both): 03 00
19. Zlib data stream:
    0  1  2  3  4  5  6  7
   78 DA 03 00 00 00 00 01
- Window Size: 7 = 32 KB
- Method: 8 = Deflate
- Flags: No Dictionary Extra
- Checksum: 0x78DA % 31 = 0
- Deflate data:
  - Last/Type: True/Fixed Huffman
  - Length: 0
- Adler32: 0x00000001
GZip “member”:
    0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F
0x 1F 8B 08 00 00 00 00 00 02 FF 03 00 00 00 00 00
1x 00 00 00 00
- Magic: 1F 8B
- Method: 8 = Deflate
- Flags: None
- ModTime: 0 (no timestamp available)
- Extra Flags: Max compression
- OS: Unknown
- Deflate data:
  - Last/Type: True/Fixed Huffman
  - Length: 0
- CRC32: 0x00000000
- lenUncomp: 0
Deflate data (in both): 03 00
22. Disambiguation
Deflate is a compression algorithm.
Zip usually uses Deflate, but not necessarily.
Zlib and Gzip both wrap only Deflate,
but in different ways.
Same exchangeable data, but no direct compatibility.
24. 3 different wrappers around Deflate
- Zlib: Deflate
- GZIP member: Deflate
- ZIP Local File Header: Store, Deflate, Deflate64, Bzip2…
25. Conclusion
Deflate is a very standard compression algorithm.
Zip can use Deflate, but other algorithms too (Storing…)
Zip can use a different compression per file.
Zlib is a wrapper around a Deflate stream.
A Gzip member is a wrapper around a Deflate stream.
A Gzip file is one or more members.
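The per-file compression point can be tried with Python's standard `zipfile` module. A sketch with made-up entry names; `entry_data` is a hypothetical helper that skips the Local File Header (30 fixed bytes, then the name and extra field) to reach the bytes as stored in the archive.

```python
import io
import struct
import zipfile
import zlib

content = b"A" * 64

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # Same archive, a different method for each file.
    zf.writestr("stored.txt", content, compress_type=zipfile.ZIP_STORED)
    zf.writestr("deflated.txt", content, compress_type=zipfile.ZIP_DEFLATED)
    infos = {info.filename: info for info in zf.infolist()}
archive = buf.getvalue()

def entry_data(archive: bytes, info: zipfile.ZipInfo) -> bytes:
    """Hypothetical helper: return an entry's bytes as stored in the archive."""
    # Name and extra field lengths sit at offset 26 of the Local File Header.
    name_len, extra_len = struct.unpack_from("<HH", archive,
                                             info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return archive[start:start + info.compress_size]

stored = entry_data(archive, infos["stored.txt"])
deflated = entry_data(archive, infos["deflated.txt"])
print(len(stored), len(deflated))
# The Deflated entry is a raw Deflate stream, with no Zlib or Gzip wrapper.
print(zlib.decompress(deflated, wbits=-15) == content)
```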
26. Moving data around
Deflate data can be moved from/to:
- Zlib: 2 bytes before, 4 bytes after.
- Gzip: variable header, 8 bytes after.
- Zip using Deflate
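Moving the Deflate data from a Zlib stream to a Gzip member could look like the sketch below. `zlib_to_gzip` is a hypothetical helper: it assumes the Zlib stream has no preset dictionary (so the header is exactly 2 bytes) and it writes the shortest possible Gzip header, with no timestamp and an unknown OS. The Deflate bytes are copied untouched; they are only inflated to compute the Gzip footer.

```python
import gzip
import struct
import zlib

def zlib_to_gzip(zlib_stream: bytes) -> bytes:
    """Hypothetical helper: re-wrap the Deflate data of a Zlib stream as Gzip."""
    deflate_data = zlib_stream[2:-4]          # drop 2-byte header, Adler32 footer
    payload = zlib.decompress(deflate_data, wbits=-15)
    # Magic, method 8, no flags, no timestamp, no extra flags, OS unknown.
    header = bytes.fromhex("1F8B08000000000000FF")
    # Gzip footer: CRC32 and uncompressed length, both little-endian.
    footer = struct.pack("<II", zlib.crc32(payload),
                         len(payload) & 0xFFFFFFFF)
    return header + deflate_data + footer

original = b"Hello World!"
zlib_stream = zlib.compress(original)
gzip_member = zlib_to_gzip(zlib_stream)

print(gzip.decompress(gzip_member) == original)
print(zlib_stream[2:-4] == gzip_member[10:-8])   # identical Deflate data
```

The two checksums differ (Adler32 versus CRC32), which is why the footer has to be recomputed rather than copied.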
32. A Gzip file (with a filename before the Deflate data)
33. A full-featured GZIP
    +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 +A +B +C +D +E +F
0x  1F 8B 08 1C 26 F7 4F 62 00 FF 14 00  G  Z 10 00
1x   e  x  t  r  a     f  i  e  l  d     d  a  t  a
2x   f  i  l  e  n  a  m  e  0  c  o  m  m  e  n  t
3x   0 01 0C 00 F3 FF  H  e  l  l  o     W  o  r  l
4x   d  ! A3 1C 29 1C 0C 00 00 00
- Magic: 1F 8B
- Method: 8 = Deflate
- Flags: Extra Field (4), Filename (8), Comment (10)
- ModTime: 2022/4/8 08:49 UTC
- Extra Flags: None
- OS: Unknown
- Extra Field:
  - Size16: 20
  - SubField:
    - Type: GZ
    - Size16: 16
    - Data: “extra field data”
- Filename:
  - Data: “filename0”
- Comment:
  - Data: “comment0”
- Deflate data:
  - Last/Type: True/Raw
  - Length: 12
  - !Length: 0xFFF3
  - Data: Hello World!
- CRC32: 0x1c291ca3
- lenUncomp: 12
Extra Field, Filename, Comment: set in Flags,
stored between OS and Deflate data.
Filename & Comment: null-terminated.
Extra field: Size16 first, then SubFields.
TEXT and CRC16 are not usually supported.
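Walking through the optional fields could be sketched as follows. The flag values and field order follow RFC 1952; the names `read_cstring` and `member` are made up for this example, and the bytes are rebuilt from the dump on this slide.

```python
import struct
import zlib

# Flag bits from RFC 1952.
FHCRC, FEXTRA, FNAME, FCOMMENT = 0x02, 0x04, 0x08, 0x10

member = (
    bytes.fromhex("1F8B081C26F74F6200FF")             # fixed 10-byte header
    + bytes.fromhex("1400") + b"GZ" + bytes.fromhex("1000")
    + b"extra field data"
    + b"filename\x00" + b"comment\x00"
    + bytes.fromhex("010C00F3FF") + b"Hello World!"   # stored Deflate block
    + bytes.fromhex("A31C291C0C000000")               # CRC32, lenUncomp
)

def read_cstring(data: bytes, pos: int):
    """Hypothetical helper: read a null-terminated string starting at pos."""
    end = data.index(b"\x00", pos)
    return data[pos:end], end + 1

magic, method, flags, mtime, xfl, os_id = struct.unpack_from(
    "<2sBBIBB", member, 0)
pos = 10
extra = name = comment = None
if flags & FEXTRA:                  # Size16 first, then the SubFields
    (xlen,) = struct.unpack_from("<H", member, pos)
    extra = member[pos + 2:pos + 2 + xlen]
    pos += 2 + xlen
if flags & FNAME:
    name, pos = read_cstring(member, pos)
if flags & FCOMMENT:
    comment, pos = read_cstring(member, pos)
if flags & FHCRC:                   # optional 16-bit header checksum
    pos += 2

deflate_data = member[pos:-8]
crc32, len_uncomp = struct.unpack("<II", member[-8:])
payload = zlib.decompress(deflate_data, wbits=-15)

print(hex(flags), extra, name, comment)
print(payload, len_uncomp, crc32 == zlib.crc32(payload))
```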
36. How can you prove
that it’s the same data?
Make files that are both simultaneously,
with the Deflate data in common 😱😉
ZGip: Zip/Gzip polyglots, with shared Deflate data.
37. The End