SlideShare a Scribd company logo
1 of 20
Download to read offline
HPDC’ 23
Rapidgzip: Parallel Decompression and Seeking in
Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Technische Universität Dresden
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 2 of 20
Motivation
Accessing huge datasets, e.g., from academictorrents.com:
●
wikidata-20220103-all.json.gz: gzip-compressed JSON, 109 GB, 1.4 TB uncompressed
●
ImageNet21K: gzip-compressed TAR archive, 1.2 TB, 14 million images averaging 9 KiB.
Solutions:
●
: Random access TAR mount.
Make (huge) archives’ contents available via FUSE.
●
, indexed_bzip2: Backends for ratarmount for parallel decompression and fast
seeking inside compressed gzip and bzip2 files.
They also offer command line tools for parallelized decompression.
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 3 of 20
Requirements
Data Stream
Data Stream
Data Stream
●
Parallelize gzip decompression
●
Without additional metadata
●
After any seeking
●
For concurrent accesses at two offsets
●
Decompress all kinds of gzip files
●
Enable fast backward and forward seeking
●
After the index has been created
●
While the index has only been partially created
●
Usable as a (Python-)library
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 4 of 20
●
: Random access parallel (indexed)
decompression for gzip files
●
Parallel decompression of gzip files
●
Decompression is faster and less memory intensive
with an existing index
●
The index enables seeking without having to start
decompression from the file beginning
●
Header-only C++ library with Python bindings:
> pip install rapidgzip
●
Also a has a command line interface that can be used
as a drop-in replacement for decompression:
gzip -d → rapidgzip -d
●
Not for compression
● https://github.com/mxmlnkn/rapidgzip
Decompression benchmarks on a 12 GB
FASTQ file using 64 cores of an AMD EPYC
7702 @ 2.0 GHz processor.
g
z
i
p
p
i
g
z
i
g
z
i
p
p
u
g
z
r
a
p
i
d
g
z
i
p
r
a
p
i
d
g
z
i
p
(
i
n
d
e
x
)
0
2
4
6
8
10
12
14
Bandwidth
/
(GB/s)
0.177 0.301 0.78 1.4
5.3
13.1
Introducing rapidgzip
74×
30×
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 5 of 20
Tools Overview
●
1992: gzip by Jean-loup Gailly and Mark Adler
●
2005: zlib/examples/zran.c example by Mark Adler
●
Shows how to resume decompression in the middle of a gzip stream
●
2007: pigz (parallel implementation of gzip) by Mark Adler
●
Compresses in parallel
●
2008: Blocked GNU Zip Format (BGZF) and the command line tool bgzip, part of HTSlib
●
Compresses in parallel to gzip files with additional metadata
●
Can decompress files containing such metadata in parallel
●
James K. Bonfield and others, "HTSlib: C library for reading/writing high-throughput sequencing data", GigaScience, Volume 10, Issue 2, February 2021
●
2016: indexed_gzip: Python module for random access based on zran.c
●
2019: pugz
●
Can decompress gzip-compressed files in parallel if it only contains characters 9–126
●
Kerbiriou, Maël, and Rayan Chikhi. "Parallel decompression of gzip-compressed files and random access to DNA sequences." 2019 IEEE International
Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2019.
→ What makes parallel decompression so difficult?
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 6 of 20
Challenges for Parallel
Decompression:
●
Find deflate block start offsets in bit
stream
●
Handling references to unknown
data
→ Two-Staged Decompression as
introduced by Kerbiriou and Chikhi
(2019)
●
Non-resolvable references result in
markers that get resolved in the
second stage after the 32 KiB
window has become known
Deflate Compression
⎵
o
0 1
0 1
1
0
How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵
if⎵a⎵woodchuck⎵could⎵chuck⎵wood?
LZSS
How⎵much⎵wood⎵would⎵a(13,5)chuck⎵(6,6)
if(21,13)c(39,5)(12,6)(22,4)?
Huffman Coding
11110|10|11111|0|11101|...
H o w ⎵ m
Deflate
Compression
First Stage
How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵
if 20 21 22 23 24 25 26 27 28 29 30 31 32 c 16 17 18 19 20 chuck⎵wood?
1 10 20 30
if(21,13)c(39,5)(12,6)(22,4)?
Second Stage
Let Thread 2
Decompress
Line 2
if⎵a⎵woodchuck⎵could⎵chuck⎵wood?
Marker Replacement Window For Line 2
Distance Length
Huffman Tree
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 7 of 20
Parallelized Deflate Decompression
⎵
o
0 1
0 1
1
0
How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵
if⎵a⎵woodchuck⎵could⎵chuck⎵wood?
LZSS
How⎵much⎵wood⎵would⎵a(13,5)chuck⎵(6,6)
if(21,13)c(39,5)(12,6)(22,4)?
Huffman Coding
11110|10|11111|0|11101|...
H o w ⎵ m
Deflate
Compression
First Stage
How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵
if 20 21 22 23 24 25 26 27 28 29 30 31 32 c 16 17 18 19 20 chuck⎵wood?
1 10 20 30
if(21,13)c(39,5)(12,6)(22,4)?
Second Stage
Let Thread 2
Decompress
Line 2
if⎵a⎵woodchuck⎵could⎵chuck⎵wood?
Marker Replacement Window For Line 2
Distance Length
Huffman Tree
Challenges for Parallel
Decompression:
●
Find offsets in bit stream to start
decompression from
●
Handling references to unknown
data
→ Two-Staged Decompression as
introduced by Kerbiriou and Chikhi
(2019)
●
Non-resolvable references result in
markers that get resolved in the
second stage after the 32 KiB
window has become known
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 8 of 20
Granularities for Parallelization
Gzip File
Gzip Stream
Gzip Stream
...
Deflate Stream
Deflate Block
Deflate Block
...
Deflate Stream
Gzip Stream
Gzip Header
Gzip Footer
Deflate Block
Final Block Flag
Block Data
Block Type
Block Type: 00
Non-Compressed
Length
Non-Compressed
Data
~Length
Block Type: 01
Dynamic Huffman
Tree Lengths
Compressed
Data
Huffman Trees
Block Type: 10
Fixed Huffman
Compressed Data
Often, there is
only one Gzip
stream per file.
Parallelize
decompression of
Deflate blocks
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 9 of 20
Redundancies that Help in Finding Deflate Blocks
Gzip File
Gzip Stream
Gzip Stream
...
Deflate Stream
Deflate Block
Deflate Block
...
Deflate Stream
Gzip Stream
Gzip Header
Gzip Footer
Deflate Block
Final Block Flag
Block Data
Block Type
Block Type: 00
Non-Compressed
Length
Non-Compressed
Data
~Length
Block Type: 01
Dynamic Huffman
Tree Lengths
Compressed
Data
Huffman Trees
Block Type: 10
Fixed Huffman
Compressed Data
Fixed Huffman
Blocks are
mostly used for
very small files
and the tail end
of streams.
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 10 of 20
How Often Will Valid-Looking Block Headers be Found in Random Data?
Invalid Non-optimal Valid and optimal
Code Lengths:
A: 1, B: 1, C: 1
Code Lengths:
A: 2, B: 2, C: 2
Code Lengths:
A: 2, B: 2, C: 1
●
Pugz reduces false positives further by checking that the decompressed data only
contains characters in the range 9-126, an assumption made for FASTQ files
●
Deflate Blocks with Fixed Huffman codings: Ignore because they are rare
●
Non-Compressed Deflate blocks: Look for 16-bit lengths and their one’s complement
→ 1 false positive per 525 kB
●
Deflate Blocks with Dynamic Huffman codings:
Look for valid Deflate block headers and valid and optimal Huffman codings.
~200 offsets pass this test given 1 Tbits of random data → 1 false positive per 625 MB
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 11 of 20
False Positives Can Be Crafted Using Non-Compressed Blocks
Example: Consider a gzip file that is compressed a second time.
Deflate
Stream
example.gz.gz
Gzip Header
Gzip Footer
Length
Non-Compressed
Data
~Length
Final Block Flag
Block Type:
Non-Compressed
Deflate
Stream
example.gz
Gzip Header
Gzip Footer
Block Data
Final Block Flag
Block Type
Block Boundary
of Interest
False Positive
Sequences inside the compressed Block Data might be recognized as Deflate block headers
→ These cases need to be detected and handled
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 12 of 20
Implementation
C₁
0 100 200 300
C₂ C₃ C₄
① Request chunk
at offset 0
Window
First Block
Offset
False
Positive
Thread Pool
C₁
C₃
Guessed Chunk
Boundary
C₄
C₂
C₁
C₃ C₄
C₂
Marker
Compressed Input Data
② Dispatch
③ Prefetch
④ Find block start ⑤ First-stage decompression
Cache
0
Offset Chunk
111
220
305
C₁
C₂
C₃
C₄
⑥ Periodically check for ready chunks
and move them into the cache
until C₁ has become ready
⑧ Return decompressed C₁
Index
0
Offset Size Win.
Dec.
Size
111 340
⑦ Add C₁ information
to the index
⑦
C₁
0 100 200 300
C₂ C₃ C₄
① Request chunk at 111 directly after C₁
② Found chunk in cache but with markers
③ Resolve markers in windows
C₁ C₂ C₃
C₄
No prefetched chunk
matches the end offset
of C₃. The chunk after
will be decompressed
on demand.
⑤ Resolve the markers inside
each chunk in parallel
using the thread pool
④ Add resolved
windows to the index
Index
0
Offset Size Win.
Dec.
Size
111 340
111 109 421
220 87 123
111 220
305 327
Cache
0
Offset Chunk
111
220
305
C₁
C₂
C₃
C₄
⑥ Return decompressed C₂
●
Prefetching to generate work
that can be processed in
parallel
●
Thread pool for work
balancing
●
Cache to speed up seeking
and concurrent
decompression
●
Use block offset as cache key
to catch false positives
●
On-demand cache fill to
recover from errors
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 13 of 20
Implementation
C₁
0 100 200 300
C₂ C₃ C₄
① Request chunk
at offset 0
Window
First Block
Offset
False
Positive
Thread Pool
C₁
C₃
Chunk
Boundary
C₄
C₂
C₁
C₃ C₄
C₂
Marker
Compressed Input Data
② Dispatch Chunk
③ Prefetch Chunk
④ Find block start ⑤ First-stage decompression
Cache
0
Offset Chunk
111
220
305
C₁
C₂
C₃
C₄
⑥ Periodically check for ready chunks
and move them into the cache
until C₁ has become ready
⑧ Return decompressed C₁
Index
0
Offset Size Win.
Dec.
Size
111 340
⑦ Add C₁ information
to the index
⑦
C₁
0 100 200 300
C₂ C₃ C₄
① Request chunk at 111 directly after C₁
② Found chunk in cache but with markers
③ Resolve markers in windows
C₁ C₂ C₃
C₄
No prefetched chunk
matches the end offset
of C₃. The chunk after
will be decompressed
on demand.
⑤ Resolve the markers inside
each chunk in parallel
using the thread pool
④ Add resolved
windows to the index
Index
0
Offset Size Win.
Dec.
Size
111 340
111 109 421
220 107 123
111 220
305 327
Cache
0
Offset Chunk
111
220
305
C₁
C₂
C₃
C₄
⑥ Return decompressed C₂
Chunk 4
●
Prefetching to generate work
that can be processed in
parallel
●
Thread pool for work
balancing
●
Cache to speed up seeking
and concurrent
decompression
●
Use block offset as cache key
to catch false positives
●
On-demand cache fill to
recover from errors
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 14 of 20
Implementation
C₁
0 100 200 300
C₂ C₃ C₄
① Request chunk
at offset 0
Window
First Block
Offset
False
Positive
Thread Pool
C₁
C₃
Chunk
Boundary
C₄
C₂
C₁
C₃ C₄
C₂
Marker
Compressed Input Data
② Dispatch Chunk
③ Prefetch Chunk
④ Find block start ⑤ First-stage decompression
Cache
0
Offset Chunk
111
220
305
C₁
C₂
C₃
C₄
⑥ Periodically check for ready chunks
and move them into the cache
until C₁ has become ready
⑧ Return decompressed C₁
Index
0
Offset Size Win.
Dec.
Size
111 340
⑦ Add C₁ information
to the index
⑦
C₁
0 100 200 300
C₂ C₃ C₄
① Request chunk at 111 directly after C₁
② Found chunk in cache but with markers
③ Resolve markers in windows
C₁ C₂ C₃
C₄
No prefetched chunk
matches the end offset
of C₃. The chunk after
will be decompressed
on demand.
⑤ Resolve the markers inside
each chunk in parallel
using the thread pool
④ Add resolved
windows to the index
Index
0
Offset Size Win.
Dec.
Size
111 340
111 109 421
220 107 123
111 220
305 327
Cache
0
Offset Chunk
111
220
305
C₁
C₂
C₃
C₄
⑥ Return decompressed C₂
Chunk 4
●
Prefetching to generate work
that can be processed in
parallel
●
Thread pool for work
balancing
●
Cache to speed up seeking
and concurrent
decompression
●
Use block offset as cache key
to catch false positives
●
On-demand cache fill to
recover from errors
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 15 of 20
Optimal Chunk Size
0.1250.25 0.5 1 2 4 8 16 32 64 128 256 512 1024
Chunk Size / MiB
1000
2000
1500
3000
Bandwidth
/
(MB/s)
rapidgzip
pugz
49808 12452 3113 779 390 195 98 49 25 13 7
Theoretical Number of Chunks
●
For smaller chunk sizes, the
block finder overhead leads to
worse performance
●
Larger chunk sizes lead to work
balancing issues and also might
adversely affect the cache
behavior and allocation speed
Decompression bandwidth using 16 cores and a 6.08 GiB test file,
which decompresses to 8 GiB.
Optimal chunk sizes: Rapidgzip: 4-8 MiB, Pugz: 32-64 MiB
DBF ... Dynamic Block Finder
NBF ... Non-Compressed Block Finder
3.8×
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 16 of 20
Decompression Benchmark: FASTQ File
Weak-scaling and writing the
results to /dev/null
●
gzip is the slowest, half as
slow as zlib-based pigz
●
rapidgzip with an index
scales up to 20 GB/s, without
an index up to 5 GB/s
●
pugz tops out at 1.4 GB/s
and crashes for 96+ cores
●
igzip by Intel shows leading
single-core performance
●
pigz does not parallelize
1 2 3 4 6 8 12 16 24 32 48 64 96 128
Number of Cores
100
1000
500
200
10000
5000
2000
20000
Bandwidth
/
(MB/s)
rapidgzip (index)
rapidgzip (no index)
linear scal. (no index)
pugz
pigz
igzip
gzip
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 18 of 20
0 2 4 6 8 10 12
Decompression Bandwidth / (GB/s)
pigz -9
pigz -6
pigz -3
pigz -1
igzip -3
igzip -2
igzip -1
igzip -0
gzip -9
gzip -6
gzip -3
gzip -1
bgzip -l 9
bgzip -l 6
bgzip -l 3
bgzip -l 0
bgzip -l -1
Tool
Used
for
Compression
3.73
3.76
3.81
3.82
6.52
6.42
6.15
0.1586
5.03
5.17
5.55
6.05
5.64
5.67
5.9
10.6
5.65
Benchmark: Various Gzip Compressors Rapidgzip Decompression
→
●
rapidgzip can parallelize
decompression for gzip files
produced with a wide variety of tools
and compression levels
●
Contains only Non-Compressed
Deflate blocks so that decompression
is reduced to a fast copy and some
accounting
●
Contains only a single Deflate block
with Fixed Huffman coding and
therefore cannot be parallelized
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 19 of 20
0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0
Decompression Bandwidth / (GB/s)
pzstd pzstd
gzip rapidgzip
gzip rapidgzip
bgzip bgzip
pzstd pzstd
zstd pzstd
gzip rapidgzip
gzip rapidgzip
gzip bgzip
bgzip bgzip
pzstd pzstd
zstd pzstd
zstd zstd
gzip igzip
gzip rapidgzip
gzip bgzip
bgzip bgzip
Compression
Tool
Decompression
Tool
8.8
16.43
5.13
5.5
6.78
0.882
4.25
1.86
0.3017
2.82
0.811
0.816
0.82
0.656
0.1527
0.2965
0.2977
1 core
16 cores
128 cores
Benchmark: Competing File Formats
●
igzip is surprisingly competitive to zstd
●
zstd is the fastest in single-core
decompression
●
bgzip and pzstd can only decompress
gzip/zstd files produced by themselves in
parallel
●
zstd-compressed (TAR) files are not eligible
for random access and parallel
decompression. pzstd is recommended
instead to create multi-frame Zstandard files.
●
Applying the rapidgzip approach to
arbitrary Zstandard files might be infeasible
because their window size is not limited to 32
KiB.
3×
+25%
ⁱ rapidgzip decompression with an existing index
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 21 of 20
Improvements Since Submission
●
High memory usage has been alleviated by limiting the decompressed chunk size
Worst case (compression ratio ~1000)
was: ~ 9 GB per thread
now: ~ 200 MB per thread (configurable)
●
The Inflate implementation has been improved for high compression ratios
→ 25 % faster for Silesia by using memcpy/memset for long references
●
CRC32 computation has been added
The slice-by-16 algorithm has been implemented and parallelized using crc32_combine.
→
Achieves ~ 4 GB/s per core (~ 6 % overhead independent of parallelism)
Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching
Maximilian Knespel, Holger Brunst
Slide 22 of 20
●
We have shown that the specialized approach for
parallelized gzip decompression introduced by
Kerbiriou and Chikhi (2019) can be generalized
without affecting performance and stability.
●
Our architecture achieves better performance,
scales to more cores, adds robustness against
false positives, and also increases versatility by
adding fast seeking capabilities.
●
An index is created internally on first time
decompression and it can be exported and
imported to speed up subsequent
decompression and seeking.
●
Can be used with ratarmount to mount .tar.gz
archives.
●
Available at https://github.com/mxmlnkn/rapidgzip
Decompression benchmarks on a 12 GB
FASTQ file using 64 cores of an AMD EPYC
7702 @ 2.0 GHz processor.
g
z
i
p
p
i
g
z
i
g
z
i
p
p
u
g
z
r
a
p
i
d
g
z
i
p
r
a
p
i
d
g
z
i
p
(
i
n
d
e
x
)
0
2
4
6
8
10
12
14
Bandwidth
/
(GB/s)
0.177 0.301 0.78 1.4
5.3
13.1
Summary
74×
30×

More Related Content

What's hot

PayPal Pitch
PayPal PitchPayPal Pitch
PayPal PitchMikeLiu75
 
Airbyte - Seed deck
Airbyte  - Seed deckAirbyte  - Seed deck
Airbyte - Seed deckAirbyte
 
Kredivo Investor Presentation
Kredivo Investor PresentationKredivo Investor Presentation
Kredivo Investor PresentationTech in Asia
 
RetireGuide™ from Betterment - Investing Made Better
RetireGuide™ from Betterment - Investing Made BetterRetireGuide™ from Betterment - Investing Made Better
RetireGuide™ from Betterment - Investing Made BetterBetterment
 
Kryptomon pitch deck: $10M Series A for NFT-based P2E gaming
Kryptomon pitch deck: $10M Series A for NFT-based P2E gamingKryptomon pitch deck: $10M Series A for NFT-based P2E gaming
Kryptomon pitch deck: $10M Series A for NFT-based P2E gamingPitch Decks
 
Deep dive on Amazon Managed Blockchain
Deep dive on Amazon Managed BlockchainDeep dive on Amazon Managed Blockchain
Deep dive on Amazon Managed BlockchainAmazon Web Services
 
Pitch Deck Teardown: Netmaker's $2.3M Seed deck
Pitch Deck Teardown: Netmaker's $2.3M Seed deckPitch Deck Teardown: Netmaker's $2.3M Seed deck
Pitch Deck Teardown: Netmaker's $2.3M Seed deckHajeJanKamps
 
Urjakart.com - Pitch Deck
Urjakart.com - Pitch DeckUrjakart.com - Pitch Deck
Urjakart.com - Pitch DeckVikram Varshney
 
Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅
Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅
Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅Atlassian 대한민국
 
Chambers cisco live keynote external june2012
Chambers cisco live keynote external june2012Chambers cisco live keynote external june2012
Chambers cisco live keynote external june2012Leslie Rubin
 
Harvest Pre-Seed Pitch Deck
Harvest Pre-Seed Pitch DeckHarvest Pre-Seed Pitch Deck
Harvest Pre-Seed Pitch DeckTory Reiss
 
Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)
Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)
Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)Svetlin Nakov
 
Airbyte - Series-A deck
Airbyte - Series-A deckAirbyte - Series-A deck
Airbyte - Series-A deckAirbyte
 

What's hot (20)

Pitch deck v3
Pitch deck v3Pitch deck v3
Pitch deck v3
 
PayPal Pitch
PayPal PitchPayPal Pitch
PayPal Pitch
 
Airbyte - Seed deck
Airbyte  - Seed deckAirbyte  - Seed deck
Airbyte - Seed deck
 
Kredivo Investor Presentation
Kredivo Investor PresentationKredivo Investor Presentation
Kredivo Investor Presentation
 
RetireGuide™ from Betterment - Investing Made Better
RetireGuide™ from Betterment - Investing Made BetterRetireGuide™ from Betterment - Investing Made Better
RetireGuide™ from Betterment - Investing Made Better
 
Pitch deck
Pitch deckPitch deck
Pitch deck
 
Shippo pitch deck
Shippo pitch deckShippo pitch deck
Shippo pitch deck
 
Kryptomon pitch deck: $10M Series A for NFT-based P2E gaming
Kryptomon pitch deck: $10M Series A for NFT-based P2E gamingKryptomon pitch deck: $10M Series A for NFT-based P2E gaming
Kryptomon pitch deck: $10M Series A for NFT-based P2E gaming
 
Deep dive on Amazon Managed Blockchain
Deep dive on Amazon Managed BlockchainDeep dive on Amazon Managed Blockchain
Deep dive on Amazon Managed Blockchain
 
Emma Pre-Seed Deck
Emma Pre-Seed DeckEmma Pre-Seed Deck
Emma Pre-Seed Deck
 
Pitch Deck Teardown: Netmaker's $2.3M Seed deck
Pitch Deck Teardown: Netmaker's $2.3M Seed deckPitch Deck Teardown: Netmaker's $2.3M Seed deck
Pitch Deck Teardown: Netmaker's $2.3M Seed deck
 
Urjakart.com - Pitch Deck
Urjakart.com - Pitch DeckUrjakart.com - Pitch Deck
Urjakart.com - Pitch Deck
 
Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅
Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅
Atlassian 및 오픈소스를 이용한 DevOps 구축 - 한국정보컨설팅
 
Chambers cisco live keynote external june2012
Chambers cisco live keynote external june2012Chambers cisco live keynote external june2012
Chambers cisco live keynote external june2012
 
Demat Account
Demat AccountDemat Account
Demat Account
 
Harvest Pre-Seed Pitch Deck
Harvest Pre-Seed Pitch DeckHarvest Pre-Seed Pitch Deck
Harvest Pre-Seed Pitch Deck
 
VOI Micro Mobility
VOI Micro MobilityVOI Micro Mobility
VOI Micro Mobility
 
Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)
Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)
Blockchain Cryptography for Developers (Nakov @ BGWebSummit 2018)
 
FNTL deck
FNTL deck  FNTL deck
FNTL deck
 
Airbyte - Series-A deck
Airbyte - Series-A deckAirbyte - Series-A deck
Airbyte - Series-A deck
 

Similar to HPDC'23 Rapidgzip

bup backup system (2011-04)
bup backup system (2011-04)bup backup system (2011-04)
bup backup system (2011-04)apenwarr
 
Relations between archive formats
Relations between archive formatsRelations between archive formats
Relations between archive formatsAnge Albertini
 
Advanced Debugging with GDB
Advanced Debugging with GDBAdvanced Debugging with GDB
Advanced Debugging with GDBDavid Khosid
 
Gl tf siggraph-2013
Gl tf siggraph-2013Gl tf siggraph-2013
Gl tf siggraph-2013Khaled MAMOU
 
Debugging Applications with GNU Debugger
Debugging Applications with GNU DebuggerDebugging Applications with GNU Debugger
Debugging Applications with GNU DebuggerPriyank Kapadia
 
“Show Me the Garbage!”, Garbage Collection a Friend or a Foe
“Show Me the Garbage!”, Garbage Collection a Friend or a Foe“Show Me the Garbage!”, Garbage Collection a Friend or a Foe
“Show Me the Garbage!”, Garbage Collection a Friend or a FoeHaim Yadid
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightGluster.org
 
Troubleshooting .net core on linux
Troubleshooting .net core on linuxTroubleshooting .net core on linux
Troubleshooting .net core on linuxPavel Klimiankou
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderGregSmith458515
 
OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...
OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...
OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...NETWAYS
 
Container con toronto
Container con torontoContainer con toronto
Container con torontoDan Lambright
 
HadoopCompression
HadoopCompressionHadoopCompression
HadoopCompressionDemet Aksoy
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseEric Evans
 
High Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go AssemblyHigh Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go AssemblyMinio
 
Info gdal 20150915
Info gdal 20150915Info gdal 20150915
Info gdal 20150915GeoMedeelel
 
ELC-E Linux Awareness
ELC-E Linux AwarenessELC-E Linux Awareness
ELC-E Linux AwarenessPeter Griffin
 
SEMLA_logging_infra
SEMLA_logging_infraSEMLA_logging_infra
SEMLA_logging_infraswy351
 

Similar to HPDC'23 Rapidgzip (20)

bup backup system (2011-04)
bup backup system (2011-04)bup backup system (2011-04)
bup backup system (2011-04)
 
Cram
CramCram
Cram
 
Relations between archive formats
Relations between archive formatsRelations between archive formats
Relations between archive formats
 
Advanced Debugging with GDB
Advanced Debugging with GDBAdvanced Debugging with GDB
Advanced Debugging with GDB
 
Gl tf siggraph-2013
Gl tf siggraph-2013Gl tf siggraph-2013
Gl tf siggraph-2013
 
Debugging Applications with GNU Debugger
Debugging Applications with GNU DebuggerDebugging Applications with GNU Debugger
Debugging Applications with GNU Debugger
 
“Show Me the Garbage!”, Garbage Collection a Friend or a Foe
“Show Me the Garbage!”, Garbage Collection a Friend or a Foe“Show Me the Garbage!”, Garbage Collection a Friend or a Foe
“Show Me the Garbage!”, Garbage Collection a Friend or a Foe
 
Challenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan LambrightChallenges with Gluster and Persistent Memory with Dan Lambright
Challenges with Gluster and Persistent Memory with Dan Lambright
 
lab1-ppt.pdf
lab1-ppt.pdflab1-ppt.pdf
lab1-ppt.pdf
 
Troubleshooting .net core on linux
Troubleshooting .net core on linuxTroubleshooting .net core on linux
Troubleshooting .net core on linux
 
Speedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql LoaderSpeedrunning the Open Street Map osm2pgsql Loader
Speedrunning the Open Street Map osm2pgsql Loader
 
OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...
OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...
OSDC 2015: Roland Kammerer | DRBD9: Managing High-Available Storage in Many-N...
 
Container con toronto
Container con torontoContainer con toronto
Container con toronto
 
HadoopCompression
HadoopCompressionHadoopCompression
HadoopCompression
 
Wikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-caseWikimedia Content API: A Cassandra Use-case
Wikimedia Content API: A Cassandra Use-case
 
High Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go AssemblyHigh Performance Scaling Techniques in Golang Using Go Assembly
High Performance Scaling Techniques in Golang Using Go Assembly
 
Info gdal 20150915
Info gdal 20150915Info gdal 20150915
Info gdal 20150915
 
GDB Rocks!
GDB Rocks!GDB Rocks!
GDB Rocks!
 
ELC-E Linux Awareness
ELC-E Linux AwarenessELC-E Linux Awareness
ELC-E Linux Awareness
 
SEMLA_logging_infra
SEMLA_logging_infraSEMLA_logging_infra
SEMLA_logging_infra
 

Recently uploaded

OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...NETWAYS
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024eCommerce Institute
 
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝soniya singh
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...NETWAYS
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkataanamikaraghav4
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...NETWAYS
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@vikas rana
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AITatiana Gurgel
 
Philippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.pptPhilippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.pptssuser319dad
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesPooja Nehwal
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...NETWAYS
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )Pooja Nehwal
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfhenrik385807
 
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxGenesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxFamilyWorshipCenterD
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...NETWAYS
 
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Salam Al-Karadaghi
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfhenrik385807
 
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrSaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrsaastr
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Kayode Fayemi
 

Recently uploaded (20)

OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
OSCamp Kubernetes 2024 | Zero-Touch OS-Infrastruktur für Container und Kubern...
 
George Lever - eCommerce Day Chile 2024
George Lever -  eCommerce Day Chile 2024George Lever -  eCommerce Day Chile 2024
George Lever - eCommerce Day Chile 2024
 
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
Call Girls in Sarojini Nagar Market Delhi 💯 Call Us 🔝8264348440🔝
 
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
OSCamp Kubernetes 2024 | A Tester's Guide to CI_CD as an Automated Quality Co...
 
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls KolkataRussian Call Girls in Kolkata Vaishnavi 🤌  8250192130 🚀 Vip Call Girls Kolkata
Russian Call Girls in Kolkata Vaishnavi 🤌 8250192130 🚀 Vip Call Girls Kolkata
 
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
Open Source Camp Kubernetes 2024 | Monitoring Kubernetes With Icinga by Eric ...
 
call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@call girls in delhi malviya nagar @9811711561@
call girls in delhi malviya nagar @9811711561@
 
Microsoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AIMicrosoft Copilot AI for Everyone - created by AI
Microsoft Copilot AI for Everyone - created by AI
 
Philippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.pptPhilippine History cavite Mutiny Report.ppt
Philippine History cavite Mutiny Report.ppt
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
OSCamp Kubernetes 2024 | SRE Challenges in Monolith to Microservices Shift at...
 
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Rohini Delhi 💯Call Us 🔝8264348440🔝
 
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
WhatsApp 📞 9892124323 ✅Call Girls In Juhu ( Mumbai )
 
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdfCTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
CTAC 2024 Valencia - Henrik Hanke - Reduce to the max - slideshare.pdf
 
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptxGenesis part 2 Isaiah Scudder 04-24-2024.pptx
Genesis part 2 Isaiah Scudder 04-24-2024.pptx
 
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
Open Source Camp Kubernetes 2024 | Running WebAssembly on Kubernetes by Alex ...
 
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
Exploring protein-protein interactions by Weak Affinity Chromatography (WAC) ...
 
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdfOpen Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
Open Source Strategy in Logistics 2015_Henrik Hankedvz-d-nl-log-conference.pdf
 
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStrSaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
SaaStr Workshop Wednesday w: Jason Lemkin, SaaStr
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 

HPDC'23 Rapidgzip

  • 1. HPDC’ 23 Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Technische Universität Dresden
  • 2. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 2 of 20 Motivation Accessing huge datasets, e.g., from academictorrents.com: ● wikidata-20220103-all.json.gz: gzip-compressed JSON, 109 GB, 1.4 TB uncompressed ● ImageNet21K: gzip-compressed TAR archive, 1.2 TB, 14 million images averaging 9 KiB. Solutions: ● : Random access TAR mount. Make (huge) archives’ contents available via FUSE. ● , indexed_bzip2: Backends for ratarmount for parallel decompression and fast seeking inside compressed gzip and bzip2 files. They also offer command line tools for parallelized decompression.
  • 3. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 3 of 20 Requirements Data Stream Data Stream Data Stream ● Parallelize gzip decompression ● Without additional metadata ● After any seeking ● For concurrent accesses at two offsets ● Decompress all kinds of gzip files ● Enable fast backward and forward seeking ● After the index has been created ● While the index has only been partially created ● Usable as a (Python-)library
  • 4. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 4 of 20 ● : Random access parallel (indexed) decompression for gzip files ● Parallel decompression of gzip files ● Decompression is faster and less memory intensive with an existing index ● The index enables seeking without having to start decompression from the file beginning ● Header-only C++ library with Python bindings: > pip install rapidgzip ● Also a has a command line interface that can be used as a drop-in replacement for decompression: gzip -d → rapidgzip -d ● Not for compression ● https://github.com/mxmlnkn/rapidgzip Decompression benchmarks on a 12 GB FASTQ file using 64 cores of an AMD EPYC 7702 @ 2.0 GHz processor. g z i p p i g z i g z i p p u g z r a p i d g z i p r a p i d g z i p ( i n d e x ) 0 2 4 6 8 10 12 14 Bandwidth / (GB/s) 0.177 0.301 0.78 1.4 5.3 13.1 Introducing rapidgzip 74× 30×
  • 5. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 5 of 20 Tools Overview ● 1992: gzip by Jean-loup Gailly and Mark Adler ● 2005: zlib/examples/zran.c example by Mark Adler ● Shows how to resume decompression in the middle of a gzip stream ● 2007: pigz (parallel implementation of gzip) by Mark Adler ● Compresses in parallel ● 2008: Blocked GNU Zip Format (BGZF) and the command line tool bgzip, part of HTSlib ● Compresses in parallel to gzip files with additional metadata ● Can decompress files containing such metadata in parallel ● James K. Bonfield and others, "HTSlib: C library for reading/writing high-throughput sequencing data", GigaScience, Volume 10, Issue 2, February 2021 ● 2016: indexed_gzip: Python module for random access based on zran.c ● 2019: pugz ● Can decompress gzip-compressed files in parallel if it only contains characters 9–126 ● Kerbiriou, Maël, and Rayan Chikhi. "Parallel decompression of gzip-compressed files and random access to DNA sequences." 2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2019. → What makes parallel decompression so difficult?
  • 6. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 6 of 20 Challenges for Parallel Decompression: ● Find deflate block start offsets in bit stream ● Handling references to unknown data → Two-Staged Decompression as introduced by Kerbiriou and Chikhi (2019) ● Non-resolvable references result in markers that get resolved in the second stage after the 32 KiB window has become known Deflate Compression ⎵ o 0 1 0 1 1 0 How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵ if⎵a⎵woodchuck⎵could⎵chuck⎵wood? LZSS How⎵much⎵wood⎵would⎵a(13,5)chuck⎵(6,6) if(21,13)c(39,5)(12,6)(22,4)? Huffman Coding 11110|10|11111|0|11101|... H o w ⎵ m Deflate Compression First Stage How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵ if 20 21 22 23 24 25 26 27 28 29 30 31 32 c 16 17 18 19 20 chuck⎵wood? 1 10 20 30 if(21,13)c(39,5)(12,6)(22,4)? Second Stage Let Thread 2 Decompress Line 2 if⎵a⎵woodchuck⎵could⎵chuck⎵wood? Marker Replacement Window For Line 2 Distance Length Huffman Tree
  • 7. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 7 of 20 Parallelized Deflate Decompression ⎵ o 0 1 0 1 1 0 How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵ if⎵a⎵woodchuck⎵could⎵chuck⎵wood? LZSS How⎵much⎵wood⎵would⎵a(13,5)chuck⎵(6,6) if(21,13)c(39,5)(12,6)(22,4)? Huffman Coding 11110|10|11111|0|11101|... H o w ⎵ m Deflate Compression First Stage How⎵much⎵wood⎵would⎵a⎵woodchuck⎵chuck⎵ if 20 21 22 23 24 25 26 27 28 29 30 31 32 c 16 17 18 19 20 chuck⎵wood? 1 10 20 30 if(21,13)c(39,5)(12,6)(22,4)? Second Stage Let Thread 2 Decompress Line 2 if⎵a⎵woodchuck⎵could⎵chuck⎵wood? Marker Replacement Window For Line 2 Distance Length Huffman Tree Challenges for Parallel Decompression: ● Find offsets in bit stream to start decompression from ● Handling references to unknown data → Two-Staged Decompression as introduced by Kerbiriou and Chikhi (2019) ● Non-resolvable references result in markers that get resolved in the second stage after the 32 KiB window has become known
  • 8. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 8 of 20 Granularities for Parallelization Gzip File Gzip Stream Gzip Stream ... Deflate Stream Deflate Block Deflate Block ... Deflate Stream Gzip Stream Gzip Header Gzip Footer Deflate Block Final Block Flag Block Data Block Type Block Type: 00 Non-Compressed Length Non-Compressed Data ~Length Block Type: 01 Dynamic Huffman Tree Lengths Compressed Data Huffman Trees Block Type: 10 Fixed Huffman Compressed Data Often, there is only one Gzip stream per file. Parallelize decompression of Deflate blocks
  • 9. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 9 of 20 Redundancies that Help in Finding Deflate Blocks Gzip File Gzip Stream Gzip Stream ... Deflate Stream Deflate Block Deflate Block ... Deflate Stream Gzip Stream Gzip Header Gzip Footer Deflate Block Final Block Flag Block Data Block Type Block Type: 00 Non-Compressed Length Non-Compressed Data ~Length Block Type: 01 Dynamic Huffman Tree Lengths Compressed Data Huffman Trees Block Type: 10 Fixed Huffman Compressed Data Fixed Huffman Blocks are mostly used for very small files and the tail end of streams.
  • 10. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 10 of 20 How Often Will Valid-Looking Block Headers be Found in Random Data? Invalid Non-optimal Valid and optimal Code Lengths: A: 1, B: 1, C: 1 Code Lengths: A: 2, B: 2, C: 2 Code Lengths: A: 2, B: 2, C: 1 ● Pugz reduces false positives further by checking that the decompressed data only contains characters in the range 9-126, an assumption made for FASTQ files ● Deflate Blocks with Fixed Huffman codings: Ignore because they are rare ● Non-Compressed Deflate blocks: Look for 16-bit lengths and their one’s complement → 1 false positive per 525 kB ● Deflate Blocks with Dynamic Huffman codings: Look for valid Deflate block headers and valid and optimal Huffman codings. ~200 offsets pass this test given 1 Tbits of random data → 1 false positive per 625 MB
  • 11. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 11 of 20 False Positives Can Be Crafted Using Non-Compressed Blocks Example: Consider a gzip file that is compressed a second time. Deflate Stream example.gz.gz Gzip Header Gzip Footer Length Non-Compressed Data ~Length Final Block Flag Block Type: Non-Compressed Deflate Stream example.gz Gzip Header Gzip Footer Block Data Final Block Flag Block Type Block Boundary of Interest False Positive Sequences inside the compressed Block Data might be recognized as Deflate block headers → These cases need to be detected and handled
  • 12. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 12 of 20 Implementation C₁ 0 100 200 300 C₂ C₃ C₄ ① Request chunk at offset 0 Window First Block Offset False Positive Thread Pool C₁ C₃ Guessed Chunk Boundary C₄ C₂ C₁ C₃ C₄ C₂ Marker Compressed Input Data ② Dispatch ③ Prefetch ④ Find block start ⑤ First-stage decompression Cache 0 Offset Chunk 111 220 305 C₁ C₂ C₃ C₄ ⑥ Periodically check for ready chunks and move them into the cache until C₁ has become ready ⑧ Return decompressed C₁ Index 0 Offset Size Win. Dec. Size 111 340 ⑦ Add C₁ information to the index ⑦ C₁ 0 100 200 300 C₂ C₃ C₄ ① Request chunk at 111 directly after C₁ ② Found chunk in cache but with markers ③ Resolve markers in windows C₁ C₂ C₃ C₄ No prefetched chunk matches the end offset of C₃. The chunk after will be decompressed on demand. ⑤ Resolve the markers inside each chunk in parallel using the thread pool ④ Add resolved windows to the index Index 0 Offset Size Win. Dec. Size 111 340 111 109 421 220 87 123 111 220 305 327 Cache 0 Offset Chunk 111 220 305 C₁ C₂ C₃ C₄ ⑥ Return decompressed C₂ ● Prefetching to generate work that can be processed in parallel ● Thread pool for work balancing ● Cache to speed up seeking and concurrent decompression ● Use block offset as cache key to catch false positives ● On-demand cache fill to recover from errors
  • 13. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 13 of 20 Implementation C₁ 0 100 200 300 C₂ C₃ C₄ ① Request chunk at offset 0 Window First Block Offset False Positive Thread Pool C₁ C₃ Chunk Boundary C₄ C₂ C₁ C₃ C₄ C₂ Marker Compressed Input Data ② Dispatch Chunk ③ Prefetch Chunk ④ Find block start ⑤ First-stage decompression Cache 0 Offset Chunk 111 220 305 C₁ C₂ C₃ C₄ ⑥ Periodically check for ready chunks and move them into the cache until C₁ has become ready ⑧ Return decompressed C₁ Index 0 Offset Size Win. Dec. Size 111 340 ⑦ Add C₁ information to the index ⑦ C₁ 0 100 200 300 C₂ C₃ C₄ ① Request chunk at 111 directly after C₁ ② Found chunk in cache but with markers ③ Resolve markers in windows C₁ C₂ C₃ C₄ No prefetched chunk matches the end offset of C₃. The chunk after will be decompressed on demand. ⑤ Resolve the markers inside each chunk in parallel using the thread pool ④ Add resolved windows to the index Index 0 Offset Size Win. Dec. Size 111 340 111 109 421 220 107 123 111 220 305 327 Cache 0 Offset Chunk 111 220 305 C₁ C₂ C₃ C₄ ⑥ Return decompressed C₂ Chunk 4 ● Prefetching to generate work that can be processed in parallel ● Thread pool for work balancing ● Cache to speed up seeking and concurrent decompression ● Use block offset as cache key to catch false positives ● On-demand cache fill to recover from errors
  • 14. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 14 of 20 Implementation C₁ 0 100 200 300 C₂ C₃ C₄ ① Request chunk at offset 0 Window First Block Offset False Positive Thread Pool C₁ C₃ Chunk Boundary C₄ C₂ C₁ C₃ C₄ C₂ Marker Compressed Input Data ② Dispatch Chunk ③ Prefetch Chunk ④ Find block start ⑤ First-stage decompression Cache 0 Offset Chunk 111 220 305 C₁ C₂ C₃ C₄ ⑥ Periodically check for ready chunks and move them into the cache until C₁ has become ready ⑧ Return decompressed C₁ Index 0 Offset Size Win. Dec. Size 111 340 ⑦ Add C₁ information to the index ⑦ C₁ 0 100 200 300 C₂ C₃ C₄ ① Request chunk at 111 directly after C₁ ② Found chunk in cache but with markers ③ Resolve markers in windows C₁ C₂ C₃ C₄ No prefetched chunk matches the end offset of C₃. The chunk after will be decompressed on demand. ⑤ Resolve the markers inside each chunk in parallel using the thread pool ④ Add resolved windows to the index Index 0 Offset Size Win. Dec. Size 111 340 111 109 421 220 107 123 111 220 305 327 Cache 0 Offset Chunk 111 220 305 C₁ C₂ C₃ C₄ ⑥ Return decompressed C₂ Chunk 4 ● Prefetching to generate work that can be processed in parallel ● Thread pool for work balancing ● Cache to speed up seeking and concurrent decompression ● Use block offset as cache key to catch false positives ● On-demand cache fill to recover from errors
  • 15. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 15 of 20 Optimal Chunk Size 0.1250.25 0.5 1 2 4 8 16 32 64 128 256 512 1024 Chunk Size / MiB 1000 2000 1500 3000 Bandwidth / (MB/s) rapidgzip pugz 49808 12452 3113 779 390 195 98 49 25 13 7 Theoretical Number of Chunks ● For smaller chunk sizes, the block finder overhead leads to worse performance ● Larger chunk sizes lead to work balancing issues and also might adversely affect the cache behavior and allocation speed Decompression bandwidth using 16 cores and a 6.08 GiB test file, which decompresses to 8 GiB. Optimal chunk sizes: Rapidgzip: 4-8 MiB, Pugz: 32-64 MiB DBF ... Dynamic Block Finder NBF ... Non-Compressed Block Finder 3.8×
  • 16. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 16 of 20 Decompression Benchmark: FASTQ File Weak-scaling and writing the results to /dev/null ● gzip is the slowest, half as slow as zlib-based pigz ● rapidgzip with an index scales up to 20 GB/s, without an index up to 5 GB/s ● pugz tops out at 1.4 GB/s and crashes for 96+ cores ● igzip by Intel shows leading single-core performance ● pigz does not parallelize 1 2 3 4 6 8 12 16 24 32 48 64 96 128 Number of Cores 100 1000 500 200 10000 5000 2000 20000 Bandwidth / (MB/s) rapidgzip (index) rapidgzip (no index) linear scal. (no index) pugz pigz igzip gzip
  • 17. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 18 of 20 0 2 4 6 8 10 12 Decompression Bandwidth / (GB/s) pigz -9 pigz -6 pigz -3 pigz -1 igzip -3 igzip -2 igzip -1 igzip -0 gzip -9 gzip -6 gzip -3 gzip -1 bgzip -l 9 bgzip -l 6 bgzip -l 3 bgzip -l 0 bgzip -l -1 Tool Used for Compression 3.73 3.76 3.81 3.82 6.52 6.42 6.15 0.1586 5.03 5.17 5.55 6.05 5.64 5.67 5.9 10.6 5.65 Benchmark: Various Gzip Compressors Rapidgzip Decompression → ● rapidgzip can parallelize decompression for gzip files produced with a wide variety of tools and compression levels ● Contains only Non-Compressed Deflate blocks so that decompression is reduced to a fast copy and some accounting ● Contains only a single Deflate block with Fixed Huffman coding and therefore cannot be parallelized
  • 18. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 19 of 20 0.0 2.5 5.0 7.5 10.0 12.5 15.0 17.5 20.0 Decompression Bandwidth / (GB/s) pzstd pzstd gzip rapidgzip gzip rapidgzip bgzip bgzip pzstd pzstd zstd pzstd gzip rapidgzip gzip rapidgzip gzip bgzip bgzip bgzip pzstd pzstd zstd pzstd zstd zstd gzip igzip gzip rapidgzip gzip bgzip bgzip bgzip Compression Tool Decompression Tool 8.8 16.43 5.13 5.5 6.78 0.882 4.25 1.86 0.3017 2.82 0.811 0.816 0.82 0.656 0.1527 0.2965 0.2977 1 core 16 cores 128 cores Benchmark: Competing File Formats ● igzip is surprisingly competitive to zstd ● zstd is the fastest in single-core decompression ● bgzip and pzstd can only decompress gzip/zstd files produced by themselves in parallel ● zstd-compressed (TAR) files are not eligible for random access and parallel decompression. pzstd is recommended instead to create multi-frame Zstandard files. ● Applying the rapidgzip approach to arbitrary Zstandard files might be infeasible because their window size is not limited to 32 KiB. 3× +25% ⁱ rapidgzip decompression with an existing index
  • 19. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 21 of 20 Improvements Since Submission ● High memory usage has been alleviated by limiting the decompressed chunk size Worst case (compression ratio ~1000) was: ~ 9 GB per thread now: ~ 200 MB per thread (configurable) ● The Inflate implementation has been improved for high compression ratios → 25 % faster for Silesia by using memcpy/memset for long references ● CRC32 computation has been added The slice-by-16 algorithm has been implemented and parallelized using crc32_combine. → Achieves ~ 4 GB/s per core (~ 6 % overhead independent of parallelism)
  • 20. Rapidgzip: Parallel Decompression and Seeking in Gzip Files Using Cache Prefetching Maximilian Knespel, Holger Brunst Slide 22 of 20 ● We have shown that the specialized approach for parallelized gzip decompression introduced by Kerbiriou and Chikhi (2019) can be generalized without affecting performance and stability. ● Our architecture achieves better performance, scales to more cores, adds robustness against false positives, and also increases versatility by adding fast seeking capabilities. ● An index is created internally on first time decompression and it can be exported and imported to speed up subsequent decompression and seeking. ● Can be used with ratarmount to mount .tar.gz archives. ● Available at https://github.com/mxmlnkn/rapidgzip Decompression benchmarks on a 12 GB FASTQ file using 64 cores of an AMD EPYC 7702 @ 2.0 GHz processor. g z i p p i g z i g z i p p u g z r a p i d g z i p r a p i d g z i p ( i n d e x ) 0 2 4 6 8 10 12 14 Bandwidth / (GB/s) 0.177 0.301 0.78 1.4 5.3 13.1 Summary 74× 30×