How GZIP works... in 10 minutes

Slides of the talk at the deSymfonyDay unconference

Published in: Technology

  1. How GZIP Compression Works …in 10 minutes. Raul Fraile
  2. About me
       • PHP/Symfony2 developer at
       • PHP 5.3 Zend Certified Engineer
       • Symfony Certified Developer
       • BS in Computer Science. Ms(Res) student in Computing Technologies.
       • Open source: LadybugPHP
  3. What is GZIP?
       • GZIP is a lossless compression method: the original data can be recovered exactly after decompression.
       • It has become the de facto lossless compression method for textual data on websites.
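The "lossless" claim on slide 3 can be checked with a round trip. This is a minimal sketch, assuming PHP's zlib extension is available; the sample text is made up for illustration.

```php
<?php
// Sample text is an assumption for illustration; any string works.
$original = str_repeat("GZIP is a lossless compression method.\n", 50);

$compressed = gzencode($original, 9);   // gzip container, maximum level
$restored   = gzdecode($compressed);    // back to the original bytes

printf("%d bytes -> %d bytes\n", strlen($original), strlen($compressed));
var_dump($restored === $original);      // bool(true): nothing was lost
```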
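  4. What is GZIP? (diagram) The browser requests GET index.html from the web server and sends the header Accept-Encoding: gzip to signal that it accepts a gzip-compressed response.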
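The exchange on slide 4 could look roughly like this on the server side. It is a simplified sketch, not a full content negotiation: the function names clientAcceptsGzip and encodeBody are hypothetical, the "*" wildcard is ignored, and only an explicit q=0 is treated as a refusal.

```php
<?php
// Hypothetical helper: does the Accept-Encoding header allow gzip?
function clientAcceptsGzip(string $acceptEncoding): bool
{
    foreach (explode(',', $acceptEncoding) as $item) {
        $parts = array_map('trim', explode(';', $item));
        if (strtolower($parts[0]) !== 'gzip') {
            continue;
        }
        // "gzip;q=0" means the client explicitly refuses gzip.
        foreach (array_slice($parts, 1) as $param) {
            if (preg_match('/^q\s*=\s*0(\.0{0,3})?$/i', $param)) {
                return false;
            }
        }
        return true;
    }
    return false;
}

// Hypothetical helper: returns [headers, body] for the response.
function encodeBody(string $body, string $acceptEncoding): array
{
    $headers = ['Vary: Accept-Encoding'];
    if (clientAcceptsGzip($acceptEncoding)) {
        $headers[] = 'Content-Encoding: gzip';
        $body = gzencode($body, 6);
    }
    return [$headers, $body];
}
```

In a real script the header value would come from $_SERVER['HTTP_ACCEPT_ENCODING'] ?? '', and each returned header would be sent with header() before echoing the body. In practice the web server usually does this itself.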
  5. How does it work?
       • It is based on the DEFLATE algorithm, which combines LZ77 and Huffman coding.
       • First, the LZ77 algorithm replaces repeated occurrences of data with references to an earlier occurrence.
       • Second, Huffman coding assigns shorter codes to more frequent “characters”.
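That GZIP is "based on DEFLATE" can be seen by taking a gzip stream apart. The sketch below assumes the zlib extension and assumes the header written by gzencode has no optional fields, so it is exactly 10 bytes; the sample text is made up.

```php
<?php
// Sample text is an assumption for illustration.
$data = str_repeat("That's because the file is not compressed. ", 20);
$gzip = gzencode($data, 9);

$header  = substr($gzip, 0, 10);                     // fixed-size gzip header
$deflate = substr($gzip, 10, -8);                    // raw DEFLATE stream
$trailer = unpack('Vcrc/Vsize', substr($gzip, -8));  // CRC-32 and original size

var_dump(bin2hex(substr($header, 0, 3)));      // "1f8b08": gzip magic + DEFLATE method
var_dump(gzinflate($deflate) === $data);       // the payload is plain DEFLATE
var_dump($trailer['crc'] === crc32($data));    // checksum of the original data
var_dump($trailer['size'] === strlen($data));  // length of the original data
```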
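  6. How does it work? LZ77 example
       Text: This file is huge! That's because the file is not compressed
       The second occurrence of " file is " (9 characters, including the surrounding spaces) is replaced by the reference <33, 9>: go back 33 characters and copy 9.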
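The <33, 9> reference on slide 6 can be reproduced with a brute-force match search. This is a toy sketch of the LZ77 idea, not how real gzip implementations search (they use hash chains); the function name longestMatch is hypothetical, while the 32 KiB window and 258-byte maximum match are DEFLATE's limits.

```php
<?php
// Hypothetical helper: finds the longest earlier match for the data at $pos.
// Returns [distance, length]; [0, 0] means no match was found.
function longestMatch(string $data, int $pos, int $window = 32768, int $maxLen = 258): array
{
    $n = strlen($data);
    $best = [0, 0];
    for ($candidate = max(0, $pos - $window); $candidate < $pos; $candidate++) {
        $length = 0;
        while ($pos + $length < $n
            && $length < $maxLen
            && $data[$candidate + $length] === $data[$pos + $length]) {
            $length++;
        }
        // ">=" prefers the nearest candidate when two matches tie in length.
        if ($length > 0 && $length >= $best[1]) {
            $best = [$pos - $candidate, $length];
        }
    }
    return $best;
}

$text = "This file is huge! That's because the file is not compressed";
$pos  = strpos($text, ' file is ', 10);   // start of the second occurrence
print_r(longestMatch($text, $pos));       // distance 33, length 9
```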
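  7. How does it work? Huffman coding of “compressed”
       char  count  8-bit ASCII  Huffman code
       c     1      01100011     1100
       o     1      01101111     011
       m     1      01101101     010
       p     1      01110000     000
       r     1      01110010     001
       e     2      01100101     111
       s     2      01110011     10
       d     1      01100100     1101
       ASCII:   01100011 01101111 01101101 01110000 01110010 01100101 01110011 01110011 01100101 01100100 (80 bits)
       Huffman: 1100 011 010 000 001 111 10 10 111 1101 (30 bits)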
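The Huffman step on slide 7 can be sketched by computing code lengths only. This is an illustrative sketch with a hypothetical function name, huffmanCodeLengths. Huffman trees are not unique when frequencies tie, so the individual lengths may differ from the slide, but the total for “compressed” is 30 bits either way.

```php
<?php
// Hypothetical helper: returns [symbol => code length in bits].
// Note: an input with a single distinct symbol yields length 0 here.
function huffmanCodeLengths(string $text): array
{
    $heap = new SplMinHeap();
    $lengths = [];
    $order = 0; // unique tie-breaker so heap entries never compare equal
    foreach (count_chars($text, 1) as $byte => $count) {
        $lengths[chr($byte)] = 0;
        $heap->insert([$count, $order++, [chr($byte)]]);
    }
    // Repeatedly merge the two lightest subtrees; every symbol below
    // a merge ends up one bit deeper in the tree.
    while ($heap->count() > 1) {
        [$weightA, , $symbolsA] = $heap->extract();
        [$weightB, , $symbolsB] = $heap->extract();
        $merged = array_merge($symbolsA, $symbolsB);
        foreach ($merged as $symbol) {
            $lengths[$symbol]++;
        }
        $heap->insert([$weightA + $weightB, $order++, $merged]);
    }
    return $lengths;
}

$word = 'compressed';
$lengths = huffmanCodeLengths($word);
$totalBits = 0;
foreach (str_split($word) as $char) {
    $totalBits += $lengths[$char];
}
echo $totalBits, " bits instead of ", 8 * strlen($word), "\n"; // 30 bits instead of 80
```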
  8. Why GZIP?
       • GZIP is not the best compression method, but there are a few good reasons to use it.
       • It provides a good tradeoff between speed and compression ratio.
       • Newer compression methods are difficult to adopt.
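The speed/ratio tradeoff can be explored by compressing the same input at different levels. A rough sketch, assuming the zlib extension; the function name gzipSizesByLevel and the sample data are made up, and timings from a single run like this are only indicative.

```php
<?php
// Hypothetical helper: returns [level => compressed size in bytes].
function gzipSizesByLevel(string $data, array $levels = [1, 6, 9]): array
{
    $sizes = [];
    foreach ($levels as $level) {
        $start = microtime(true);
        $compressed = gzencode($data, $level);
        $elapsedMs = (microtime(true) - $start) * 1000;
        printf("level %d: %d bytes in %.3f ms\n", $level, strlen($compressed), $elapsedMs);
        $sizes[$level] = strlen($compressed);
    }
    return $sizes;
}

// Sample data is an assumption; a real test would use an actual asset,
// such as the jQuery file from slide 10.
$data = str_repeat("How GZIP works... in 10 minutes\n", 2000);
$sizes = gzipSizesByLevel($data);
```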
  9. Implementations: GNU gzip, 7-Zip, Zopfli. Different implementations, different results.
  10. GZIP + PHP
        $originalFile = __DIR__ . '/jquery-1.11.0.min.js';
        $gzipFile = __DIR__ . '/jquery-1.11.0.min.js.gz';

        $originalData = file_get_contents($originalFile);
        $gzipData = gzencode($originalData, 9);
        file_put_contents($gzipFile, $gzipData);

        var_dump(filesize($originalFile)); // int(96380)
        var_dump(filesize($gzipFile)); // int(33305)
  11. Beyond GZIP
        • Preprocessing the text can have an impact on the compression ratio.
        • How? By optimizing matches.
  12. Beyond GZIP
  13. Beyond GZIP: transposing JSON
        Before:
          { "name": "Raul", "country": "Spain" },
          { "name": "Pablo", "country": "USA" },
          { "name": "Pedro", "country": "Spain" }
        After:
          { "name": [ "Raul", "Pablo", "Pedro" ], "country": [ "Spain", "USA", "Spain" ] }
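The transposition on slide 13 can be sketched in a few lines. The function name transposeRecords is hypothetical, and the sketch assumes every record has the same keys.

```php
<?php
// Hypothetical helper: turns a list of records into one array per key.
function transposeRecords(array $records): array
{
    $columns = [];
    foreach ($records as $record) {
        foreach ($record as $key => $value) {
            $columns[$key][] = $value;
        }
    }
    return $columns;
}

$records = [
    ['name' => 'Raul',  'country' => 'Spain'],
    ['name' => 'Pablo', 'country' => 'USA'],
    ['name' => 'Pedro', 'country' => 'Spain'],
];
echo json_encode(transposeRecords($records)), "\n";
// {"name":["Raul","Pablo","Pedro"],"country":["Spain","USA","Spain"]}
```

The consumer has to reverse the transposition after decoding, so the gain is worth measuring on real payloads with gzencode before adopting it.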
  14. Beyond GZIP: ordering XML/HTML attributes
        Mixed quote styles and attribute order (17.76 %):
          <input id='f1' class='field' name="f1" type="text" />
          <input class="field" id="f2" type="text" name="f2" />
        Consistent quotes (27.10 %):
          <input id="f1" class="field" name="f1" type="text" />
          <input class="field" id="f2" type="text" name="f2" />
        Consistent quotes and attribute order (38.32 %):
          <input id="f1" class="field" name="f1" type="text" />
          <input id="f2" class="field" name="f2" type="text" />
        Consistent quotes, different but still consistent attribute order (38.32 %):
          <input type="text" class="field" id="f1" name="f1" />
          <input type="text" class="field" id="f2" name="f2" />
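The normalization on slide 14 could be automated along these lines. This is a regex-based sketch for simple tags only, with a hypothetical function name normalizeAttributes. It sorts attributes alphabetically and forces double quotes; unquoted or valueless attributes are not handled, and a real tool would use an HTML parser.

```php
<?php
// Hypothetical helper: rewrites one tag with sorted, double-quoted attributes.
function normalizeAttributes(string $tag): string
{
    if (!preg_match('/^<(\w+)(.*?)(\s*\/?)>$/s', $tag, $m)) {
        return $tag; // not a simple opening tag: leave untouched
    }
    preg_match_all('/([\w-]+)\s*=\s*(["\'])(.*?)\2/s', $m[2], $attrs, PREG_SET_ORDER);

    $pairs = [];
    foreach ($attrs as $attr) {
        // Escape double quotes that were legal inside a single-quoted value.
        $pairs[$attr[1]] = str_replace('"', '&quot;', $attr[3]);
    }
    ksort($pairs); // same attribute order in every tag

    $out = '<' . $m[1];
    foreach ($pairs as $name => $value) {
        $out .= ' ' . $name . '="' . $value . '"';
    }
    $selfClosing = strpos($m[3], '/') !== false;
    return $out . ($selfClosing ? ' />' : '>');
}

echo normalizeAttributes("<input id='f1' class='field' name=\"f1\" type=\"text\" />"), "\n";
echo normalizeAttributes('<input class="field" id="f2" type="text" name="f2" />'), "\n";
// <input class="field" id="f1" name="f1" type="text" />
// <input class="field" id="f2" name="f2" type="text" />
```

After normalization the two tags differ only in the id and name values, which gives LZ77 longer repeated runs to reference.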
  15. Thank you!