Data Compression 2020

Presented at the 2nd GameTech Meetup in Vienna.


Data Compression 2020

  • 1. Smaller is better: Data Compression. Dietmar Hauser | roborodent e.U. | 2020
  • 2. Why data size matters: Save money; Initial download / updates; Continuous connections; Expand reach; Decreased loading times; Smaller app size
  • 3. Isn’t this handled by the platform? Little incentive; “Good enough” attitude; CPU / Memory gap; Bandwidth / Fidelity gap; Stand out from competition!
  • 4. Compression in theory: Wikipedia: “[...] encoding information using fewer bits than the original representation”
  • 5. Two flavours of compression: Lossless (all information is retained); Lossy (an approximation is retained)
  • 6. History & Concepts: Information Theory, ~1948; Claude Shannon; Entropy; Shannon limit
  • 7. History & Concepts: Prefix code, ~1952; Variable length code; Translated with a dictionary; Constructed with Huffman tree; Fast and efficient; Still used today
  • 8. History & Concepts: Lempel-Ziv, 1977; Base for the LZ-family; Refers back to already processed data; “Sliding Window”; Implicit dictionary creation
  • 9. History & Concepts: Deflate, 1991; LZ77 + Huffman; Used everywhere! (http://zlib.net); 29 years old!
  • 12. Smaller Apps: Platform owners enforce package format (.apk, .ipa, .appx, …); Actually just .zip files; Built-in compression far from optimal; Compress before packaging; Bonus: Less storage space used!
  • 13. Smaller Apps, Textures: Best compression: JPEG (or H.26X); Most pitfalls: PNG; Don’t use Photoshop output for final images!; Use compressed texture formats if possible; Don’t forget to apply regular compression; Consider custom image format
  • 15. Smaller Apps, Textures – Teh Future: RDO – Rate-distortion optimization; Crunch (https://github.com/BinomialLLC/crunch); Transcoding between compressed formats; Basis (http://www.binomial.info/ and https://github.com/BinomialLLC/basis_universal); New compressed GPU formats; glTF (https://www.khronos.org/gltf/); ASTC - Adaptive Scalable Texture Compression
  • 16. Smaller Apps, Geometry & Animation: Highly format dependent; Strip unneeded data (tangents, binormals, extra UVs, …); Lossy animation compression; Compress using a generic algorithm
  • 17. Smaller Apps, Sound and Music: Use lossy compression (MP3, Ogg/Vorbis, BINKA, …); Depends on audio platform; Check back with provider; Consider mono for music
  • 18. Smaller Apps, Config, Settings, Loca, …: HTML, JSON, XML, …; Human readable → low entropy; Strip whitespace and comments; Brotli is optimized for these; Consider binary formats, e.g. MsgPack, ProtoBuffers, Binary XML, BSON, …; Consider creating your own format
  • 19. Smaller Apps, Further complications: Certain files have fixed formats (app icons, splash screens, …); Exe is encrypted / signed; Consider interpreted code; Only workarounds are possible…; Lobby platform owners?
  • 21. Smaller Downloads: HTTP is usually a must (CDN); HTTP 1.1 has compression built in!; Likely already available to you; Only GZIP widely supported; Google is pushing Brotli!; Make sure it’s turned on!; Content-Encoding: br; Accept-Encoding: br, gzip
  • 22. Smaller Downloads: HTTP compression is not optimal!; Data is rarely changed; Compression time is not relevant; Use strongest compression available; Don’t forget to turn off HTTP compression
  • 23. Smaller Downloads, Compression Options: Free: LZMA, XZ, LZHAM; Commercial: Oodle (Kraken, Leviathan, …); Slow to very slow compression; Very high compression ratios; Slow to fast decompression
  • 25. Smaller Downloads, General Hints: Consider keeping files compressed locally; HTTP request delays and limits; Few big files > many small files; Use parallel downloads, if possible; Don’t forget about decompression time
  • 27. Less Network Traffic, Data treatment options: Separate static from dynamic data; Transfer static data once (or never), e.g. replace strings with IDs; Use binary data formats; Ditch HTTP, Base64 re-adds ~25% (every 3 bytes become 4 characters); Use TCP/UDP, WebSocket instead; Per packet vs. stream compression
  • 28. Less Network Traffic, Fast compression options: Free: LZ4, Density; Commercial: LZO, Oodle (Selkie, LZB16); Much (!) faster than GZIP; Lower to equal compression ratio
  • 29. Less Network Traffic, Strong compression options: Free: ZStd, BROTLI; Commercial: Oodle (Mermaid); Faster decompression speed; Slower to equal compression speed; Equal to higher compression ratio
  • 31. Less Network Traffic, Teh Future: HTTP/2 & 3 will be binary protocols; Shared dictionaries; SDCH or home made (e.g. using ZStd); Brotli has a generic dictionary built in
  • 33. Conclusions: Take care of your data from day 1; There is more than Deflate / Zlib; Smaller data makes people happy!
  • 34. Resources: Yann Collet, blog (http://fastcompression.blogspot.com/); LZ4 (http://www.lz4.org/); ZStd (http://www.zstd.net/); Oodle, official (http://www.radgametools.com/oodle.htm); Charles Bloom (http://cbloomrants.blogspot.com/); Fabian Giesen (https://fgiesen.wordpress.com/)
  • 35. Resources: BROTLI standard (https://www.ietf.org/rfc/rfc7932.txt); BROTLI source (https://github.com/google/brotli); Misc: Rich Geldreich, LZHAM (http://richg42.blogspot.com/); Crunch (https://github.com/BinomialLLC/crunch); Basis (https://github.com/binomialLLC/basis_universal); LZO (http://www.oberhumer.com/); 7z / LZMA / XZ (http://www.7-zip.org/); Density (https://github.com/centaurean/density)
  • 36. roborodent: Dietmar Hauser, Programmer. Dietmar Hauser | roborodent e.U. | 2020. Software Solutions | Creative Consulting. https://www.roborodent.com @rattenhirn dietmar.hauser@roborodent.com https://slideshare.net/DietmarHauser https://fb.me/roborodent https://github.com/rattenhirn/ https://www.linkedin.com/in/rattenhirn/
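
Slides 22, 23 and 33 argue for compressing download data offline with something stronger than Deflate. The following is a minimal sketch of such a comparison, assuming only Python's standard library (`zlib` for Deflate, `lzma` for LZMA in an .xz container). The function name `compare_compressors` and the sample payload are made up for illustration, and actual ratios depend entirely on your own data.

```python
import lzma
import zlib


def compare_compressors(data: bytes) -> dict:
    """Return the size in bytes of `data` raw, Deflate-compressed and LZMA-compressed."""
    # Deflate at its highest level, as zlib / gzip / .zip would use it.
    deflated = zlib.compress(data, 9)

    # LZMA at the library's default preset. Higher presets, e.g.
    # preset=9 | lzma.PRESET_EXTREME, trade more time and memory for size.
    xz = lzma.compress(data)

    # Both codecs are lossless, so the data must survive a round trip.
    assert zlib.decompress(deflated) == data
    assert lzma.decompress(xz) == data

    return {"original": len(data), "deflate": len(deflated), "lzma": len(xz)}


if __name__ == "__main__":
    # Hypothetical payload: a repetitive, human-readable config file.
    sample = b'{"id": 12345, "name": "player", "score": 0}\n' * 2000
    for name, size in compare_compressors(sample).items():
        print(f"{name:>8}: {size} bytes")
```

Running the same function over real build artifacts (textures, config, packaged archives) is a quick way to see whether switching compressors is worth the effort before committing to a format.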

Editor's Notes

  1. Why should you care? Reduced bandwidth benefits both you and your customers.
  2. Platform providers pay highly discounted bulk rates. You have likely already heard of the CPU / memory gap: CPUs got over 10,000 times faster, memory only 10 times. In addition, more CPU cores are being added, and they compete for memory, so the chance of idling CPUs is high. What might we use those idle CPUs for? I have “invented” a second gap, the bandwidth / fidelity gap: VR, 4K, high frame rates, MMO, … Pixel counts: 720p -> 0.9 MP, 1080p -> 2 MP, 2160p -> 8.3 MP.
  3. Now that I have hopefully made my case, let’s review the basics. So it is literally “shrinking data”.
  4. Two things you probably already know, but I’ll recap them anyway. Lossy: first reduce entropy, then apply lossless compression. I’ll mostly talk about lossless compression. Rule of thumb: wherever human senses are involved, lossy compression can be used.
  5. Now that we know what it is, let’s look at roughly how it works by briefly reviewing its history. Claude Shannon would’ve been 100 years old in 2016. Entropy: don’t mix it up with physical and chemical entropy. H(X) = entropy in “shannons”; Pr(X=1) = probability that the coin will land on heads. In this case, entropy == number of bits needed to store one toss. If I want to store the outcome of 100 coin tosses, I need at least 100 bits at H(X) == 1, but only at least 50 bits at H(X) == 0.5. So the more predictable data is, the better it can be compressed.
  6. David A. Huffman. Not the first prefix code, but the best at the time. “Universal codes”: prefix codes to use when the data is not known.
  7. We jump forward 25 years and skip over arithmetic and range coding. Abraham Lempel and Jacob Ziv; IEEE Milestone 2004. They have contributed more to the efficient storage of cat images than anyone else. Notable LZ-family members: Originals: LZ77, LZ78. Well known: LZW (1984) -> GIF. Modern: LZ4, LZO, LZMA, Oodle, …
  8. This is where, for many people, the history of compression ends. As we’ll see, it’s the default in a lot of places. Zlib is considered “good enough”. Research hasn’t stopped since then: some theoretical progress, a LOT of implementation improvements.
  9. Our first use case (of three, for the impatient).
  10. A jarring example: FlappyBird.apk -> 895 KiB, TappyChicken.apk -> 26,408 KiB.
  11. Deliciously named IPA format. .apk: data is kept in the archive, code is extracted. .ipa: everything is extracted. .appx: nothing is extracted. DEFLATE again; remember, 29 years old. LZMA / 7z would be 20-30% smaller, for example: FlappyBird.apk -> 895 KiB -> 604 KiB ~= 300 KiB saved; TappyChicken.apk -> 26,408 KiB -> 17,793 KiB ~= 8.4 MiB saved!! It’s public domain and free!
  12. Different kinds of data should be treated differently. JPEG is lossy and has superior compression rates; an alpha channel needs to be stored separately if needed. PNGs output from graphics packages are not optimal: Photoshop adds random data to PNGs, and PNG uses Deflate to compress lines. Run them through an optimizer like pngcrush, TinyPNG or pngquant, and consider using the palette feature. Compressed textures save disk space, memory and GPU bandwidth. ETC2 is available on all OpenGL ES 3.0 devices. Texture compression is lossy and has a fixed compression ratio, so it’s worth compressing the textures again using a generic algorithm. Custom image format: save raw pixel data in the desired pixel format (including compressed ones), add the required metadata (format, height, width), then compress.
  13. Uncompressed: 280 MiB; original: 85 MiB; PNG: 41 MiB (less than half!); JPEG 80: 8 MiB; ETC2: 106 MiB; ETC2 compressed: 24 – 19 MiB. More about the compressors used later.
  14. Crunch is DXT5 only. Basis is only available when licensed. glTF is helped by the people behind Binomial.
  15. BINKA comes with the Miles Sound System and Bink.
  16. Omitting whitespace and comments is lossy compression; JavaScript libraries use this technique extensively. Conversion wastes CPU / memory. A binary format moves compression to creation time. Use a generic compressor; BROTLI is aimed at text.
  17. Certain files need to be in certain formats, e.g. the app startup image, app icons, ... need to be 32-bit PNG even though they have no transparency. Make them as simple as possible (large monochrome patches). The executable is encrypted before compression, so there is not much one can do except keeping the executable small. Consider using interpreted code: it can be compressed and can also be updated without a certification pass. Lobby platform owners to change this and/or provide options.
  18. Our second use case (of three, for the impatient).
  19. The good news: What is GZIP? Deflate! HTTP is used a lot, for some good and many bad reasons.
  20. Initial download, patches, DLCs, updates, …
  21. Silesia corpus, all compressors on max compression setting. XZ == LZMA. LZHAM is missing. Kraken blows everything out of the water; Leviathan (also Oodle) is even better. Free: Zstd or LZMA, depending on whether speed or ratio is more important. Compression times are not included because we don’t care: up to 10 minutes for 200 MiB.
  22. If possible, store files compressed locally as well: it may help with loading times if the local transfer rate is low, and it makes users happy, especially on mobile devices. Data flow considerations: HTTP requests will have some delay before starting, especially on CDNs (due to redirections and back-end stuff). How to cope: run as many parallel requests as possible, but per RFC, servers are not obliged to service more than 2 at a time, and proxies or even the platform may also be a limiting factor. Decompression takes some time as well, especially on weak-CPU (mobile) platforms: try to parallelize it with the download, and keep an eye on memory consumption.
  23. Data transfers in online games.
  24. Data separation also improves cache friendliness and reduces memory pressure.
  25. Two ways to improve compression, number 1. Links at the end.
  26. Number 2. BROTLI is designed to handle text well; the others are general purpose.
  27. Here is how this plays out. Shorter bar is better. Silesia corpus (“Sai-Lee-Sha”) =~ 200 MiB of representative data. Brotli, LZO and Density are missing. Brotli is comparable to Zstd and better on text (20% improvement over Deflate). All compressors are on their “default” setting, which should be the best trade-off between speed and ratio. Max compression time is 2 minutes for 200 MiB. We’ll look at even stronger compressors later.
  28. SDCH -> Shared Dictionary Compression for HTTP. Pre-shared dictionaries are a generic approach to separate static from dynamic data. The BROTLI dictionary is in Appendix A of the BROTLI standard (RFC 7932).
  29. Only useful for small packages and non-streaming connections.
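
The coin-toss reasoning in note 5 can be checked numerically. This is a small sketch, assuming the standard binary entropy formula H(X) = -p·log2(p) - (1-p)·log2(1-p) with p = Pr(X=1); the helper names `binary_entropy` and `min_bits` are illustrative and not from the talk.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits per toss, of a coin that lands heads with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if p in (0.0, 1.0):
        return 0.0  # a fully predictable coin carries no information
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def min_bits(tosses: int, p: float) -> float:
    """Lower bound on the bits needed to store `tosses` independent outcomes."""
    return tosses * binary_entropy(p)


if __name__ == "__main__":
    # A fair coin has H(X) == 1, so 100 tosses need at least 100 bits.
    print(min_bits(100, 0.5))
    # A coin with roughly 11% heads has H(X) close to 0.5, so about 50 bits suffice.
    print(min_bits(100, 0.11))
```

The second case is the "more predictable data compresses better" point from the note: the outcomes are still random, but skewed enough that an ideal coder needs only about half a bit per toss.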