PRESENTATION INFRASTRUCTURE
Dzmitry Markovich
SDCH - Shared Dictionary
Compression Over HTTP
LinkedIn Engineering, Traffic Team
FORTUNITY
Every day we use Google Search...
Request URL:
https://www.google.com/s?output=search...
accept-encoding: gzip, deflate, sdch
Response:
content-encoding: gzip
get-dictionary: /sdch/j_fzWU8F.dct
Bootstrapping
©2015 LinkedIn Corporation. All Rights Reserved.
LET'S CHECK
Without SDCH…
LET'S CHECK
…with SDCH
Looks like there is no real difference
IT STILL LOOKS INTERESTING - SO, MOVING ON
get-dictionary: /sdch/j_fzWU8F.dct
Dictionary?
OK, NOW IT GETS REALLY INTERESTING
The SDCH protocol was proposed in 2008 (Velocity 2008 Web Performance and Operations Conference).
The goal of the protocol is to compress HTTP responses and improve performance for users in regions with slow internet connections.
LOTS OF USERS STILL SUFFER FROM SLOW NETWORKS, FOR EXAMPLE IN DEVELOPING COUNTRIES.
• Data compression
• Gzip works great for individual responses
• What about common data shared by a group of pages (inter-response redundancy), or pages that change frequently but only slightly?
• Transmit the data that is common to each response only once.
• Thereafter, send only the parts of the response that differ.
Reduce data transmission time
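The effect of sending shared data only once can be tried out with the preset-dictionary feature of Python's zlib module. This is a stand-in, not SDCH or VCDIFF: it only shows that a compressor primed with shared content produces a smaller output for a page that reuses that content. The SHARED bytes and the sample page are made up for this sketch.

```python
import zlib

# Hypothetical boilerplate that every page of a site repeats.
SHARED = (b'<html><head><link rel="stylesheet" href="/static/site.css"></head>'
          b'<body><div class="nav">Home | Profile | Jobs</div>')


def compress(data: bytes, dictionary: bytes = b"") -> bytes:
    """Deflate data, optionally priming the compressor with a shared dictionary."""
    c = zlib.compressobj(zdict=dictionary) if dictionary else zlib.compressobj()
    return c.compress(data) + c.flush()


def decompress(blob: bytes, dictionary: bytes = b"") -> bytes:
    """Inverse of compress(); needs the same dictionary that was used to compress."""
    d = zlib.decompressobj(zdict=dictionary) if dictionary else zlib.decompressobj()
    return d.decompress(blob) + d.flush()


page = SHARED + b"<p>Hello, member 42</p></body></html>"
plain = compress(page)            # like gzip on an individual response
primed = compress(page, SHARED)   # shared part is referenced, not resent
print(len(page), len(plain), len(primed))
```

The receiver can only decode the primed output if it already holds the same dictionary, which is exactly the bookkeeping problem the protocol has to solve.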
NEGOTIATIONS
browser: Hi! I need to GET /page.html. And BTW, I support SDCH.
server: Hi! Here is the page. And BTW, here’s the URL to get the SDCH dictionary!
… (browser downloads the dictionary in the background) …
browser: Hi! I need to GET /another.page.html. And BTW, I support SDCH; here is my client hash (see below).
server: Hi! Here is the page! And BTW, since your dictionary is up to date, the page is SDCH encoded!
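The conversation above can be sketched as a small server-side decision function. This is a minimal illustration under stated assumptions, not the logic of any real server: the function name, its parameters, and the plain dict used for headers are hypothetical, and the gzip layer that is normally applied on top is left out.

```python
def sdch_response_headers(request_headers: dict, dictionary_url: str,
                          client_hash: str) -> dict:
    """Pick the SDCH-related response headers for one request.

    request_headers maps lower-cased header names to values.
    dictionary_url and client_hash describe the one dictionary this
    (hypothetical) server currently offers.
    """
    accepted = [e.strip().lower()
                for e in request_headers.get("accept-encoding", "").split(",")]
    if "sdch" not in accepted:
        return {}  # client does not speak SDCH: respond as usual

    available = [h.strip()
                 for h in request_headers.get("avail-dictionary", "").split(",")]
    if client_hash in available:
        # The client already holds our dictionary: the body can be SDCH encoded.
        return {"content-encoding": "sdch"}

    # The client supports SDCH but has no (current) dictionary: advertise it.
    return {"get-dictionary": dictionary_url}


first = sdch_response_headers({"accept-encoding": "gzip, deflate, sdch"},
                              "/sdch/j_fzWU8F.dct", "j_fzWU8F")
second = sdch_response_headers({"accept-encoding": "gzip, deflate, sdch",
                                "avail-dictionary": "j_fzWU8F"},
                               "/sdch/j_fzWU8F.dct", "j_fzWU8F")
print(first, second)
```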
Request URL:
accept-encoding: gzip, deflate, sdch
avail-dictionary: j_fzWU8F
Response:
content-encoding: gzip
content-encoding: sdch
NORMAL SDCH REQUEST
WHY NOT RFC 3229 - DELTA ENCODING IN HTTP?
• Only applicable to the same URL
• Discourages aggressive caching
• No benefit for similar pages that don’t share a URL
STEPS ON WHAT NEEDS TO BE DONE TO CHECK THE BENEFITS FOR LINKEDIN
1. Generate dictionaries for static content
2. Advertise the dictionaries via HTTP response headers
3. Fetch and store dictionaries
4. On the next request, notify the server about available dictionaries
5. SDCH encoding against a valid dictionary
6. SDCH decoding against a valid dictionary
LET'S FIGURE OUT WHAT WE SHOULD PUT INTO THE DICTIONARY
The dictionary is publicly accessible, so let's start with static CSS and JS files
LET'S BUILD THE DICTIONARY
FemtoZip: an open-source library
https://github.com/gtoubassi/femtozip
FemtoZip outputs a dictionary that can be used for SDCH with minor modifications. You need to prepend the SDCH dictionary headers to it so that the browser knows on which domain the dictionary can be used and under which paths it is valid.
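Prepending the headers could look roughly like the sketch below. The function name and the sample values are hypothetical, and the payload stands for whatever bytes the dictionary builder produced; only the "domain" and "path" headers are written, each terminated by a newline and followed by one empty line before the payload.

```python
def build_sdch_dictionary(payload: bytes, domain: str, path: str = "/") -> bytes:
    """Return an SDCH dictionary: metadata headers, a blank line, then the payload."""
    headers = f"domain: {domain}\npath: {path}\n"
    # The extra "\n" is the empty line that ends the metadata section.
    return headers.encode("ascii") + b"\n" + payload


# Hypothetical payload; in practice this would be the generated dictionary file.
sample = build_sdch_dictionary(b"body{margin:0}", ".example.com", "/css")
print(sample)
```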
DICTIONARY HASH
HASH IN THE RESPONSE
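The two hash slides are images, so the following is a sketch based on my reading of the SDCH draft rather than on the slides: the dictionary (headers included) is hashed with SHA-256, the first 48 bits become the client hash sent in avail-dictionary, and the next 48 bits become the server hash that prefixes an encoded response. Both are written in URL-safe base64, which gives 8 characters each, the same length as j_fzWU8F. Treat the exact bit ranges as an assumption to verify against the specification.

```python
import base64
import hashlib
from typing import Tuple


def sdch_hashes(dictionary: bytes) -> Tuple[str, str]:
    """Return (client_hash, server_hash) for a complete dictionary file."""
    digest = hashlib.sha256(dictionary).digest()
    # 6 bytes = 48 bits, which base64 encodes to exactly 8 characters (no padding).
    client = base64.urlsafe_b64encode(digest[0:6]).decode("ascii")
    server = base64.urlsafe_b64encode(digest[6:12]).decode("ascii")
    return client, server


client_hash, server_hash = sdch_hashes(b"domain: .example.com\npath: /\n\npayload")
print(client_hash, server_hash)
```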
DICTIONARY
Metadata
dictionary-metadata = 1#dictionary-header "\n"
dictionary-header = "domain" ":" value "\n"
| "path" ":" value "\n"
| "format-version" ":" value "\n"
| "max-age" ":" value "\n"
| "port" ":" <"> portlist <"> "\n"
portlist = 1#portnum
portnum = 1*DIGIT
Full dictionary
dictionary-definition = dictionary-metadata payload
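Read in the other direction, the grammar says that a dictionary is a run of "name: value" lines, an empty line, and then the payload. A minimal parser under that reading might look as follows; it is a simplification (the quoted port list is kept as a raw string, unknown headers are accepted) and the function name is hypothetical.

```python
def parse_sdch_dictionary(blob: bytes):
    """Split a dictionary into (metadata, payload) at the first empty line."""
    head, separator, payload = blob.partition(b"\n\n")
    if not separator:
        raise ValueError("no empty line terminating the metadata")
    metadata = {}
    for line in head.decode("ascii").split("\n"):
        name, colon, value = line.partition(":")
        if not colon:
            raise ValueError(f"malformed header line: {line!r}")
        metadata[name.strip().lower()] = value.strip()
    return metadata, payload


meta, body = parse_sdch_dictionary(
    b"domain: .example.com\npath: /css\nmax-age: 86400\n\nbody{margin:0}")
print(meta, body)
```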
CSS DICTIONARY EXAMPLE
One small problem - dictionary
generation takes time...
ATS PLUGIN
What it should do
• Check if the client supports SDCH
• Advertise a dictionary to the client
• Encode the response based on the dictionary
ENCODING
For this, Google selected the already standardized VCDIFF format.
VCDIFF is a format and an algorithm for delta encoding, described in RFC 3284.
OPEN-VCDIFF: a library that supports encoding/decoding for the VCDIFF (RFC 3284) format
http://code.google.com/p/open-vcdiff/
VCDIFF ENCODING
Replacement of the most common long strings with short instructions.
Output compactness:
The basic encoding format compactly represents compressed or delta files. Applications can further extend the basic encoding format with "secondary encoders" to achieve more compression.
Data portability:
The basic encoding format is free from machine byte order and word size issues. This allows data to be encoded on one machine and decoded on a different machine with different architecture.
Algorithm genericity:
The decoding algorithm is independent from string matching and windowing algorithms. This allows competition among implementations of the encoder while keeping the same decoder.
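To make "replacing long strings with short instructions" concrete, here is a toy delta coder that emits COPY and ADD instructions against a dictionary. It is a sketch of the idea only: it does not produce the RFC 3284 byte format, the greedy fixed-block matcher is my own simplification, and the CSS strings are invented. It also illustrates the last property above, since the decoder works no matter how the encoder found its matches.

```python
def encode(dictionary: bytes, target: bytes, block: int = 8):
    """Return a list of ("COPY", offset, length) and ("ADD", bytes) instructions."""
    # Index every block-sized substring of the dictionary by its first offset.
    index = {}
    for i in range(len(dictionary) - block + 1):
        index.setdefault(dictionary[i:i + block], i)

    ops, literal, pos = [], bytearray(), 0
    while pos < len(target):
        start = index.get(target[pos:pos + block])
        if start is None:
            literal.append(target[pos])  # no match here: keep the byte as a literal
            pos += 1
            continue
        # Extend the match forward while dictionary and target agree.
        length = block
        while (pos + length < len(target) and start + length < len(dictionary)
               and target[pos + length] == dictionary[start + length]):
            length += 1
        if literal:
            ops.append(("ADD", bytes(literal)))
            literal.clear()
        ops.append(("COPY", start, length))
        pos += length
    if literal:
        ops.append(("ADD", bytes(literal)))
    return ops


def decode(dictionary: bytes, ops) -> bytes:
    """Rebuild the target by replaying the instructions against the dictionary."""
    out = bytearray()
    for op in ops:
        if op[0] == "COPY":
            _, start, length = op
            out += dictionary[start:start + length]
        else:
            out += op[1]
    return bytes(out)


dictionary = b".button{color:#fff;background:#0077b5;border-radius:2px}"
target = b".nav .button{color:#fff;background:#0077b5;border-radius:2px}.x{}"
ops = encode(dictionary, target)
print(ops)
```

Here the 56-byte rule shared with the dictionary collapses into a single COPY instruction, and only the few bytes around it travel as literals.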
BENTLEY/MCILROY TECHNIQUE FOR FINDING
MATCHES BETWEEN THE SOURCE AND TARGET DATA
Input:  abcdefghijklmnopq<12345
Output: abcdefghijklmnopq<<12345

Input:  abcdefghijabcdefghij
Output: abcdefghij<0,10>

Input:  abcdefghijklmnopqrstuvwxijklmnopabcdefghqrstuvwxaaaaaaaaaaaaaaaaaaaaa
Output: abcdefghijklmnopqrstuvwx<8,8><0,8><16,8>a<0,20>

Compression      Bible      Bible+Bible
Input            4460056    8920112
gzip             1321495    2642389
com 50           4384403    4384414
com 20           3906771    3906782
com 50 | gzip    1318687    1318699
com 20 | gzip    1362413    1362422
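The notation in the Input/Output examples can be expanded with a few lines of code. This sketch assumes that `<start,length>` means "copy length characters beginning at absolute position start of the text decoded so far" and that `<<` stands for a literal `<`; under that reading the first two examples decode back to their inputs. The function name is hypothetical.

```python
import re

# Either an escaped "<" or a <start,length> back-reference.
_TOKEN = re.compile(r"<<|<(\d+),(\d+)>")


def expand(encoded: str) -> str:
    """Expand <start,length> references and "<<" escapes back into plain text."""
    out = []  # characters decoded so far
    pos = 0
    for match in _TOKEN.finditer(encoded):
        out.extend(encoded[pos:match.start()])  # literal text before the token
        if match.group(0) == "<<":
            out.append("<")
        else:
            start, length = int(match.group(1)), int(match.group(2))
            # Copy one character at a time so that a reference may overlap
            # the text it is currently producing.
            for i in range(length):
                out.append(out[start + i])
        pos = match.end()
    out.extend(encoded[pos:])
    return "".join(out)


print(expand("abcdefghij<0,10>"))
print(expand("abcdefghijklmnopq<<12345"))
```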
TYPICAL USAGE OF VCDIFF IS AS FOLLOWS (THE < AND > ARE FILE REDIRECT OPERATIONS, NOT OPTIONAL ARGUMENTS)
Encoding
vcdiff encode -dictionary file.dict < target_file > delta_file
Decoding
vcdiff decode -dictionary file.dict < delta_file > target
REAL LINKEDIN EXAMPLES
abook_remarketing_base_promo_en_US.css:
on disk: 4198 bytes
on wire: 809 bytes
registration_subs_upsell_en_US.css:
on disk: 9189 bytes
on wire: 3220 bytes
footer_en_US.css:
on disk: 1941 bytes
on wire: 1245 bytes
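Expressed as savings, the three files above differ a lot. The short calculation below uses only the byte counts from this slide; the helper name is made up for the sketch.

```python
# (bytes on disk, bytes on wire) as listed on the slide
sizes = {
    "abook_remarketing_base_promo_en_US.css": (4198, 809),
    "registration_subs_upsell_en_US.css": (9189, 3220),
    "footer_en_US.css": (1941, 1245),
}


def savings_percent(on_disk: int, on_wire: int) -> float:
    """Share of the original size that no longer has to be transmitted."""
    return round(100 * (1 - on_wire / on_disk), 1)


for name, (disk, wire) in sizes.items():
    print(f"{name}: {savings_percent(disk, wire)}% smaller on the wire")
```

The smallest file saves the least in this sample, which matches the observation that small files benefit little.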
COMPRESSION RATIO
% compression
Small files can
become bigger after
encoding
COMPRESSION RATIO
% compression
We removed files
with negative
compression
FOR BIGGER FILES WE SEE BETTER BENEFITS
WHAT TO DO WITH FILES THAT DO NOT
COMPRESS WELL?
CACHING, CDN
REQUIREMENTS:
the cache tier should be able to support Vary on custom headers
FOR SDCH IT IS
Vary: Avail-Dictionary
RESPONSE
content-encoding: gzip
content-encoding: sdch
vary: avail-dictionary
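Why the cache needs to honor Vary can be seen from how a cache key is formed. The sketch below is a simplified model, not the behavior of any particular CDN: the function name and the dict of lower-cased request headers are assumptions for illustration.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus every request header named in Vary."""
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    return (url,) + tuple((n, request_headers.get(n, "")) for n in names)


with_dict = cache_key("/a.css", {"avail-dictionary": "j_fzWU8F"}, "Avail-Dictionary")
without_dict = cache_key("/a.css", {}, "Avail-Dictionary")
print(with_dict)
print(without_dict)
```

Because the two keys differ, a client without the dictionary is never handed the SDCH-encoded copy that was cached for a client that has it.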
CACHING, CDN
REQUIREMENTS:
the cache tier should not do any gzip normalization of Accept-Encoding (Akamai)
accept-encoding: gzip,deflate,sdch
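The problem with normalization is easy to show. The first function below is a hypothetical model of what a normalizing cache tier might do (collapse the header to gzip or identity); it is not a description of any vendor's actual behavior, and q-values are ignored for brevity.

```python
def normalize_lossy(accept_encoding: str) -> str:
    """Hypothetical normalizer: reduce Accept-Encoding to "gzip" or "identity"."""
    tokens = [t.strip().lower() for t in accept_encoding.split(",")]
    return "gzip" if "gzip" in tokens else "identity"


def supports_sdch(accept_encoding: str) -> bool:
    """True if the header value still lists the sdch token."""
    return "sdch" in [t.strip().lower() for t in accept_encoding.split(",")]


original = "gzip,deflate,sdch"
print(supports_sdch(original), supports_sdch(normalize_lossy(original)))
```

Once the sdch token is dropped on the way to the origin, the server can no longer tell that the client supports the protocol.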
REBUILDING DICTIONARY
Deployment delays and a small cache hit ratio
After one month the dictionary is still accurate
HOW TO ADVERTISE A NEW DICTIONARY?
We simply return
get-dictionary: /sdch/j_fzWU8F.dct
and the browser fetches the dictionary in the background
SECURITY
1. Dictionary is hashed on the client and server
2. Dictionary is valid only for the specified domain and path
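The second point can be sketched as a scope check that a client might run before using a dictionary for a URL. This is a deliberately simplified model: the real matching rules in the SDCH draft are stricter, and the function name and sample hosts are hypothetical.

```python
from urllib.parse import urlsplit


def dictionary_applies(url: str, dict_domain: str, dict_path: str) -> bool:
    """Simplified check: host must match the domain, path must start with dict_path."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    domain = dict_domain.lower()
    if domain.startswith("."):
        # ".example.com" covers example.com and all of its subdomains.
        host_ok = host.endswith(domain) or host == domain[1:]
    else:
        host_ok = host == domain
    path = parts.path or "/"
    return host_ok and path.startswith(dict_path)


print(dictionary_applies("https://www.example.com/css/a.css", ".example.com", "/css"))
print(dictionary_applies("https://evil.test/css/a.css", ".example.com", "/css"))
```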
PROXY AND FIREWALL
• Distribution of bad content to the client
• No way to verify content on the fly
• Changes on the proxy might invalidate the whole response
DICTIONARY AND RESPONSE CAN BE "SAFE", BUT WHAT HAPPENS ONCE WE MERGE THEM TOGETHER?
SOLUTIONS FOR PROXIES AND FIREWALLS
• Remove the sdch value from the Accept-Encoding header :)
• Implement an SDCH client (expensive, not real-time)
SDCH encoding takes ~400 microseconds
SERVER LOAD
SDCH compression is
whitelisted for some
users at LinkedIn
RESULTS
• an additional 30% data compression on top of gzip
• only small files do not benefit from SDCH
• content download time decreases in regions with slow internet
• for bigger web portals this technology works much better
