Using Content Delivery Networks
with Drupal
TriDUG – March 2015
What are CDNs
A content delivery network or content distribution
network (CDN) is a large distributed system of
servers deployed in multiple data centers across the
Internet. The goal of a CDN is to serve content to
end users with high availability and high
performance.
Traditional Delivery
The server in the image
represents either a
single server or a single
data center with a proxy
front end (e.g. Varnish /
Nginx)
CDN Delivery
These servers represent
different data centers
that can be located in
different areas of the
world.
Clients get served
information from the
best site.
Why Use a CDN
● Increase the number of users that can access the site
without degradation.
● Ensure a quality user experience by having all or parts
of a page loaded from servers close to them
● Lower the impact of users accessing large files
CDN Basics
● The term origin refers to the source server
● CDN cluster instances are defined by a special DNS
host name, e.g. <dist id>.cloudfront.net
● The CDN's DNS entry will be routed through an
(expensive) load balancing routing system that
determines the best server to service the request.
● DNS settings and URL rewriting are used to cause
clients to get the information from the cluster.
● Files cached on the CDN all have “expiration” times
that can be set/controlled in various manners.
Anatomy of a CDN Request
● Client browser requests a CDN cached file via an
embedded URL or just entering the URL.
● The CDN's load balancing routers will determine the CDN
server to use based on the client's location, etc.
● The CDN server selected looks to see if this file exists.
● If the file exists, it checks the “expiration” date.
● If the file exists and has not expired, it returns the cached
version. The origin server does no work.
● Otherwise, if possible, the CDN will ask the origin server
for a fresh copy of the file and send it to the client.
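The request flow above can be sketched as a toy edge cache. This is only an illustration in Python, not how any particular CDN is implemented; the names EdgeCache and fetch_from_origin and the single fixed TTL are assumptions made for the example.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one CDN edge server with per-file expiration."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes],
                 ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin  # hypothetical call to the origin server
        self._ttl = ttl_seconds          # assumed: one TTL for every file
        self._clock = clock
        self._store: Dict[str, Tuple[bytes, float]] = {}  # url -> (body, expires_at)

    def get(self, url: str) -> Tuple[bytes, str]:
        """Return (body, "HIT" or "MISS") for the requested URL."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[1]:
            # File exists and has not expired: the origin does no work.
            return entry[0], "HIT"
        # Missing or expired: ask the origin for a fresh copy and cache it.
        body = self._fetch(url)
        self._store[url] = (body, now + self._ttl)
        return body, "MISS"
```

A real edge server takes the lifetime from the origin's response headers rather than a fixed value, which is why the expires settings on the origin matter.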
CDN Services
● Akamai Technologies
● Amazon CloudFront
● Windows Azure CDN
● EdgeCast Networks
● Rackspace Cloud Files
● Vimeo
● YouTube
● ...and many more
Selecting a CDN Service
● Depends on your needs, e.g. media only, web only,
and the like.
● Regions you need served.
● Depends on the client's price range and quality of
service needs.
● HTTPS support / costs
(Note: Google and other search engines are starting to
give preference to sites that are available via HTTPS)
Common ways CDNs
are used with Drupal
● Media and large file delivery (e.g. YouTube, Vimeo,
and others)
● Static file delivery (CSS, images, JS, and the like)
● Full anonymous site caching
Media Delivery
● Site DNS points to origin
● Content points to CDN DNS
when needed
● Files are added via the CDN's admin
interface
Pros
● Large files delivered efficiently
● Normal Drupal behavior
Cons
● CDN delivered files may have to
be manually set up/managed.
Static File Delivery
● Site DNS points to origin
● CDN module can automatically
rewrite static URLs to CDN
DNS
● Files are automatically cached
Pros
● Server load reduced
● Set and (almost) forget
Cons
● There can be a delay updating files
that are REPLACED.
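The URL rewriting idea can be sketched as follows. This is not the CDN module's code; rewrite_static_url, the extension list, and the example host name are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Extensions treated as static files; an assumed list for this example.
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".gif", ".svg", ".woff")


def rewrite_static_url(url: str, cdn_host: str) -> str:
    """Point static file URLs at the CDN host; leave everything else alone."""
    parts = urlsplit(url)
    if not parts.path.lower().endswith(STATIC_EXTENSIONS):
        return url  # Drupal pages keep going to the origin.
    # Keep the path and query string; only the host (and a missing scheme) change.
    return urlunsplit((parts.scheme or "https", cdn_host,
                       parts.path, parts.query, parts.fragment))
```

For example, rewrite_static_url("/misc/drupal.js?v=7", "dexample.cloudfront.net") keeps the query string, so a changed version parameter is seen by the CDN as a different file.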
Full Site Delivery
● Site DNS points to CDN cluster
● No special modules like CDN
needed, files are automatically
added to the cache.
Pros
● Supports most clients with least
impact on origin
Cons
● More complex set up
● Need to define processes to allow
content managers to refresh
updated content
Challenges to using CDNs and Drupal
● CDNs serve cached content based on URLs
● User based Drupal sites can have different content
displayed using the same URL, e.g. /user
● Content editors want to see changes immediately
and not wait for cache to refresh
● Network managers want cache to last a long time to
lower server load
Selecting a CDN mode
A quick rule of thumb for deciding between Full site
delivery and Static file delivery is:
● Does your site support individual users?
Yes – Use Static file delivery
No – Use Full site delivery
Note: CDNs also have various settings that may let
you create a hybrid site, e.g. deliver certain areas via
CDN but let Cart or Forum or... areas be static only.
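One way to picture a hybrid setup is a list of path patterns whose pages always go to the origin. The sketch below is only a model of that decision; the patterns and the function name are hypothetical, and a real CDN expresses this through its own behavior or rule settings.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns whose pages differ per user and must not be cached.
BYPASS_PATTERNS = ("/cart*", "/forum*", "/user*")


def cdn_may_cache(path: str, patterns=BYPASS_PATTERNS) -> bool:
    """Return False for paths that should always be served by the origin."""
    # fnmatchcase gives the same case-sensitive result on every platform.
    return not any(fnmatchcase(path, pattern) for pattern in patterns)
```

Prefix patterns like these are deliberately coarse: "/user*" also matches a page such as "/users-guide", so the list needs review against the site's real paths.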
Some Useful Modules
● CDN is useful for setting up static file services
http://drupal.org/project/cdn
● CloudFront Refresh (my module :) )
http://drupal.org/sandbox/cgmonroe/2454357
● Advanced Aggregation – Does a better job of CSS
and JavaScript aggregation
http://drupal.org/project/advagg
A Case Study
● A college has a coach who loves to set sports
records... like a player scoring 130+ points in a game,
which leads to massive load on their main web site.
● The communications department is very proactive in
updating the content and wants visitors to see things
immediately
● The main website is accessed by anonymous users
with the exception of various content managers
● They want to support HTTPS to get the “Google”
ranking boost.
Strategy Used
● Use CloudFront CDN service as a full site delivery
service.
● Use Server Name Indication certificates to allow
HTTPS on multiple DNS names
● Use “origin” DNS entry to bypass CDN for content
editors
● Write a custom module to allow content to easily be
refreshed on the CDN cluster
Set up CloudFront Distro
Set Up Application AWS User
Set up DNS
● The www.college.edu site is set up with a CNAME
entry that points to the cloudfront.net DNS entry
● The “edit” site uses origin.college.edu and points to
the server's IP address
Set up the Certificates
● Get an SNI SSL certificate with all the DNS names
you want the site to be known as, e.g.
origin.college.edu and www.college.edu
● Install the certificate on the Amazon CF distribution
(see Amazon docs)
● Install the certificate on the origin server
.htaccess Setup
To set a specific Expires header / time, look for the section like this and modify as needed.
NOTE: The expire time of Drupal pages is based on the page cache time setting in
Configuration > Development > Performance.
# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
  # Enable expirations.
  ExpiresActive On
  # Short expires for testing.
  ExpiresDefault "access plus 1 minutes"
  <FilesMatch \.php$>
    # Do not allow PHP scripts to be cached unless they explicitly send cache
    # headers themselves. Otherwise all scripts would have to overwrite the
    # headers set by mod_expires if they want another caching behavior. This may
    # fail if an error occurs early in the bootstrap process, and it may cause
    # problems if a non-Drupal PHP file is installed in a subdirectory.
    ExpiresActive Off
  </FilesMatch>
</IfModule>
.htaccess Setup
Redirect logins to origin site:
  # Redirect logins to non-CDN site
  RewriteCond %{HTTP_HOST} !^origin\.college\.edu [NC]
  RewriteRule ^user/login$ https://origin.college.edu/user/login [R=301,L]
  RewriteCond %{HTTP_HOST} !^origin\.college\.edu [NC]
  RewriteRule ^user$ https://origin.college.edu/user [R=301,L]
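To reason about (or unit test) what these rules do, the same decision can be modelled in a few lines of Python. This is a sketch of the logic only, not something Apache runs; login_redirect and the constants are names invented for the example.

```python
from typing import Optional

ORIGIN_HOST = "origin.college.edu"
LOGIN_PATHS = {"user", "user/login"}


def login_redirect(host: str, path: str) -> Optional[str]:
    """Return the redirect target for a request, or None if it is served as-is."""
    if host.lower().startswith(ORIGIN_HOST):
        return None  # Already on the origin host, so the RewriteCond fails.
    # In .htaccess context the pattern is matched without the leading slash.
    relative = path.lstrip("/")
    if relative in LOGIN_PATHS:
        return "https://%s/%s" % (ORIGIN_HOST, relative)
    return None
```

Only the two exact paths are redirected; any other path, including other paths under /user, is still served through the CDN host.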
.htaccess Setup
If you use custom fonts, you will need to prevent CORS
security errors; use the following. NOTE: This needs to be
tweaked if you want to limit access to specific sites.
<IfModule mod_headers.c>
   Header set Access-Control-Allow-Origin "*"
#  SetEnvIf Origin "^(.*\.college\.edu)$" ORIGIN_SUB_DOMAIN=$1
#  <FilesMatch "\.woff$">
#    Header set Access-Control-Allow-Origin "%{ORIGIN_SUB_DOMAIN}e" env=ORIGIN_SUB_DOMAIN
#  </FilesMatch>
</IfModule>
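The "limit to specific sites" variant comes down to echoing the request's Origin header only when it matches an allowed pattern. The Python sketch below shows that check in isolation; the pattern and the function name are assumptions for illustration, not part of the Apache configuration.

```python
import re
from typing import Optional

# Accept http(s) origins on college.edu or any of its subdomains.
ALLOWED_ORIGIN = re.compile(r"^https?://([a-z0-9-]+\.)*college\.edu$", re.IGNORECASE)


def allow_origin_header(origin: Optional[str]) -> Optional[str]:
    """Return the Access-Control-Allow-Origin value, or None to omit the header."""
    if origin and ALLOWED_ORIGIN.match(origin):
        return origin  # Echo the specific origin instead of "*".
    return None
```

When the header value depends on the request's Origin, the response should also carry "Vary: Origin" so that a cache does not hand one site's answer to another.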
Settings.php Setup
The following code lets the CDN support HTTPS. It needs to be in
the settings.php file.
if (isset($_SERVER['HTTP_CLOUDFRONT_FORWARDED_PROTO']) &&
     $_SERVER['HTTP_CLOUDFRONT_FORWARDED_PROTO'] == 'https') {
  $_SERVER['HTTPS'] = 'on';
  $_SERVER['HTTP_X_FORWARDED_PROTO'] = 'https';
}
// The following are only needed if the 'non-CDN' site can be
// accessed by more than 1 host name, e.g., initially an internal
// DNS entry and then moved to a client DNS entry.
if (isset($_SERVER['HTTP_HOST']) &&
    $_SERVER['HTTP_HOST'] == 'college-origin.longsight.com') {
  $_SERVER['HTTP_HOST'] = 'origin.college.edu';
}
Set up CloudFront Refresh Module
● Get the new CloudFront Refresh module from its sandbox:
http://drupal.org/sandbox/cgmonroe/2454357
● Get dependencies:
http://drupal.org/project/libraries (if needed)
http://drupal.org/project/awssdk
● Follow the CloudFront Refresh install instructions
CloudFront Refresh
● Tracks updated nodes and
sends invalidate request
when site cache is cleared
● Allows manually entering
URL to refresh non-HTML
files
● Status page to track refresh
requests
● Some other code to improve
CDN hit rates
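For readers unfamiliar with CloudFront invalidations, the request body is a small structure listing the paths to refresh plus a unique caller reference. The sketch below builds that structure in Python; it is not the module's code (the module is PHP and uses the AWS SDK for PHP), and build_invalidation_batch is a hypothetical helper.

```python
import time
from typing import Dict, Iterable, Optional


def build_invalidation_batch(paths: Iterable[str],
                             caller_reference: Optional[str] = None) -> Dict:
    """Build the InvalidationBatch structure for a set of updated URL paths."""
    # CloudFront paths start with "/"; drop duplicates while keeping order.
    items = list(dict.fromkeys(p if p.startswith("/") else "/" + p for p in paths))
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # Any unique string works; a timestamp is used here as an assumption.
        "CallerReference": caller_reference or str(time.time()),
    }
```

With boto3 installed and credentials configured, a batch like this can be passed as the InvalidationBatch argument of the CloudFront client's create_invalidation call, together with the DistributionId of your own distribution.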
Results
● Entire site is serviced via CDN with low hit rate on
main server
● Fast response even under load
● Content editors can instantly see the changes they are
making because logins get redirected to the origin
site.
● CloudFront Refresh module makes it easy for
updates to be “pushed” out to CDN.
Some Troubleshooting/Testing Tips
● curl -I -L http://www.college.edu/<path>
This gets the headers for the path with expire time and other
information. For example, you can check the expire time on
pages coming from the origin server.
● Chrome / Firebug network load tab
This can be used to determine what is being loaded from
which server (or not being loaded).
● Think ahead.
CDNs are similar to DNS servers with propagation delays. If a
site is going to have major changes, set the origin expire time
to a low value a day or so before.
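When reading the headers that curl -I prints, the cache lifetime comes from Cache-Control max-age or, failing that, from Expires minus Date. The Python sketch below does that arithmetic for a dictionary of headers. It is a simplification: freshness_seconds is a made-up helper, and it ignores directives such as s-maxage, no-cache, and private.

```python
import re
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def freshness_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return how long a response may be cached, or None if it cannot be determined."""
    lowered = {k.lower(): v for k, v in headers.items()}
    # Cache-Control max-age takes precedence over Expires.
    match = re.search(r"max-age=(\d+)", lowered.get("cache-control", ""))
    if match:
        return int(match.group(1))
    if "expires" in lowered and "date" in lowered:
        try:
            expires = parsedate_to_datetime(lowered["expires"])
            date = parsedate_to_datetime(lowered["date"])
            # An Expires value in the past means "already stale".
            return max(0, int((expires - date).total_seconds()))
        except (TypeError, ValueError):
            return None  # Unparseable or mixed naive/aware dates.
    return None
```

A result of 0 on a page you expected to be cached usually means the origin is sending a past Expires date, which is worth checking against the page cache setting.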
Questions?
