Using Content Delivery Networks with Drupal

TriDUG – March 2015

What are CDNs
A content delivery network or content distribution network (CDN) is a large distributed system of servers deployed in multiple data centers across the Internet. The goal of a CDN is to serve content to end users with high availability and high performance.

Traditional Delivery
The server in the image represents either a single server or a single data center with a proxy front end (e.g. Varnish / Nginx).

CDN Delivery
These servers represent different data centers that can be located in different areas of the world.
Clients get served information from the best site.

Why Use a CDN
● Increase the number of users that can access the site without degradation.
● Ensure a quality user experience by having all or parts of a page loaded from servers close to them.
● Lower the impact of users accessing large files.

CDN Basics
● The term origin refers to the source server.
● CDN cluster instances are defined by a special DNS host name, e.g. <dist id>.cloudfront.net
● The CDN's DNS entry will be routed through an (expensive) load balancing routing system that determines the best server to service the request.
● DNS settings and URL rewriting are used to cause clients to get the information from the cluster.
● Files cached on the CDN all have “expiration” times that can be set/controlled in various manners.

Anatomy of a CDN Request
● The client browser requests a CDN cached file via an embedded URL or by the user just entering the URL.
● The CDN's load balancing routers will determine the CDN server to use based on the client's location, etc.
● The CDN server selected looks to see if this file exists.
● If the file exists, it checks the “expiration” date.
● If the file exists and has not expired, it returns the cached version. The origin server does no work.
● Otherwise, if possible, the CDN will ask the origin server for a fresh copy of the file and send it to the client.
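
The lookup steps above can be sketched as a toy model. This is only an illustration under assumptions: the `EdgeServer` class, the `fetch_from_origin` callback and the injectable clock are hypothetical names, not part of any CDN's API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class CachedFile:
    body: str
    expires_at: float  # absolute time (seconds) after which the copy is stale


class EdgeServer:
    """Toy model of one CDN edge server (hypothetical, for illustration)."""

    def __init__(self,
                 fetch_from_origin: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch_from_origin  # returns (body, ttl_seconds)
        self._clock = clock
        self._cache: Dict[str, CachedFile] = {}
        self.origin_requests = 0  # how often the origin had to do work

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._cache.get(url)
        # File exists and has not expired: serve the cached copy.
        if entry is not None and now < entry.expires_at:
            return entry.body
        # Missing or expired: ask the origin for a fresh copy and cache it.
        body, ttl = self._fetch(url)
        self.origin_requests += 1
        self._cache[url] = CachedFile(body, now + ttl)
        return body


# Usage with a fake clock so the expiry can be stepped by hand.
fake_now = [0.0]
edge = EdgeServer(lambda url: ("body of " + url, 60.0),
                  clock=lambda: fake_now[0])
edge.get("/style.css")   # first request goes to the origin
edge.get("/style.css")   # second request is served from the edge cache
fake_now[0] = 61.0
edge.get("/style.css")   # expired, so the origin is asked again
```

The `origin_requests` counter makes the benefit visible: only the first and the post-expiry request reach the origin.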

CDN Services
● Akamai Technologies
● Amazon CloudFront
● Windows Azure CDN
● EdgeCast Networks
● Rackspace Cloud Files
● Vimeo
● YouTube
● ...and many more

Selecting a CDN Service
● Depends on your needs, e.g. media only, web only, and the like.
● Regions you need served.
● Depends on the client's price range and quality of service needs.
● HTTPS support / costs
(Note: Google and other search engines are starting to give preference to sites that are available via HTTPS)

Common ways CDNs are used with Drupal
● Media and large file delivery (e.g. YouTube, Vimeo, and others)
● Static file delivery (CSS, images, JS, and the like)
● Full anonymous site caching

Media Delivery
● Site DNS points to origin
● Content points to CDN DNS when needed
● Files are added via the CDN's admin interface
Pros
● Large files delivered efficiently
● Normal Drupal behavior
Cons
● CDN delivered files may have to be manually set up/managed.

Static File Delivery
● Site DNS points to origin
● CDN module can automatically rewrite static URLs to CDN DNS
● Files are automatically cached
Pros
● Server load reduced
● Set and (almost) forget
Cons
● There can be a delay updating files that are REPLACED.

Full Site Delivery
● Site DNS points to CDN cluster
● No special modules like CDN needed; files are automatically added to the cache.
Pros
● Supports most clients with least impact on origin
Cons
● More complex setup
● Need to define processes to allow content managers to refresh updated content

Challenges to using CDNs and Drupal
● CDNs serve cached content based on URLs
● User-based Drupal sites can have different content displayed using the same URL, e.g. /user
● Content editors want to see changes immediately and not wait for the cache to refresh
● Network managers want the cache to last a long time to lower server load
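
The first two challenges can be shown with a small sketch. It assumes a cache keyed by URL alone; `make_url_keyed_cache` and `render_page` are hypothetical names used only for this illustration.

```python
from typing import Callable, Dict


def make_url_keyed_cache(
        render: Callable[[str, str], str]) -> Callable[[str, str], str]:
    """Return a getter that caches rendered pages by URL only."""
    cache: Dict[str, str] = {}

    def get(url: str, user: str) -> str:
        # The user is ignored when looking up the cache, just as a CDN
        # that keys on the URL would ignore who is logged in.
        if url not in cache:
            cache[url] = render(url, user)
        return cache[url]

    return get


def render_page(url: str, user: str) -> str:
    # Stand-in for Drupal building a per-user page.
    return f"{url} rendered for {user}"


get_page = make_url_keyed_cache(render_page)
first = get_page("/user", "alice")
second = get_page("/user", "bob")  # bob receives alice's cached page
```

This is why per-user paths such as /user either need to bypass the CDN or be served from a host that is not cached.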

Selecting a CDN mode
A quick rule of thumb for deciding between full site delivery and static file delivery is:
● Does your site support individual users?
Yes - Use static file delivery
No - Use full site delivery
Note: CDNs also have various settings that may let you create a hybrid site, e.g. deliver certain areas via CDN but let Cart or Forum or... areas be static only.

Some Useful Modules
● CDN is useful for setting up static file services
http://drupal.org/project/cdn
● CloudFront Refresh (my module :) )
http://drupal.org/sandbox/cgmonroe/2454357
● Advanced Aggregation: does a better job of CSS and JavaScript aggregation
http://drupal.org/project/advagg

A Case Study
● A college has a coach who loves to set sports records... like a player scoring 130+ points in a game, which leads to massive load on their main web site.
● The communications department is very proactive in updating the content and wants visitors to see things immediately.
● The main website is accessed by anonymous users, with the exception of various content managers.
● They want to support HTTPS to get the “Google” ranking boost.

Strategy Used
● Use the CloudFront CDN service as a full site delivery service.
● Use Server Name Indication certificates to allow HTTPS on multiple DNS names
● Use an “origin” DNS entry to bypass the CDN for content editors
● Write a custom module to allow content to easily be refreshed on the CDN cluster

Set up DNS
● The www.college.edu site is set up with a CNAME entry that points to the cloudfront.net DNS entry
● The “edit” site uses origin.college.edu and points to the server's IP address

Set up the Certificates
● Get an SNI SSL certificate with all the DNS names you want the site to be known as, e.g. origin.college.edu and www.college.edu
● Install the certificate on the Amazon CF Distribution (see Amazon docs)
● Install the certificate on the origin server

.htaccess Setup
To set a specific Expires header / time, look for the section like this and modify as needed.
NOTE: The expire time for Drupal pages is based on the page cache time setting in Configuration > Development > Performance.
# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
  # Enable expirations.
  ExpiresActive On
  # Short expires for testing.
  ExpiresDefault "access plus 1 minutes"
  <FilesMatch \.php$>
    # Do not allow PHP scripts to be cached unless they explicitly send cache
    # headers themselves. Otherwise all scripts would have to overwrite the
    # headers set by mod_expires if they want another caching behavior. This may
    # fail if an error occurs early in the bootstrap process, and it may cause
    # problems if a non-Drupal PHP file is installed in a subdirectory.
    ExpiresActive Off
  </FilesMatch>
</IfModule>

.htaccess Setup
Redirect logins to the origin site:
# Redirect logins to the non-CDN site
  RewriteCond %{HTTP_HOST} !^origin\.college\.edu [NC]
  RewriteRule ^user/login$ https://origin.college.edu/user/login [R=301,L]
  RewriteCond %{HTTP_HOST} !^origin\.college\.edu [NC]
  RewriteRule ^user$ https://origin.college.edu/user [R=301,L]
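
To check which requests the rules above would redirect, the same decision can be modeled outside Apache. This is a rough sketch of the rule logic only, not of mod_rewrite itself; `login_redirect` is a hypothetical helper name.

```python
import re
from typing import Optional

# Mirrors: RewriteCond %{HTTP_HOST} !^origin\.college\.edu [NC]
ORIGIN_HOST = re.compile(r"^origin\.college\.edu", re.IGNORECASE)

# Mirrors the two RewriteRule lines (patterns are anchored, so exact match).
LOGIN_TARGETS = {
    "user/login": "https://origin.college.edu/user/login",
    "user": "https://origin.college.edu/user",
}


def login_redirect(host: str, path: str) -> Optional[str]:
    """Return the redirect target, or None when no rule applies."""
    if ORIGIN_HOST.search(host):
        return None  # already on the origin site, nothing to do
    # In .htaccess context the leading slash is removed before matching.
    relative = path[1:] if path.startswith("/") else path
    return LOGIN_TARGETS.get(relative)


print(login_redirect("www.college.edu", "/user/login"))
print(login_redirect("origin.college.edu", "/user/login"))
```

Requests for any other path, or any request already addressed to the origin host, fall through untouched and are served from the CDN as usual.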

.htaccess Setup
If you use custom fonts, you will need to prevent CORS security errors. Use the following. NOTE: this needs to be tweaked if you want to limit access to specific sites.
<IfModule mod_headers.c>
   Header set Access-Control-Allow-Origin *
#  SetEnvIf Origin "^(.*\.college\.edu)$" ORIGIN_SUB_DOMAIN=$1
#  <FilesMatch "\.woff$">
#    Header set Access-Control-Allow-Origin "%{ORIGIN_SUB_DOMAIN}e" env=ORIGIN_SUB_DOMAIN
#  </FilesMatch>
</IfModule>
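
The commented-out, restricted variant above can be reasoned about with a small model. This is a sketch of the intended behavior only, assuming the pattern is meant to match the request's Origin header; `cors_headers` is a hypothetical function name.

```python
import re
from typing import Dict

# Mirrors: SetEnvIf Origin "^(.*\.college\.edu)$" ORIGIN_SUB_DOMAIN=$1
ALLOWED_ORIGIN = re.compile(r"^(.*\.college\.edu)$")


def cors_headers(request_origin: str, path: str) -> Dict[str, str]:
    """Return the CORS header to add for this request, if any."""
    # Mirrors: <FilesMatch "\.woff$">
    if not path.endswith(".woff"):
        return {}
    match = ALLOWED_ORIGIN.match(request_origin)
    if match is None:
        return {}  # origin is not a college.edu subdomain: send no header
    # Echo back the matching origin instead of the wildcard "*".
    return {"Access-Control-Allow-Origin": match.group(1)}


print(cors_headers("https://www.college.edu", "/fonts/site.woff"))
print(cors_headers("https://other.example.com", "/fonts/site.woff"))
```

Note that when the response varies by Origin, the CDN must also be configured to take the Origin header into account, or one site's cached header may be served to another.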

Settings.php Setup
The following code lets the CDN support HTTPS. It needs to be in the settings.php file.
if (isset($_SERVER['HTTP_CLOUDFRONT_FORWARDED_PROTO']) &&
     $_SERVER['HTTP_CLOUDFRONT_FORWARDED_PROTO'] == 'https') {
  $_SERVER['HTTPS'] = 'on';
  $_SERVER['HTTP_X_FORWARDED_PROTO'] = 'https';
}
// The following is only needed if the 'non-CDN' site can be
// accessed by more than one host name, e.g., initially an internal
// DNS entry and then moved to a client DNS entry.
if ( isset($_SERVER['HTTP_HOST']) &&
      $_SERVER['HTTP_HOST'] == 'collegeorigin.longsight.com') {
  $_SERVER['HTTP_HOST'] = 'origin.college.edu';
}

Set up the CloudFront Refresh Module
● Get the new CloudFront Refresh module from its sandbox:
http://drupal.org/sandbox/cgmonroe/2454357
● Get dependencies:
http://drupal.org/project/libraries (if needed)
http://drupal.org/project/awssdk
● Follow the CloudFront Refresh install instructions

CloudFront Refresh
● Tracks updated nodes and sends an invalidate request when the site cache is cleared
● Allows manually entering a URL to refresh non-HTML files
● Status page to track refresh requests
● Some other code to improve CDN hit rates
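
For readers unfamiliar with what an invalidate request contains, here is a minimal sketch of building one. It is not the module's code (the module is PHP and uses the AWS SDK); `build_invalidation_batch` is a hypothetical helper, and only the shape of the batch follows the CloudFront invalidation API.

```python
import time
from typing import Any, Dict, Iterable, Optional


def build_invalidation_batch(
        paths: Iterable[str],
        caller_reference: Optional[str] = None) -> Dict[str, Any]:
    """Build a CloudFront InvalidationBatch structure for the given paths."""
    # CloudFront expects each path to start with "/"; duplicates are dropped.
    unique = sorted({p if p.startswith("/") else "/" + p for p in paths})
    return {
        "Paths": {"Quantity": len(unique), "Items": unique},
        # A unique string per request; a timestamp is one simple choice.
        "CallerReference": caller_reference or f"refresh-{int(time.time())}",
    }


batch = build_invalidation_batch(["node/12", "/node/12", "/about"], "demo-1")
print(batch)

# With boto3 installed and AWS credentials configured, the batch could be
# submitted like this (the distribution ID below is a placeholder):
#   boto3.client("cloudfront").create_invalidation(
#       DistributionId="<dist id>", InvalidationBatch=batch)
```

Collecting the updated paths and sending them as one batch keeps the number of invalidation requests low.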

Results
● The entire site is serviced via the CDN, with a low hit rate on the main server
● Fast response even under load
● Content editors can instantly see the changes they are editing because logins get redirected to the origin site.
● The CloudFront Refresh module makes it easy for updates to be “pushed” out to the CDN.

Some Troubleshooting/Testing Tips
● curl -I -L http://www.college.edu/<path>
This gets the headers for the path, with the expire time and other information. For example, you can check the expire time on pages coming from the origin server.
● Chrome / Firebug network load tab
This can be used to determine what is being loaded from which server (or not being loaded).
● Think ahead.
CDNs are similar to DNS servers, with propagation delays. If a site is going to have major changes, set the origin expire time to a low value a day or so before.
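
To read the expire time out of `curl -I` output without doing the date arithmetic by hand, a small parser helps. This is a sketch under assumptions: the sample response below is made up for illustration, and `parse_headers` and `seconds_until_expiry` are hypothetical helper names.

```python
from email.utils import parsedate_to_datetime
from typing import Dict, Optional

# Made-up sample of what `curl -I` might print for a cached page.
SAMPLE = """HTTP/1.1 200 OK
Date: Tue, 17 Mar 2015 12:00:00 GMT
Expires: Tue, 17 Mar 2015 12:01:00 GMT
X-Cache: Hit from cloudfront
"""


def parse_headers(raw: str) -> Dict[str, str]:
    """Parse header lines into a dict with lower-cased names."""
    headers: Dict[str, str] = {}
    for line in raw.splitlines():
        if line.startswith("HTTP/") or ":" not in line:
            continue  # skip status lines and blank lines
        name, value = line.split(":", 1)
        # With `curl -L` several responses are printed; the last one wins.
        headers[name.strip().lower()] = value.strip()
    return headers


def seconds_until_expiry(raw: str) -> Optional[float]:
    """Seconds between the Date and Expires headers, or None if missing."""
    headers = parse_headers(raw)
    if "date" not in headers or "expires" not in headers:
        return None
    served = parsedate_to_datetime(headers["date"])
    expires = parsedate_to_datetime(headers["expires"])
    return (expires - served).total_seconds()


print(seconds_until_expiry(SAMPLE))
```

Comparing this value for the CDN host and for the origin host is a quick way to confirm that a lowered expire time has actually reached the edge.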
