Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
ADOBE CUSTOMER BROWN BAG SERIES

TODAY’S TOPIC: Optimizing the CQ Dispatcher Cache
Andrew Khoury
Customer Support Engineer...
What‟s Covered

1.

Best practices for using the dispatcher

2.

Tips and tricks for improving performance

3.

Common Pit...
Topics Covered
1. The Basics


Dispatcher Caching
−
−

Why do we need it?

−

Which requests does dispatcher cache?

−

H...
What is the Dispatcher?


A web caching proxy and load balancing tool that works as a layer in front
of your CQ instances...
What is Dispatcher?

GET /en.html

GET /en.html

Site Visitor

Dispatcher
en.html

© 2012 Adobe Systems Incorporated. All ...
Why do I need a dispatcher?


Why?
−
−

Improve site performance.

−

Implement load balancing.

−



Reduce load on pub...
URL Decomposition


Resource Path – The path before the first „.‟ in the URL.



Selectors – Any items separated by „.‟ ...
The dispatcher will only cache files
that meet the following criteria…

© 2012 Adobe Systems Incorporated. All Rights Rese...
What does dispatcher cache?

#1
The URL must be allowed
by the /cache => /rules and /filter

sections of dispatcher.any

©...
What does dispatcher cache?

#2
The URL must have a file extension.
Example:
/content/foo.html ✔cached
/content/foo ✖ not ...
What does dispatcher cache?

#3
The URL must not contain any querystring
parameters (no “?” in the URL).
Example:
/content...
What does dispatcher cache?

#4
If the URL has a suffix path then the suffix path
must have a file extension.
Example:

/c...
What does dispatcher cache?

#5
The HTTP method must be GET or HEAD
Example:
GET /foo.html HTTP/1.1 ✔cached
HEAD /foo.html...
What does dispatcher cache?

#6
HTTP response status (from CQ)

must be 200 OK
Example:
HTTP/1.1 200 OK

✔cached

HTTP/1.1...
What does dispatcher cache?

#7
HTTP response (from CQ)

must not contain the response header
"Dispatcher: no-cache"
Examp...
What does dispatcher cache?

Suffix Exception #1
If the URL is a cacheable resource path

with an extension and a URL with...
What does dispatcher cache?

Suffix Exception #2
In the opposite case, when the suffix file

already exists in the cache a...
How does caching work?

Demonstration

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.

18
?
What should I cache?

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.

19
What should I cache?


Cache everything if possible.



What requests should I not cache?
−

You shouldn‟t cache files t...
Dispatcher Cache Flushing
Deleting and invalidating files in your dispatcher cache.

© 2012 Adobe Systems Incorporated. Al...
Flushing your Cache
What is a dispatcher cache flush?


It is a mechanism that allows you to selectively delete files fro...
Tip #1: Do cache flushing from publish
Dispatcher Flushing
GET /content/foo.html

GET /content/foo.html

Publish
Dispatche...
Dispatcher Flush
How does dispatcher cache flushing work?
1. User activates a page in CQ
2. CQ sends a flush request like ...
Cache Invalidation
What is cache invalidation?


Cache invalidation is the mechanism by which the dispatcher considers
ce...
What is Dispatcher?
GET /content/foo.html

GET /content/foo.html

YES
Site Visitor

Dispatcher

Publish

.stat
content
We ...
Cache Invalidation
dispatcher.any /cache => /invalidate section


Defines rules telling the dispatcher which files should...
Cache Invalidation
How do .stat files work?
1. If you set the /statfileslevel in dispatcher.any to 0 then only a single .s...
Stat Files Level
/statfileslevel


0 – single stat file at the root of the cache



1 – stat file at root of cache and u...
Stat Files Level Example


If we have statfileslevel set to 2 and our cache looks like this
/.stat
/content/.stat
/conten...
Summary of Cache Invalidation


The /statfileslevel allows you to isolate the invalidation to the branch
where the flush ...
Tips, Tricks and Best Practices
Getting the most out of your dispatcher.

© 2012 Adobe Systems Incorporated. All Rights Re...
Tip #1
Configure cache flushing on the
publish instance

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Con...
Tip #1: Do cache flushing from publish
Problem


When flushing is done from author the old version of file could get re-c...
Tip #1: Do cache flushing from publish
Replication in progress
GET /content/foo.html

GET /content/foo.html

Replicate
/co...
Stat Files Level

Demonstration:
How to configure flushing from publish

© 2012 Adobe Systems Incorporated. All Rights Res...
Tip #2
Use a re-fetching dispatcher flush
agent

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidentia...
Tip #2: Implement a Re-fetching Flush Agent
Problem


If a file is doesn‟t exist in the cache yet then simultaneous reque...
Problem with deleting files from the cache

GET /content/foo.html
GET /content/foo.html

Dispatcher

Publish

.stat
conten...
Same diagram with a re-fetching agent

GET /content/foo.html
GET /content/foo.html

Dispatcher
.stat
Site Visitors

conten...
Tip #2: Implement a Re-fetching Flush Agent
Dispatcher Flush with Re-Fetching
1. Touch nearest .stat file
2. Delete all fi...
Tip #2: Implement a Re-fetching Flush Agent

Demonstration:
How to install the re-fetching flush agent

© 2012 Adobe Syste...
Tip #3
Use /serveStaleOnError

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.

43
Tip #3: Set ServeStaleOnError
Problem


If all publish instances are unavailable, you will get "502 Bad response code". I...
Tip #3: Set ServeStaleOnError

Demonstration:
How to configure /serveStaleOnError

© 2012 Adobe Systems Incorporated. All ...
Tip #4
Cache custom error pages

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.

46
Tip #4: Implement Proper Error Handling
Problem


By default, the content served for custom error pages such as the 404 a...
Tip #4: Implement Proper Error Handling


Do the following to configure this:
1. In the Apache configuration, set Dispatc...
Tip #4: Implement Proper Error Handling

Demonstration:
How to implement error handling

© 2012 Adobe Systems Incorporated...
Tip #5
Block unwanted requests at the
dispatcher level

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Conf...
Tip #5: Filter Out Invalid Requests
Fine tune your dispatcher.any /filter section so that you block as much as
possible.
...
Tip #5: Filter Out Invalid Requests

Demonstration:
How to configure /filter section

© 2012 Adobe Systems Incorporated. A...
Tip #6
Block unwanted requests in publish

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.

53
Tip #6: Only Allow Valid Selectors, Suffixes, and Querystrings


Implement a javax.servlet.Filter to return 404 or 403 wh...
Tip #6: Only Allow Valid Selectors, Suffixes, and Querystrings

Demonstration:
How to implement a blocking filter in CQ

©...
Tip #7
Configure dispatcher to ignore
invalid querystrings

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe ...
Tip #6: Ignore Unused Querystring Params


If the URL has a querystring then page will not be cached.



However, you ca...
How to Design your Site with Caching in Mind
Building an effective cache strategy.

© 2012 Adobe Systems Incorporated. All...
Developing a Cache Strategy


Caching is something that should be considered during the design and
development phases of ...
Using Ajax and SSI


If you call the component's content path directly then you can request it
with a .html extension.

...
Server Side Includes
How to use SSI in Apache


http://httpd.apache.org/docs/2.2/howto/ssi.html



Set this configuratio...
SSI
How to use Ajax to Load Specific Components


Many different ajax libraries will work for this. However, I will demon...
Combining Ajax and SSI
Combining Ajax and SSI


Let's say you are using SSI for serving content through the webserver, bu...
Worked Example 1

Demonstration

© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.

64
Q& A + References


Dispatcher Documentation
−
−



http://dev.day.com/docs/en/cq/current/deploying/dispatcher.html
/cac...
© 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
Upcoming SlideShare
Loading in …5
×

AEM (CQ) Dispatcher Caching Webinar 2013

22,139 views

Published on

Sample code: https://github.com/cqsupport/webinar-dispatchercache
Webinar Recording: http://my.adobeconnect.com/p7th2gf8k43/

Optimizing dispatcher cache covering:

Best practices for using the dispatcher
Tips and tricks for improving performance
Common pitfalls to avoid
How to design your site so you get the most out of your Dispatcher

Published in: Technology

AEM (CQ) Dispatcher Caching Webinar 2013

  1. 1. ADOBE CUSTOMER BROWN BAG SERIES TODAY’S TOPIC: Optimizing the CQ Dispatcher Cache Andrew Khoury Customer Support Engineer, Adobe  For the best listening experience, we recommend using a headset © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
  2. 2. What‟s Covered 1. Best practices for using the dispatcher 2. Tips and tricks for improving performance 3. Common Pitfalls to Avoid 4. How to design your site so you get the most out of your Dispatcher © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
  3. 3. Topics Covered 1. The Basics  Dispatcher Caching − − Why do we need it? − Which requests does dispatcher cache? − How does dispatcher cache files? −  What is the dispatcher? What files should I cache? Dispatcher Cache Flushing − How does dispatcher cache flushing work? − How do I configure cache flushing from publish? 2. Tips and Tricks 3. How to design your application with caching in mind © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 3
  4. 4. What is the Dispatcher?  A web caching proxy and load balancing tool that works as a layer in front of your CQ instances.  It can either be used as a load balancer or you can put it behind your hardware load balancer.  Works as a module installed to a web server such as Apache.  Works with these web servers: − Apache HTTP server 2.x − Microsoft IIS 6 and IIS 7.x − Oracle iPlanet Web Server © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 4
  5. 5. What is Dispatcher? GET /en.html GET /en.html Site Visitor Dispatcher en.html © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 5 Publish
  6. 6. Why do I need a dispatcher?  Why? − − Improve site performance. − Implement load balancing. −  Reduce load on publish instances. Filter out unwanted traffic. Can’t I just use a CDN instead? −  Yes, however… You should also use a dispatcher − Out of the box, the Dispatcher gives you more fine grained control over how you delete (“flush”) files from your cache. − Since the dispatcher is installed to a web server it adds the ability to rewrite URLs, and use SSI (Server Side Includes) before the requests hit the publish instance. − Dispatcher gives you added security because it allows you to block certain client HTTP headers, URL patterns, and to serve error pages from a cache. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 6
  7. 7. URL Decomposition  Resource Path – The path before the first „.‟ in the URL.  Selectors – Any items separated by „.‟ characters between the resource path and the file extension.  Extension – Comes after the resource path and selectors.  Suffix Path – A path that comes after the file extension. http://localhost:4502/content/foo/en.ab.cd.html/news/2012?a=1&b=2 protocol host port resource path © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. selectors extension 7 suffix path querystring
  8. 8. The dispatcher will only cache files that meet the following criteria… © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 8
  9. 9. What does dispatcher cache? #1 The URL must be allowed by the /cache => /rules and /filter sections of dispatcher.any © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 9
  10. 10. What does dispatcher cache? #2 The URL must have a file extension. Example: /content/foo.html ✔cached /content/foo ✖ not cached © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 10
  11. 11. What does dispatcher cache? #3 The URL must not contain any querystring parameters (no “?” in the URL). Example: /content/foo.html?queryparam=1 /content/foo.html © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 11 ✖ not cached ✔cached
  12. 12. What does dispatcher cache? #4 If the URL has a suffix path then the suffix path must have a file extension. Example: /content/foo.html/suffix/path.html /content/foo.html/suffix/path © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 12 ✔cached ✖ not cached
  13. 13. What does dispatcher cache? #5 The HTTP method must be GET or HEAD Example: GET /foo.html HTTP/1.1 ✔cached HEAD /foo.html HTTP/1.1 ✔cached POST /foo.html HTTP/1.1 ✖ not cached © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 13
  14. 14. What does dispatcher cache? #6 HTTP response status (from CQ) must be 200 OK Example: HTTP/1.1 200 OK ✔cached HTTP/1.1 500 Internal Server Error ✖ not cached HTTP/1.1 404 Not Found ✖ not cached © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 14
  15. 15. What does dispatcher cache? #7 HTTP response (from CQ) must not contain the response header "Dispatcher: no-cache" Example: HTTP/1.1 200 OK Content-Type: text/plain Content-Length: 42 ✔cached HTTP/1.1 200 OK Content-Type: text/plain Content-Length: 42 Dispatcher: no-cache ✖ not cached © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 15
  16. 16. What does dispatcher cache? Suffix Exception #1 If the URL is a cacheable resource path with an extension and a URL with a suffix path is requested then it will not be cached Example: /content/foo.html is already cached /content/foo.html/suffix/path.html © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 16 ✖ not cached
  17. 17. What does dispatcher cache? Suffix Exception #2 In the opposite case, when the suffix file already exists in the cache and the resource file is requested then the resource file is cached instead Example: /content/foo.html/suffix/path.html is already cached deleted /content/foo.html ✔cached © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 17
  18. 18. How does caching work? Demonstration © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 18
  19. 19. ? What should I cache? © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 19
  20. 20. What should I cache?  Cache everything if possible.  What requests should I not cache? − You shouldn‟t cache files that are expected to change on every request. − Personalized content. (User‟s name, location, preferences, etc.) − Do not cache files that will only be requested once. − Do not cache URLs that use selectors or suffixes where an infinite number of values are allowed. Especially when users would be unlikely to request the same value more than once. For example: /content/foo.selector.html   /content/foo.selector2.html  − /content/foo.selector1.html Etc… Do not allow caching on unwanted requests. Block caching of unwanted requests by returning 403 or 404 for requests that use an invalid selector or suffix. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 20
  21. 21. Dispatcher Cache Flushing Deleting and invalidating files in your dispatcher cache. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
  22. 22. Flushing your Cache What is a dispatcher cache flush?  It is a mechanism that allows you to selectively delete files from your dispatcher cache via an HTTP request.  Flushing allows you to bring your cache files up-to-date.  Here is a sample flush request for path /content/foo: − − CQ-Action: Activate − CQ-Handle: /content/foo − CQ-Path: /content/foo − Content-Length: 0 −  GET /dispatcher/invalidate.cache HTTP/1.1 Content-Type: application/octet-stream If you configure a flush agent in CQ then CQ will automatically flush that file from the cache on activation. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 22
  23. 23. Tip #1: Do cache flushing from publish Dispatcher Flushing GET /content/foo.html GET /content/foo.html Publish Dispatcher Activate Page /content/foo Replicate /content/foo Author Author User Flush /content/foo .stat content 1. Touch .stat file foo.html foo.selector.html foo _jcr_content © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 23 2. Delete /content/foo.* 3. Delete /content/foo/_jcr_content
  24. 24. Dispatcher Flush How does dispatcher cache flushing work? 1. User activates a page in CQ 2. CQ sends a flush request like this to the dispatcher: − GET /dispatcher/invalidate.cache HTTP/1.1 − CQ-Action: Flush − CQ-Handle: /content/foo − CQ-Path: /content/foo − Content-Length: 0 − Content-Type: application/octet-stream 3. This request tells dispatcher to do the following: i. Touch the .stat file to indicate the last time a cache flush occurred. (only touches the nearest ancestor .stat file) ii. Delete all files under {CQ-Path}.* . For example if the flushed path was /content/foo then /content/foo.* would be deleted. iii. Delete directory {CQ-Path}/_jcr_content iv. Dispatcher responds to CQ‟s flush request with 200 OK status © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 24
  25. 25. Cache Invalidation What is cache invalidation?  Cache invalidation is the mechanism by which the dispatcher considers certain cache files to be “outdated” or “stale”. 1. Dispatcher checks if the file is allowed by the rules in the /cache => invalidate section of dispatcher.any 2. The file‟s last modified time is compared to that of its nearest .stat file. 3. If the file is older then than .stat it is considered “outdated” and will be rerequested from publish. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 25
  26. 26. What is Dispatcher? GET /content/foo.html GET /content/foo.html YES Site Visitor Dispatcher Publish .stat content We assume that *.html is allowed in the /invalidate rules in dispatcher.any © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. foo.html 26 is foo.html older than .stat?
  27. 27. Cache Invalidation dispatcher.any /cache => /invalidate section  Defines rules telling the dispatcher which files should be auto-invalidated on flush requests.  For example, this is the default configuration, it is configured to allow auto-invalidation on all .html files: /farms { # first farm entry (label is not important, just for your convenience) /website { …. /cache { … /invalidate { /0000 { /glob "*" /type "deny" } /0001 { /glob "*.html" /type "allow" } } } … } } © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 27
  28. 28. Cache Invalidation How do .stat files work? 1. If you set the /statfileslevel in dispatcher.any to 0 then only a single .stat file will be created in the root of the cache. 2. A flush request will cause the .stat file to be touched 3. All files allowed by the /invalidate rules will be re-cached next time they are requested. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 28
  29. 29. Stat Files Level /statfileslevel  0 – single stat file at the root of the cache  1 – stat file at root of cache and under each direct child directory. For example: /.stat /content/.stat 2 – stat file at root of cache and under each direct child directory and each of their direct child directories: /.stat /content/.stat /content/foo/.stat /content/bar/.stat  And so on… © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 29
  30. 30. Stat Files Level Example  If we have statfileslevel set to 2 and our cache looks like this /.stat /content/.stat /content/foo/.stat /content/bar/.stat  Then if the path /content/foo/en is flushed then the following files are touched: /.stat /content/.stat /content/foo/.stat  Only files directly under /, /content, and /content/foo and its subdirectories would be affected by this. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 30
  31. 31. Summary of Cache Invalidation  The /statfileslevel allows you to isolate the invalidation to the branch where the flush occurs.  When the dispatcher decides if a file is “outdated” it will only look at the nearest ancestor .stat file in the file system tree.  You can leverage this feature if you know that your cache in one section will be unaffected by changes in another section. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 31
  32. 32. Tips, Tricks and Best Practices Getting the most out of your dispatcher. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
  33. 33. Tip #1 Configure cache flushing on the publish instance © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 33
  34. 34. Tip #1: Do cache flushing from publish Problem  When flushing is done from author the old version of file could get re-cached due to a race condition. Solution  Configure dispatcher flushing on the publish instances instead.  See here for how to configure this: − http://dev.day.com/docs/en/cq/current/deploying/dispatcher/page_invalidate.html#Invali dating%20Dispatcher%20Cache%20from%20a%20Publishing%20Instance − http://helpx.adobe.com/cq/kb/HowToFlushAssetsPublish.html © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 34
  35. 35. Tip #1: Do cache flushing from publish Replication in progress GET /content/foo.html GET /content/foo.html Replicate /content/foo Activate Page /content/foo OLD VERSION OF /content/foo.html IS Publish CACHED AGAIN!! Author Dispatcher Author User Flush /content/foo .stat content 1. Touch .stat file foo.html foo.selector.html foo _jcr_content © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 35 2. Delete /content/foo.* 3. Delete /content/foo/_jcr_content
  36. 36. Stat Files Level Demonstration: How to configure flushing from publish © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 36
  37. 37. Tip #2 Use a re-fetching dispatcher flush agent © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 37
  38. 38. Tip #2: Implement a Re-fetching Flush Agent Problem  If a file is doesn‟t exist in the cache yet then simultaneous requests for the file will get requested from the publish instances until it is cached. Solution  A feature called re-fetching exists in the dispatcher since version 4.0.9 which helps avoid this issue.  To install the re-fetching agent, do the following: − Install the sample flushing agent package found [here] to the CRX package manager − Go to flushing agent config page, for example http://localhost:4502/etc/replication/agents.publish/flush.html − Edit the configuration like this: General => Serialization Type = Custom Dispatcher Flush Extended => HTTP Method = POST © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 38
  39. 39. Problem with deleting files from the cache GET /content/foo.html GET /content/foo.html Dispatcher Publish .stat content Site Visitors foo.html foo.selector.html © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 39 Flush /content/foo
  40. 40. Same diagram with a re-fetching agent GET /content/foo.html GET /content/foo.html Dispatcher .stat Site Visitors content foo.html foo.selector.html © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 40 Publish
  41. 41. Tip #2: Implement a Re-fetching Flush Agent Dispatcher Flush with Re-Fetching 1. Touch nearest .stat file 2. Delete all files under {flush-path}.* excluding those in the re-fetch list. For example if the flushed path was /content/foo then /content/foo.* would be deleted. 3. Delete directory {flush-path}/jcr:content 4. Refetch the files in the refetch list © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 41
  42. 42. Tip #2: Implement a Re-fetching Flush Agent Demonstration: How to install the re-fetching flush agent © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 42
  43. 43. Tip #3 Use /serveStaleOnError © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 43
  44. 44. Tip #3: Set ServeStaleOnError Problem  If all publish instances are unavailable, you will get "502 Bad response code". It might possible that all your pages are already cached before publishers went down. In this case it makes sense to instead serve cached content. Solution  Set /serveStaleOnError “1” in your dispatcher.any © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 44
  45. 45. Tip #3: Set ServeStaleOnError Demonstration: How to configure /serveStaleOnError © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 45
  46. 46. Tip #4 Cache custom error pages © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 46
  47. 47. Tip #4: Implement Proper Error Handling Problem  By default, the content served for custom error pages such as the 404 and 500 are not cached. Solution  Configure DispatcherPassError and Apache‟s ErrorDocument directive to handle serving error documents from the web server level. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 47
  48. 48. Tip #4: Implement Proper Error Handling  Do the following to configure this: 1. In the Apache configuration, set DispatcherPassError 0 2. Configure error pages for each error code: For example, here is a snippet from an Apache httpd.conf file: <LocationMatch .html$> ErrorDocument 404 /errorpages/404.html ErrorDocument 500 /errorpages/500.html </LocationMatch> ErrorDocument * /errorpages/blank.html 3. In CQ, overlay the error handler script so that it doesn‟t do an authentication check for a non-existent page. Modify /libs/sling/servlet/errorhandler/404.jsp and /libs/sling/servlet/errorhandler/Throwable.jsp  See here for related Apache documentation http://httpd.apache.org/docs/2.2/mod/core.html#errordocument © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 48
  49. 49. Tip #4: Implement Proper Error Handling Demonstration: How to implement error handling © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 49
  50. 50. Tip #5 Block unwanted requests at the dispatcher level © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 50
  51. 51. Tip #5: Filter Out Invalid Requests Fine tune your dispatcher.any /filter section so that you block as much as possible.  Use a “whitelist” configuration  Deny everything and only allow what you need © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 51
  52. 52. Tip #5: Filter Out Invalid Requests Demonstration: How to configure /filter section © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 52
  53. 53. Tip #6 Block unwanted requests in publish © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 53
  54. 54. Tip #6: Only Allow Valid Selectors, Suffixes, and Querystrings  Implement a javax.servlet.Filter to return 404 or 403 when an invalid selector, querystring params or suffix path is used against your publish instance. − This will prevent caching of the unwanted requests. − Once again use a “whitelist” approach, block everything and only allow only the selectors, querystring params, and suffix paths you want. − A sample implementation is provided [here] © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 54
  55. 55. Tip #6: Only Allow Valid Selectors, Suffixes, and Querystrings Demonstration: How to implement a blocking filter in CQ © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 55
  56. 56. Tip #7 Configure dispatcher to ignore invalid querystrings © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 56
  57. 57. Tip #6: Ignore Unused Querystring Params  If the URL has a querystring then page will not be cached.  However, you can define exclusions using the /ignoreUrlParams  Allow all invalid querystrings to get ignored in /ignoreUrlParams rules in the dispatcher.any  This will cause invalid querystrings to be removed from the url when being considered for caching.  Here is an example configuration that ignores all querystrings except for the param “q”: /ignoreUrlParams { /0001 { /glob "*" /type "allow" } /0002 { /glob "q" /type "deny" } }  For example, with this configuration, when /content/foo.html?test=1 is requested the page would get cached and the publish instance would only see the request /content/foo.html However, for request /content/foo.html?q=1 the request would not get cached and the publish instance would see /content/foo.html?q=1 © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 57
  58. 58. How to Design your Site with Caching in Mind Building an effective cache strategy. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
  59. 59. Developing a Cache Strategy  Caching is something that should be considered during the design and development phases of your project.  In order to leverage the dispatcher cache to its full extent, your application must be designed to be “cacheable”. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 59
  60. 60. Using Ajax and SSI  If you call the component's content path directly then you can request it with a .html extension.  For example: http://localhost:4502/content/foo/_jcr_content/par/textimage1.html  We can use this for flexible caching: − Using SSI (Server Side Includes) or Ajax  This allows us to load the non-changing parts of the page from the cache and load dynamic sections separately. This will help improve page load times.  This is especially useful in sites that use a lot of personalization. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 60
  61. 61. Server Side Includes How to use SSI in Apache  http://httpd.apache.org/docs/2.2/howto/ssi.html  Set this configuration in your Apache httpd configuration within the VirtualHost for your CQ site: Options +IncludesNOEXEC AddOutputFilter INCLUDES .html Note: We use IncludesNOEXEC and not just Includes for security reasons. Using the option +Includes may open up your web server to remote execution vulnerabilities.  In your application code, only allow ssi selector on components where you allow ssi calls. For example, on components where you would like to request via SSI, you could rename html.jsp to ssi.html.jsp and create an html.jsp file like this: <!--#include virtual="<%= resource.getPath() %>.ssi.html" --> © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 61
  62. 62. SSI How to use Ajax to Load Specific Components  Many different ajax libraries will work for this. However, I will demonstrate using jquery: − Rename your component's html.jsp file to html.ssi.jsp − Create a new html.jsp script so that it only does the ajax call. For example: <div id="<%= divId %>”></div> <script> $(document).ready(function(){ $("#<%= divId %>").load("<%= resource.getPath() %>.ssi.html"); }); </script> © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 62
  63. 63. Combining Ajax and SSI Combining Ajax and SSI  Let's say you are using SSI for serving content through the webserver, but you still want to be able to view the page if you access CQ directly. We can handle this case by looking at the X-Forwarded-For header to verify whether or not the request is coming from the dispatcher and handle it appropriately. Following our SSI and Ajax examples above, here is what your component's html.jsp would look like: <% if(request.getHeader("X-Forwarded-For") != null) { %> <!--#include virtual="<%= resource.getPath() %>.ssi.html" --> <% } else { <div id="<%= divId %>”></div> <script> $(document).ready(function(){ $("#<%= divId %>").load("<%= resource.getPath() %>.ssi.html"); }); </script> <% } %> © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 63
  64. 64. Worked Example 1 Demonstration © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 64
  65. 65. Q& A + References  Dispatcher Documentation − −  http://dev.day.com/docs/en/cq/current/deploying/dispatcher.html /cache section of dispatcher.any http://dev.day.com/docs/en/cq/current/deploying/dispatcher/disp_config.html#par_146_44_0010 Apache HTTP Server − −  Error Document Directive - http://httpd.apache.org/docs/2.2/mod/core.html#errordocument SSI - http://httpd.apache.org/docs/2.2/howto/ssi.html Microsoft IIS − How to implement SSI in IIS  http://msdn.microsoft.com/en-us/library/ms525185%28v=vs.90%29.aspx  http://tech.mikeal.com/blog1.php/server-side-includes-for-html-in-iis7 © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential. 65
  66. 66. © 2012 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.

×