The document discusses using Varnish as a caching solution to improve website performance for sites with high traffic or transactions. It describes how Varnish can cache content, assets, and API responses to improve response times and handle more requests. It also discusses techniques like edge caching with CDNs, asynchronous processing of transactions, and breaking pages into cacheable and uncacheable parts.
OSDC 2012 | Ultra-performant dynamic websites with Varnish by Dr. Christian Winkler
1. München Berlin Hamburg Köln Nürnberg Grenoble Prag
Ultra-performant dynamic websites with Varnish
Many customers, lots of transactions AND nevertheless fast?
Talk at OSDC, Nürnberg
Dr. Christian Winkler, mgm technology partners GmbH
25.04.2012
2. About…
Christian Winkler
● Working as enterprise architect
● Expertise in high performance, scalability
● Working for…
mgm technology partners
● Software development projects
● Large projects, lots of data, scalability
● Customers like Elster Online, LIDL, Hewlett Packard, GfK
4. Performance problems of online shops
Problem of many online shops
Growing number of accesses
Growing number of visitors
Faster and faster Internet access
Less and less patience
Solution: buy more servers
Not efficient
Expensive to buy and maintain
Must always be sized for peak load
Idle most of the time
5. Most pages are identical for all
Observation
Most pages are identical for all visitors
Especially true for the homepage
Sometimes a little personalization might be included
Idea
Different content for different visitors
Differentiate between stateless and stateful visitors
Visitors with state get a dynamic page
Stateless visitors get a cached static copy
6. Choosing a suitable cache
Requirements
Optimized hit rate
High performance and stability
Flexible configuration
Powerful cache management
Decision: Use Varnish
Super fast
Atomic cache management
Open Source Software for page caching
Extremely flexible, procedural configuration
[Diagram: Varnish in front of Apache httpd (JPG, GIF, JS, CSS) over HTTP; Apache httpd connected to Apache Tomcat (JSP) via AJP]
7. Tasks for Varnish
HTTP access for all users
o Exposed to the Internet (!)
o Directly behind Load Balancer
Distribution of HTTP accesses to backend systems
o Analyze requests
o Determine the responsible system and route the request correctly
o Maybe cache response
Maintain and expire cache
o Maximum size of cache
o Automatic and manual expiry, expiring data when cache is full
Write log files (!)
NOT part of Varnish's tasks
o Static content (web server's domain)
o SSL
8. Anatomy of a Varnish access (not cacheable)
[Diagram: an uncacheable HTTP request travels from vcl_recv via "pass" to the Apache httpd backend, then through vcl_fetch and vcl_deliver back to the client]
9. Anatomy of a Varnish access (cacheable)
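[Diagram: a cacheable HTTP request travels from vcl_recv to the cache lookup, misses (vcl_miss), is fetched from the Apache httpd backend (vcl_fetch, which may mark the object hit_for_pass), and is returned via vcl_deliver]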
10. Anatomy of a Varnish access (in cache)
[Diagram: HTTP request → vcl_recv → cache → vcl_deliver]
11. Anatomy of a Varnish access (overview)
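[Diagram: overview combining the three paths. After vcl_recv a request is either passed, answered from the cache, or sent through vcl_miss; vcl_fetch talks to the Apache httpd backend (pass or hit_for_pass); every path ends in vcl_deliver]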
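The three paths from the preceding slides can be traced with a small simulation. This is only a sketch in Python, not VCL: the function name `handle_request`, the rule "only GET requests without a session cookie are cacheable", and the stage labels in the returned list are assumptions made for illustration, and hit_for_pass is left out.

```python
def handle_request(method, url, has_session_cookie, cache, backend):
    """Return (body, path); path lists the stages visited, named after the slides."""
    path = ["vcl_recv"]

    # Assumption: anything that is not a stateless GET bypasses the cache.
    if method != "GET" or has_session_cookie:
        path.append("pass")
        body = backend(url)
        path += ["vcl_fetch", "vcl_deliver"]
        return body, path

    path.append("cache")
    if url in cache:
        # Object is in the cache: deliver without touching the backend.
        path.append("vcl_deliver")
        return cache[url], path

    # Cacheable but not yet stored: fetch once, store, deliver.
    path.append("vcl_miss")
    body = backend(url)
    path.append("vcl_fetch")
    cache[url] = body
    path.append("vcl_deliver")
    return body, path
```

Calling the function twice for the same URL shows the backend being contacted only on the first call, which is where the performance gain comes from.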
12. Performance gain by caching
Example: Online shop
Naive solution: many database accesses
100 requests/s
Improved solution: database cache
500 requests/s
Using Varnish as HTML cache
10,000 requests/s (Gbit limit)
13. Performance gain by caching
Why is Varnish so fast?
Uses virtual memory: objects are delivered from RAM
Compiled configuration: no bottlenecks for requests
High hit rate is essential
Think very carefully before implementing
Monitoring extremely important
14. Take client cache into account (Expires)
Improving the hit rate
Differentiate between different types of cookies
Relevant: use as cache index or skip the cache
Irrelevant: ignore or delete in vcl_recv
Partly relevant for caching: delete the irrelevant parts
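The cookie handling above can be illustrated with a short Python sketch (in Varnish itself this logic would live in vcl_recv). The cookie names `JSESSIONID` and `__utm*` are hypothetical examples, not taken from the talk, and the sketch takes the "skip cache" option for relevant cookies.

```python
# Hypothetical cookie names, chosen for illustration only.
RELEVANT = {"JSESSIONID"}          # carries server-side state: skip the cache
IRRELEVANT_PREFIXES = ("__utm",)   # e.g. analytics cookies: delete before lookup


def parse_cookies(header):
    """Parse a Cookie header into a dict, ignoring malformed parts."""
    cookies = {}
    for part in header.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep:
            cookies[name] = value
    return cookies


def classify(header):
    """Return (use_cache, cleaned_cookie_header)."""
    cookies = parse_cookies(header)
    # Delete the irrelevant parts so they cannot lower the hit rate.
    kept = {n: v for n, v in cookies.items()
            if not n.startswith(IRRELEVANT_PREFIXES)}
    # Any remaining relevant cookie marks the visitor as stateful.
    use_cache = not any(n in RELEVANT for n in kept)
    cleaned = "; ".join(f"{n}={v}" for n, v in kept.items())
    return use_cache, cleaned
```

A request carrying only analytics cookies ends up with an empty cookie header and can share one cached copy with all other stateless visitors.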
Normalize the headers
Analyze which headers have an impact on content
Avoid server side browser checks and content negotiation
Don't use "Vary" in the response
Reduce to strictly necessary headers and content
Use critical headers as index
Different compressions
Internet Explorer: Accept-Encoding: deflate,gzip
Others: Accept-Encoding: gzip,deflate
Can only achieve half the hit rate without normalization
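The Accept-Encoding case can be made concrete with a small Python sketch. The function names and the shape of the cache key are assumptions for illustration; q-values are ignored for brevity, so this is a simplification rather than a complete implementation of HTTP content negotiation.

```python
def normalize_accept_encoding(value):
    """Collapse any Accept-Encoding header to 'gzip', 'deflate', or ''."""
    encodings = {token.split(";")[0].strip().lower()
                 for token in value.split(",")}
    if "gzip" in encodings:
        return "gzip"
    if "deflate" in encodings:
        return "deflate"
    return ""


def cache_key(url, headers):
    """Use only the normalized header as part of the cache index."""
    encoding = normalize_accept_encoding(headers.get("Accept-Encoding", ""))
    return (url, encoding)
```

With this normalization, `deflate,gzip` and `gzip,deflate` map to the same key, so both browser families share one cached object instead of filling the cache with two copies.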
15. Problems and solutions
Many websites are not stateless
o Cookie creation only when necessary
o State transition (sessions are needed suddenly)
Make as many requests stateless as possible
o Increase hit rate of cache
o Keep simple state on client side (e.g. personalization)
What can be cached?
o Only stateless content pages (GET)
o Be careful with transactions (workflows etc.)
How to handle content changes?
o Delete cache completely
o Selectively delete certain pages
o Use Cache-Control to tune how long objects can stay in the cache
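The three ways of handling content changes can be sketched as a tiny in-memory cache in Python. The class `PageCache` and its method names are hypothetical; `max_age` stands in for the lifetime a backend would announce via Cache-Control, and the injectable clock exists only to make the behavior easy to test.

```python
import time


class PageCache:
    """Minimal page cache with per-object lifetime and manual purging."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # url -> (expires_at, body)

    def put(self, url, body, max_age):
        """Store a page for max_age seconds."""
        self._entries[url] = (self._clock() + max_age, body)

    def get(self, url):
        """Return the cached body, or None if missing or expired."""
        entry = self._entries.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            del self._entries[url]  # automatic expiry
            return None
        return body

    def purge(self, url):
        """Selectively delete one page."""
        self._entries.pop(url, None)

    def purge_all(self):
        """Delete the cache completely."""
        self._entries.clear()
```

Short lifetimes keep content fresh without any purge logic, while selective purging allows long lifetimes for pages that rarely change.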
16. Further optimization
[Diagram: page layout split into fragments with individual TTLs]
Welcome username to our website
Your shopping cart
Teaser (TTL: 2 min)
Main navigation (TTL: 60 min)
Side navigation (TTL: 60 min)
Content (TTL: 5 min)
Template (TTL: 1 d)
Content Distribution Network (CDN)
Distributed cache
Specialized service providers
Cure for hardware limits (Gb)
Edge Side Includes (ESI)
Differentiate between static and dynamic elements of a page
Cache all static elements
Page assembly in Varnish via ESI
Memcached or Redis
Cache dynamic elements also (in RAM)
Only POST requests make it to the application server
Page assembly in Varnish
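Page assembly from fragments with different lifetimes can be sketched in Python as follows. The `<esi:include src="..."/>` tag is part of the ESI language, but everything else is an assumption for illustration: the fragment URLs are hypothetical, the TTL values are the ones shown in the page layout on the slide, and the shopping cart is treated as uncacheable.

```python
import re

INCLUDE = re.compile(r'<esi:include src="([^"]+)"\s*/>')

# TTLs in seconds (hypothetical URLs); None means "never cache".
TTL = {
    "/template": 86400,  # 1 d
    "/nav": 3600,        # 60 min
    "/teaser": 120,      # 2 min
    "/content": 300,     # 5 min
    "/cart": None,       # personalized
}


def fetch_cached(url, cache, backend, now):
    """Serve a fragment from the cache while fresh, else ask the backend."""
    ttl = TTL.get(url)
    hit = cache.get(url)
    if ttl is not None and hit is not None and hit[0] > now:
        return hit[1]
    body = backend(url)
    if ttl is not None:
        cache[url] = (now + ttl, body)
    return body


def assemble(url, cache, backend, now):
    """Fetch a page and replace every ESI include with its fragment."""
    body = fetch_cached(url, cache, backend, now)
    return INCLUDE.sub(
        lambda m: assemble(m.group(1), cache, backend, now), body)
```

On a warm cache only the personalized fragment reaches the backend; the template and navigation are served from memory until their individual TTLs run out.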
19. Situation with many parallel orders
ERP system becomes bottleneck
● Many transactions
● All for the same product
Locking kills you
Application server overloaded
● Threads used by ERP
● All threads waiting
● No pages are shown
● Whole site not working
(Dynamic) website down
● Frequently during special campaigns
● Hurts because the marketing budget is wasted
[Diagram: several "Order" requests and "Any page" all waiting on the ERP]
20. Solution: asynchronous (loose) coupling between shop and ERP
Basic idea
● Implement an asynchronous interface to the ERP (switch)
● Use ActiveMQ
● Transfer orders serially to the ERP in the background
Restrictions
● Stock numbers in shop
● Slight risk of over-orders
● Asynchronous error handling is difficult and error-prone
● Not for permanent service
Result
● Several hundred thousand orders per hour
[Diagram: "Order" requests are decoupled from "Any page" and passed on to ERP1, ERP2, ERP3]
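The loose coupling can be sketched in Python with an in-process queue and a single background worker. This is a stand-in, not the ActiveMQ setup from the talk: `queue.Queue` replaces the message broker, and the names `start_order_transfer`, `place_order`, and `erp_submit` are hypothetical.

```python
import queue
import threading


def start_order_transfer(erp_submit):
    """Start one background worker that forwards orders to the ERP serially."""
    orders = queue.Queue()
    failures = []  # (order, exception) pairs that need later handling

    def worker():
        while True:
            order = orders.get()
            if order is None:          # sentinel: stop the worker
                orders.task_done()
                break
            try:
                erp_submit(order)      # slow call runs off the request thread
            except Exception as exc:
                # The shop has already answered the customer, so the
                # error can only be recorded and dealt with afterwards.
                failures.append((order, exc))
            finally:
                orders.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return orders, failures, thread


def place_order(orders, order):
    """Called by the shop: enqueue the order and return immediately."""
    orders.put(order)
    return "accepted"
```

Because the shop thread returns right after enqueueing, a slow or locked ERP no longer blocks page delivery. The `failures` list hints at why the slide calls asynchronous error handling difficult: the customer has already received a confirmation when the error occurs.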
24. References
Web site of LIDL
● http://www.lidl.de
More detailed information
● http://www.mgm-tp.com
● http://blog.mgm-tp.com/2012/01/varnish-web-cache/
Tomcat Application Server
● http://tomcat.apache.org/
Varnish Web Cache
● https://www.varnish-cache.org/
Edge Side Includes (ESI)
● http://www.w3.org/TR/esi-lang
Memcached and Redis
● http://memcached.org/, http://redis.io/
ActiveMQ
● http://activemq.apache.org/