When dynamic becomes static – the next step in web caching techniques
Wim Godden
Cu.be Solutions
Disclaimer
The next step
As in : what you will be doing in the future
Not as in : go home and run it ;-)
Language of choice : PHP
But : think Perl, Python, Ruby, Java, .Net, …
Who am I ?
Wim Godden (@wimgtr)
Where I'm from
My town
Belgium – the traffic
Who am I ?
Wim Godden (@wimgtr)
Founder of Cu.be Solutions (http://cu.be)
Open Source developer since 1997
Developer of OpenX, PHPCompatibility, ...
Speaker at PHP and Open Source conferences
Who are you ?
Developers ?
System/network engineers ?
Managers ?
To understand the present
Understand the past
The Stone Age
New blog post by : caveman003
Pre-PHP : draw it and make html
The Egyptian Era
Old-school PHP : 'rebuild-every-time'
The Industrial Revolution
PHP : let's cache
Extra ! Extra !
PHP : dynamic content in static content
The Modern Era
PHP : multiple webservers
PHP : push updates to cache
Today
Adding reverse proxy caching
Website with ESI
Article content page
Page content
Header
Latest news
Navigation
Website with ESI
Article content page
Page content
Top header
(TTL = 2h)
Latest news
Navigation
(TTL = 1h)
Website with ESI
Article content page
Page content (TTL = 30m)
Top header
(TTL = 2h)
Latest news (TTL = 2m)
Navigation
(TTL = 1h)
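The per-fragment TTLs above can be sketched as a small cache in which every fragment expires on its own schedule. This is only an illustration in Python, not how Varnish stores objects: the `FragmentCache` class and the `FRAGMENT_TTLS` table are hypothetical names, and the clock is injectable so the expiry logic can be exercised without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class FragmentCache:
    """Tiny in-memory cache where every fragment carries its own TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # url -> (expiry timestamp, cached body)
        self._store: Dict[str, Tuple[float, str]] = {}

    def set(self, url: str, body: str, ttl_seconds: float) -> None:
        self._store[url] = (self._clock() + ttl_seconds, body)

    def get(self, url: str) -> Optional[str]:
        entry = self._store.get(url)
        if entry is None:
            return None
        expires_at, body = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller refetches from the backend.
            del self._store[url]
            return None
        return body


# TTLs from the slide, in seconds (the URLs follow the later ESI slides).
FRAGMENT_TTLS = {
    "/top": 2 * 60 * 60,         # top header: 2h
    "/nav": 60 * 60,             # navigation: 1h
    "/article/id/732": 30 * 60,  # page content: 30m
    "/latest-news": 2 * 60,      # latest news: 2m
}
```

With these values, the latest-news block is refetched every two minutes while the header stays cached for two hours, so one page is assembled from parts of different ages.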
ESI – how it works
GET /page
ESI – how it works
<html>
...
<esi:include src="/top"/>
<esi:include src="/nav"/>
<div id="something">
<esi:include src="/latest-news"/>
</div>
<esi:include src="/article/id/732"/>
...
</html>
GET /page
ESI – how it works
<div id="top-part">
<a href="/login">Login</a>
</div>
GET /top
ESI – how it works
<html>
...
<esi:include src="/top"/>
<esi:include src="/nav"/>
<div id="something">
<esi:include src="/latest-news"/>
</div>
<esi:include src="/article/id/732"/>
...
</html>
GET /page
ESI – how it works
<html>
...
<div id="top-part">
<a href="/login">Login</a>
</div>
<esi:include src="/nav"/>
<div id="something">
<esi:include src="/latest-news"/>
</div>
<esi:include src="/article/id/732"/>
...
</html>
GET /page
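The substitution step shown on these slides can be sketched in a few lines of Python. This is a simplified illustration, not Varnish's ESI processor: it handles only self-closing `<esi:include src="..."/>` tags in a single pass, `resolve_esi` is a hypothetical helper, and every fragment body except `/top` is made up for the example.

```python
import re
from typing import Callable

# Matches self-closing include tags such as <esi:include src="/top"/>.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def resolve_esi(template: str, fetch: Callable[[str], str]) -> str:
    """Replace each ESI include tag with the body returned by fetch(src)."""
    return ESI_INCLUDE.sub(lambda match: fetch(match.group(1)), template)


# Stand-in for the backend (or the proxy's cache) answering GET requests.
FRAGMENTS = {
    "/top": '<div id="top-part"><a href="/login">Login</a></div>',
    "/nav": "<ul><li>Home</li></ul>",             # hypothetical body
    "/latest-news": "<p>Latest news</p>",         # hypothetical body
    "/article/id/732": "<p>Article text</p>",     # hypothetical body
}

PAGE = (
    "<html>"
    '<esi:include src="/top"/>'
    '<esi:include src="/nav"/>'
    '<div id="something"><esi:include src="/latest-news"/></div>'
    '<esi:include src="/article/id/732"/>'
    "</html>"
)

if __name__ == "__main__":
    print(resolve_esi(PAGE, FRAGMENTS.__getitem__))
```

In a real reverse proxy each `fetch` is either a cache hit or a backend request, which is why the fragments can have independent lifetimes.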
Going to /page/id/732
<html>
<esi:include src="/top"/>
<esi:include src="/nav"/>
<div id="something">
<esi:include src="/latest-news"/>
</div>
<esi:include src="/article/id/732"/>
</html>
Article content page
<esi:include src="/article/732"/>
Varnish - ESI
<esi:include src="/top"/>
<esi:include src="/news"/>
<esi:include src="/nav"/>
Varnish - what can/can't be cached ?
Can :
Static pages
Images, js, css
Static parts of pages that don't change often (ESI)
Can't :
POST requests
Very large files (it's not a file server !)
Requests with Set-Cookie
User-specific content
ESI → no caching on user-specific content ?
Logged in as : Wim Godden
5 messages
TTL = 5min
TTL = 1h
TTL = 0s ?
Nginx
Web server
Reverse proxy
Lightweight, fast
13.46% of all Websites
Nginx
No threads, event-driven
Uses epoll / kqueue
Low memory footprint
10000 active connections = normal
ESI on Nginx
Logged in as : Wim Godden
5 messages
News
Menu
SLIC on Nginx (replacing ESI)
<slic:include key="news" src="/news" />
<slic:include key="menu" src="/menu" />
<slic:include key="top" src="/top" session="true" />
Requesting /page (1st time)
[Diagram: /page request flow through Nginx and shared memory, steps 1-4]
Requesting /page SLIC subrequests (1st time)
[Diagram: Nginx subrequests for /menu, /news and /top (in SLIC session), steps 1-3]
Requesting /page (next time)
[Diagram: /page request flow through Nginx and shared memory, steps 1-2, with /page, /menu, /news and /top (in SLIC session)]
New message is sent...
POST /send
DB
insert into...
set(...)
top (in SLIC session)
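The write path on this slide (insert into the database, then set the cache entry) could look roughly like the sketch below. It is written in Python rather than the talk's PHP, uses an in-memory SQLite database and a plain dict as a stand-in for Memcached, and the `messages` table, the `top_<session id>` key format and the `send_message` function are all assumptions made for illustration.

```python
import sqlite3
from typing import Dict


def send_message(db: sqlite3.Connection, cache: Dict[str, str],
                 recipient_session_id: str, recipient_id: int, text: str) -> None:
    """Store the message, then push the refreshed 'top' block into the cache."""
    # 1. Push to DB, as the application already did before.
    db.execute(
        "INSERT INTO messages (recipient_id, body) VALUES (?, ?)",
        (recipient_id, text),
    )
    db.commit()

    # 2. Push to cache: rebuild the user-specific block right away, so the
    #    recipient's next page view needs no request to the web server.
    (count,) = db.execute(
        "SELECT COUNT(*) FROM messages WHERE recipient_id = ?",
        (recipient_id,),
    ).fetchone()
    cache["top_" + recipient_session_id] = f"<p>You have {count} messages</p>"


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE messages (recipient_id INTEGER, body TEXT)")
    cache: Dict[str, str] = {}
    send_message(db, cache, "jpsidc1po35sq9q3og4f3hi6e2", 432, "Hello")
    print(cache)
```

The point of the design is that the cache entry is replaced at write time, so it never needs a TTL to become fresh again.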
Advantages
No repeated GET hits to webserver anymore !
At login : POST → warm up the cache !
No repeated hits for user-specific content
Not even for non-specific content
News added
addnews() method
DB
insert into...
set(...)
Memcache key /news
Advantages
No repeated GET hits to webserver anymore !
At login : POST → warm up the cache !
No repeated hits for user-specific content
Not even for non-specific content
No TTLs for non-specific content
How many Memcached requests ?
Logged in as : Wim Godden
5 messages
<slic:include key="news" src="/news" />
<slic:include key="menu" src="/menu" />
<slic:include key="top" src="/top" session="true" />
First release : ESI
Part of the ESI 1.0 spec
Only relevant features implemented
Extension for dynamic session support
But : unavailable for copyright reasons
Rebuilt from scratch : SLIC
Control structures : if/else, switch/case, foreach
Variable handling
Strings : concatenation, substring, …
Exception handling, header manipulation, …
JSON support !
SLIC code samples
You are logged in as : <slic:session_var("person_name") />
You are logged in as : <@s("person_name") />
SLIC code samples
<slic:switch var="session_var('isAdmin')">
<slic:case value="1">
<slic:include key="admin-buttons" src="/admin-buttons.php" />
</slic:case>
<slic:default>
<div id="just-a-user">
<slic:include key="user-buttons" src="/user-buttons.php" />
</div>
</slic:default>
</slic:switch>
SLIC code samples
<slic:foreach item="messageId" src="global_var('thread' + query_var('threadId'))">
<slic:include key="'thread-message_' + messageId"
src="'/thread/message.php?id=' + messageId" />
</slic:foreach>
Approaches – full block
<p id="LoggedInAs">
You are logged in as : <slic:session_var("person_name") />
</p>
<p id="MessageCount">
You have 5 messages
</p>
Logged in as : Wim Godden
5 messages
<slic:include key="top" src="/top" session="true" />
Approaches – individual variables
<p id="LoggedInAs">
You are logged in as : <slic:session_var("person_name") />
</p>
<p id="MessageCount">
You have <slic:session_var("messages") /> messages
</p>
Logged in as : Wim Godden
5 messages
<slic:include key="top" src="/top" session="true" />
Approaches – JSON
<p id="LoggedInAs">
You are logged in as : <slic:session_var("userData").person_name />
</p>
<p id="MessageCount">
You have <slic:session_var("userData").message_count /> messages
</p>
Logged in as : Wim Godden
5 messages
<slic:include key="top" src="/top" session="true" />
Identifying the user
In Nginx configuration :
slic_session_cookie <name> → Defined by language (or configurable)
slic_session_identifier <string> → Defined by you
Example for PHP :
slic_session_cookie PHPSESSID
slic_session_identifier UID
Identifying the user
Nginx + SLIC
Cookie : PHPSESSID = jpsidc1po35sq9q3og4f3hi6e2
get UID_jpsidc1po35sq9q3og4f3hi6e2 → 432
Retrieving user specific content
Nginx + SLIC
Cookie : PHPSESSID = jpsidc1po35sq9q3og4f3hi6e2
get userData_432
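The two-step lookup on these slides (session cookie to user id, then user id to user data) can be sketched as follows. This is a Python illustration of the key scheme only, not the Nginx module itself: a dict stands in for Memcached, `user_data_for_request` is a hypothetical helper, and the JSON payload is invented sample data.

```python
import json
from typing import Dict, Optional


def user_data_for_request(cookies: Dict[str, str], cache: Dict[str, str],
                          session_cookie: str = "PHPSESSID",
                          session_identifier: str = "UID") -> Optional[dict]:
    """Resolve the session cookie to the user's cached data, or None."""
    session_id = cookies.get(session_cookie)
    if session_id is None:
        return None  # no session cookie: anonymous visitor

    # Step 1: get UID_<session id> -> user id
    user_id = cache.get(f"{session_identifier}_{session_id}")
    if user_id is None:
        return None  # unknown or expired session

    # Step 2: get userData_<user id> -> JSON document for that user
    raw = cache.get(f"userData_{user_id}")
    return json.loads(raw) if raw is not None else None


if __name__ == "__main__":
    cache = {
        "UID_jpsidc1po35sq9q3og4f3hi6e2": "432",
        # Sample payload, made up for this example.
        "userData_432": json.dumps({"person_name": "Wim Godden", "message_count": 5}),
    }
    cookies = {"PHPSESSID": "jpsidc1po35sq9q3og4f3hi6e2"}
    print(user_data_for_request(cookies, cache))
```

Keeping the session-to-user mapping separate from the user data means the data is stored once per user, however many sessions that user has open.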
Why Nginx ?
Native Memcached support
Excellent and superfast subrequest system
Including parallel subrequests
Handles thousands of connections per worker
With minimal memory footprint
Integrates with php-fpm
Additional features (chroot, slow request log, offline processing, ...)
Graceful rolling upgrades
What's the result ?
Figures
2nd customer :
No. of web servers : 72 → 8
No. of db servers : 15 → 4
Total : 87 → 12 (86% reduction !)
Last customer :
No. of total servers : +/- 1350
Expected reduction : 1350 → 380
Expected savings : €1.5 Million per year
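As a quick check of the figures above, the reduction percentages can be recomputed from the server counts. The helper below is only a convenience for this arithmetic; `reduction_percent` is not something from the talk.

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage of servers removed when going from `before` to `after`."""
    return (before - after) / before * 100


if __name__ == "__main__":
    # 2nd customer
    print(round(reduction_percent(72, 8), 1))      # web servers: 88.9
    print(round(reduction_percent(15, 4), 1))      # db servers: 73.3
    print(round(reduction_percent(87, 12), 1))     # total: 86.2
    # Last customer (expected)
    print(round(reduction_percent(1350, 380), 1))  # 71.9
```

The total for the 2nd customer matches the 86% quoted on the slide; the expected reduction for the last customer works out to roughly 72%.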
Why is it so much faster ?
A real example : vBulletin
Post
isModerator session variable
isAdmin session variable
A real example : vBulletin
[Chart: DB server load, web server load and max requests/sec (1 = 282), on a 0-35 scale, compared for three setups: standard install, with Memcached, Nginx + SCL + memcached]
Code changes
Required
Template conversion
Push-to-DB → Push-to-DB + Push-to-Cache
Choice :
If user is logged in → push updates to cache
If user is not logged in → warm up cache on login
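The choice described above might be wired into an application roughly like this. It is a sketch in Python rather than PHP, with a dict standing in for Memcached; `on_login`, `on_user_data_changed` and the `load_user_data` callback are hypothetical names, and the key formats reuse the `UID_` and `userData_` prefixes from the earlier slides.

```python
import json
from typing import Callable, Dict


def on_login(cache: Dict[str, str], session_id: str, user_id: int,
             load_user_data: Callable[[int], dict]) -> None:
    """Warm up the cache when a user logs in."""
    cache[f"UID_{session_id}"] = str(user_id)
    cache[f"userData_{user_id}"] = json.dumps(load_user_data(user_id))


def on_user_data_changed(cache: Dict[str, str], user_id: int, is_logged_in: bool,
                         load_user_data: Callable[[int], dict]) -> None:
    """Push an update to the cache, but only for users who are logged in."""
    if is_logged_in:
        cache[f"userData_{user_id}"] = json.dumps(load_user_data(user_id))
    # Not logged in: do nothing here; on_login() warms the cache later.


if __name__ == "__main__":
    store = {432: {"person_name": "Wim Godden", "message_count": 5}}
    cache: Dict[str, str] = {}
    on_login(cache, "jpsidc1po35sq9q3og4f3hi6e2", 432, store.__getitem__)
    store[432]["message_count"] = 6
    on_user_data_changed(cache, 432, True, store.__getitem__)
    print(cache["userData_432"])
```

Skipping the push for logged-out users keeps the cache from filling up with data nobody is currently requesting.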
Availability
Good news :
It will become Open Source
It's solid : ESI version stable at 4 customers
PHP version (currently for dev, later for docs and learning)
Bad news :
First customer holds copyrights
Total rebuild
→ Open Source release
No current projects, so spare time
Anyone feel like sponsoring ?
Beta : Aug 14 (?)
Stable : Oct 14 (?) - on Github
So...
Questions ?
Questions ?
Contact
Twitter @wimgtr
Web http://techblog.wimgodden.be
Slides http://www.slideshare.net/wimg
E-mail wim.godden@cu.be
