90% of the page loading time is spent on retrieving CSS, JavaScript and images. There are many techniques to reduce the page loading time; using a CDN is one of the most effective. Currently it's expensive to integrate with a CDN (especially if you want to avoid vendor lock-in) and it's hard to serve file A from a CDN, file B from a static file server and file C from neither. It's also impossible to process a file (e.g. compress JavaScript, optimize images, transcode videos …) before it is synced to the CDN.
This session will explain how a CDN (Content Delivery Network) improves page loading times and how you should continuously analyze the page loading performance of your web site. Older techniques for integrating a CDN with Drupal will be compared and my new, alternative, comprehensive solution will be presented.
All the research and work performed for this presentation was part of my bachelor thesis at Hasselt University, which got the highest score possible: 19/20 (they never hand out a perfect score). It was labeled as being equivalent to a master thesis, so you can at least expect the concept to be solid.
For the web site that served as the test case of my bachelor thesis, which had fewer than 5 images per page, the results were dramatic: the time required to load CSS, JS and images was reduced to less than half the original time, globally (visitors from more than 150 countries)! A combination of a North American CDN and a static file server in Belgium was used to achieve this.
The average web site today has 50 images per page, so it's likely your results will be even more impressive.
Agenda
What is page loading performance?
How a CDN improves page loading times
Continuous page loading performance analysis through Episodes
The CDN integration module
The File Conveyor daemon to process & sync files with unlimited flexibility
Pressflow
CDN integration module 2.0
ESI (Edge Side Includes)
Master thesis goals
32. Daemon’s capabilities: scenario 1
Company wants to switch from CDN provider X to Amazon S3.
- CDN X: FTP
- Amazon S3: custom protocol
• Setup:
• CDN X → Amazon S3
• File Conveyor
• Alternative: write a lot of code
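To illustrate why a transport abstraction removes most of the switching cost in this scenario, here is a minimal sketch. It is not File Conveyor's actual API: the `Transporter` interface, the `InMemoryTransporter` stand-in (used in place of real FTP and Amazon S3 backends) and the example.com host names are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict


class Transporter(ABC):
    """Hypothetical interface: one method to push a file to a destination."""

    @abstractmethod
    def upload(self, local_path: str, data: bytes) -> str:
        """Store the file and return the URL it will be served from."""


class InMemoryTransporter(Transporter):
    """Stand-in for a real FTP or S3 backend; keeps files in a dict."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url.rstrip("/")
        self.stored: Dict[str, bytes] = {}

    def upload(self, local_path: str, data: bytes) -> str:
        key = local_path.lstrip("/")
        self.stored[key] = data
        return f"{self.base_url}/{key}"


def sync(files: Dict[str, bytes], transporter: Transporter) -> Dict[str, str]:
    """Push every file through the transporter; map local path -> served URL."""
    return {path: transporter.upload(path, data) for path, data in files.items()}


files = {"/sites/default/files/logo.png": b"\x89PNG"}
cdn_x = InMemoryTransporter("http://cdn-x.example.com")   # pretend: FTP
amazon_s3 = InMemoryTransporter("http://s3.example.com")  # pretend: Amazon S3

before = sync(files, cdn_x)
after = sync(files, amazon_s3)  # switching provider means swapping one object
```

The calling code (`sync`) does not change when the provider changes; only the object passed in does. That is the property the slide contrasts with "write a lot of code".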
35. Daemon’s capabilities: scenario 2
U.S. company expands to South Korea
• Setup:
• North American CDN
• Static file server in South Korea
• File Conveyor + language-/subdomain-based logic to pick CDN/server
• Alternative: global CDN or slower web site in South Korea
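The language-/subdomain-based logic in this setup could look roughly like the sketch below. The host names, the `kr.` subdomain convention, the `ko` language code check and both function names are assumptions for illustration; the talk does not show the actual rule.

```python
from urllib.parse import urlsplit

# Hypothetical hosts; neither appears in the talk.
NORTH_AMERICAN_CDN = "http://cdn.example.com"
SOUTH_KOREAN_SERVER = "http://static-kr.example.com"


def pick_file_server(page_url: str, language: str = "") -> str:
    """Return the base URL that static files should be served from.

    Assumed rule: Korean-language pages, or pages on a "kr." subdomain,
    use the static file server in South Korea; everything else uses the CDN.
    """
    host = urlsplit(page_url).hostname or ""
    if language == "ko" or host.startswith("kr."):
        return SOUTH_KOREAN_SERVER
    return NORTH_AMERICAN_CDN


def file_url(page_url: str, file_path: str, language: str = "") -> str:
    """Rewrite a site-relative file path to the chosen server."""
    return pick_file_server(page_url, language) + "/" + file_path.lstrip("/")
```

For example, `file_url("http://kr.example.com/node/1", "/files/logo.png")` resolves to the South Korean server, while the same file requested from `http://www.example.com/node/1` resolves to the CDN.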
47. Bachelor thesis: conclusion
1. Speed of web site very important
2. Episodes → monitor page loading performance
3. CDN integration module → best method to integrate Drupal with a CDN
4. File Conveyor → flexible CDN integration
52. What else exists …
High-traffic Drupal sites:
“Pressflow is a distribution of Drupal with integrated performance,
scalability, availability, and testing enhancements.”
75. ESI
• ESI module: http://drupal.org/project/esi
• In development
• developers are actively participating
• David Strauss (Four Kitchens)
• Josh Koenig (Chapter Three)
• Note: not every CDN supports ESI!
84. Master thesis
• An application that can automatically extract conclusions out of Episodes logs
and visualize them.
• e.g.:
• “http://d.o/ is slow in Belgium, for users of the ISP Telenet”
• “http://d.o/project/esi has slowly loading CSS”
• “http://d.o/project/cdn has slowly loading JS for visitors that use the browser
Internet Explorer 6 or 7”
• Challenges: complex data mining, distributed computing, storage, performance
See http://wimleers.com/tags/master-thesis
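As a rough idea of what "extracting conclusions" from Episodes logs might involve, the sketch below groups measurements and flags groups that are much slower than the overall median. The record layout, the sample values, the ISP name "OtherISP" and the factor-of-two threshold are all made up for illustration; the real Episodes log format and the thesis's actual method are not shown in the talk.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (url, country, isp, episode, duration in ms).
records = [
    ("http://d.o/", "BE", "Telenet", "css", 900),
    ("http://d.o/", "BE", "Telenet", "css", 1100),
    ("http://d.o/", "BE", "OtherISP", "css", 300),
    ("http://d.o/", "US", "OtherISP", "css", 250),
    ("http://d.o/", "US", "OtherISP", "css", 350),
]


def slow_groups(rows, factor=2.0):
    """Return the (url, country, isp, episode) groups whose median duration
    is at least `factor` times the median over all rows."""
    overall = median(duration for *_, duration in rows)
    groups = defaultdict(list)
    for url, country, isp, episode, duration in rows:
        groups[(url, country, isp, episode)].append(duration)
    return {
        key: median(values)
        for key, values in groups.items()
        if median(values) >= factor * overall
    }
```

With the sample rows above, only the Belgium/Telenet group is flagged, which mirrors the kind of statement the slide gives as an example. A real implementation would also have to deal with the challenges listed on the slide, such as data volume and storage.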
90. Conclusion
1. Speed of web site very important
2. Episodes → monitor page loading performance
3. CDN integration module → best method to integrate Drupal with a CDN
4. File Conveyor → flexible CDN integration
5. CDN integration module 2 → more awesome!
6. ESI → big potential
7. Master thesis → more insight
This talk will be available at http://wimleers.com/talk/fosdem-2010
91. Need Drupal page loading performance help?
Contact me: http://wimleers.com/contact
Editor's Notes
A content delivery network (CDN) is a collection of webservers that are distributed across multiple locations, to deliver data more efficiently to the end user. Typically, the server with the shortest distance to the end user is selected.
- geographical spread: more spread is better, but more expensive
- lock-in:
* push: supporting all transfer protocols is too hard
* pull: not every CDN supports pulling (and unlikely that the same features are supported)
- Best of approximately 10 evaluated tools and commercial services
Result: no switching cost → no lock-in!
Result: faster web site, same setup cost, lower maintenance cost
- own web site
- > 100,000 visitors per month, > 150 countries
- makes it a good test case to show the effect of CDN integration
- far fewer images than the average web site, thus fewer HTTP requests, thus harder to show the difference
- Europe + Russia: server in Belgium, rest of the world: CDN with servers in North America
- uses ip2country module to determine country of user
- information in the screenshots generated by the Episodes Server module
- > 2.7 million episodes measured over > 260k page views
- before June 21: large variability, especially globally
- starting June 21: CDN integration active
- starting June 22: CSS & JS of the Delicious widget also served through the CDN integration
- very clear downward trend starting June 21, even stronger starting June 22 (partially thanks to caching in the browser)