Drupal's page loading performance in 2010

90% of the page loading time is spent on retrieving CSS, JavaScript and images. There are many techniques to reduce the page loading time, and using a CDN is one of the most effective. Currently it's expensive to integrate with a CDN (especially if you want to avoid vendor lock-in) and it's hard to serve file A from a CDN, file B from a static file server and file C from neither. It's also impossible to process files (e.g. compress JavaScript, optimize images, transcode videos …) before they are synced to the CDN.
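
To make the "file A from a CDN, file B from a static file server, file C from neither" idea concrete, here is a minimal Python sketch. The host names, the rule table and the function name are hypothetical; this is not how the CDN integration module is implemented, only an illustration of per-file server selection.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: the first matching pattern wins.
# A base of None means "do not rewrite, serve from the web server itself".
RULES = [
    ("*.css", "http://cdn.example.com"),
    ("*.js", "http://cdn.example.com"),
    ("*.png", "http://static.example.com"),
    ("*.php", None),
]


def file_url(path, rules=RULES):
    """Return the URL to emit for a file path.

    Paths that match no rule, or a rule with base None, are returned
    unchanged, i.e. they are served from neither the CDN nor the
    static file server.
    """
    for pattern, base in rules:
        if fnmatchcase(path, pattern):
            return path if base is None else base + path
    return path
```

Switching to another CDN would then only mean changing the base URLs in the rule table, which is the "no lock-in" property the talk aims for.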

This session will explain how a CDN (Content Delivery Network) improves page loading times and how you should continuously analyze the page loading performance of your web site. Older techniques for integrating a CDN with Drupal will be compared and my new, alternative, comprehensive solution will be presented.
All the research and work performed for this presentation was part of my bachelor thesis at Hasselt University, which got the highest score possible: 19/20 (they never hand out a perfect score). It was labeled as being equivalent to a master thesis. So at least, you can expect the concept to be solid.

For the web site that served as the test case of my bachelor thesis, which had fewer than 5 images per page, the results were dramatic: the time required to load CSS, JS and images was reduced to less than half the original time, globally (visitors from more than 150 countries)! A combination of a North American CDN and a static file server in Belgium was used to achieve this.
The average web site today has 50 images per page, so it's likely your results will be even more impressive.
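
The slide notes state that visitors from Europe and Russia were served from the server in Belgium and everyone else from the North American CDN, based on the visitor's country. A minimal Python sketch of that decision could look as follows. The country list is deliberately incomplete and, like the host names, is an assumption made for illustration; the country code is assumed to have been resolved already (for example by an IP-to-country lookup).

```python
# Hypothetical, incomplete set of ISO 3166-1 alpha-2 country codes
# that should be served from the static file server in Belgium.
NEAR_BELGIUM = frozenset({"BE", "NL", "FR", "DE", "GB", "RU"})

STATIC_SERVER = "http://static.example.com"  # assumed host in Belgium
CDN = "http://cdn.example.com"               # assumed North American CDN


def base_url_for(country_code):
    """Pick the base URL for static files from the visitor's country code.

    Unknown or missing country codes fall back to the CDN.
    """
    if country_code and country_code.upper() in NEAR_BELGIUM:
        return STATIC_SERVER
    return CDN
```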

Agenda

What is page loading performance?
How a CDN improves page loading times
Continuous page loading performance analysis through Episodes
The CDN integration module
The File Conveyor daemon to process & sync files with unlimited flexibility
Pressflow
CDN integration module 2.0
ESI (Edge Side Includes)
Master thesis goals

Published in: Technology

  • A content delivery network (CDN) is a collection of webservers that are distributed across multiple locations, to deliver data more efficiently to the end user. Typically, the server with the shortest distance to the end user is selected.
  • - geographical spread: more spread is better, but more expensive
    - lock-in:
    * push: supporting all transfer protocols is too hard
    * pull: not every CDN supports pulling (and those that do are unlikely to support the same features)
  • - Best of roughly 10 evaluated tools and commercial services
  • Result: no switching cost → no lock-in!
  • Result: faster web site, same setup cost, lower maintenance cost
  • - own web site
    - > 100,000 visitors per month, from > 150 countries
    - makes it a good test case to show the effect of CDN integration
    - far fewer images than the average web site, thus fewer HTTP requests, thus harder to show the difference
    - Europe + Russia: server in Belgium; rest of the world: CDN with servers in North America
    - uses the ip2country module to determine the visitor's country
  • - information in the screenshots generated by the Episodes Server module
    - > 2.7 million episodes measured over > 260k page views
    - before June 21: large variability, especially globally
    - starting June 21: CDN integration active
    - starting June 22: CSS & JS of the Delicious widget also served through the CDN integration
    - very clear downward trend starting June 21, even stronger starting June 22 (partially thanks to caching in the browser)
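
The before/after comparison described in the notes above (a switch-over date such as June 21) boils down to splitting the measured episodes at a cutoff date and comparing averages. The following Python sketch assumes a hypothetical record format of `(day, duration_ms)` tuples; it does not reflect how the Episodes Server module stores or charts its data.

```python
from datetime import date
from statistics import mean


def mean_before_after(episodes, cutoff):
    """Split episode durations at a cutoff date and average each side.

    episodes: iterable of (day, duration_ms) tuples, with day a datetime.date.
    Returns (mean_before_cutoff, mean_from_cutoff_onwards); a side with
    no measurements yields None.
    """
    before = []
    after = []
    for day, duration_ms in episodes:
        (before if day < cutoff else after).append(duration_ms)
    return (
        mean(before) if before else None,
        mean(after) if after else None,
    )
```

Called with the episode log and the date on which CDN integration was enabled, this gives a single pair of numbers to compare. The charts in the talk show the full trend over time instead.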
  • Drupal's page loading performance in 2010

    1. Drupal’s page loading performance in 2010 Wim Leers ~ http://wimleers.com/ Drupal.org, IRC, Twitter, LinkedIn: wimleers
    2. Bachelor thesis “Improving Drupal’s page loading performance” • Promotor: Prof. dr. Wim Lamotte • Co-promotor: dr. Peter Quax • Mentors: Stijn Agten & Maarten Wijnants
    3. Goal • Faster web sites • Speed = satisfaction = more & happier visitors = more revenue Source: http://www.slideshare.net/stubbornella/designing-fast-websites-presentation, Nicole Sullivan, Yahoo!
    4. Goal • Faster web sites • Speed = satisfaction = more & happier visitors = more revenue • Google: +0.5s → -20% searches Source: http://www.slideshare.net/stubbornella/designing-fast-websites-presentation, Nicole Sullivan, Yahoo!
    5. Terminology: page loading performance 90% 10% CSS, JS, images … HTML
    6. Drupal’s page loading performance • One of the most effective measures: “Use a CDN”
    7. Drupal’s page loading performance • One of the most effective measures: “Use a CDN” • Drupal: was not yet possible, but is possible now! :)
    8. Terminology: CDN
    9. Key properties of a CDN
    10. Key properties of a CDN • Geographical spread
    11. Key properties of a CDN • Geographical spread • Pull versus Push (Pull: transfer protocol: automatic; + virtually no setup; – no flexibility)
    12. Key properties of a CDN • Geographical spread • Pull versus Push (Pull: transfer protocol: automatic; + virtually no setup; – no flexibility. Push: transfer protocol: FTP, SFTP, rsync, WebDAV, Amazon S3 …; + flexibility; – setup)
    13. Key properties of a CDN • Geographical spread • Pull versus Push (Pull: transfer protocol: automatic; + virtually no setup; – no flexibility. Push: transfer protocol: FTP, SFTP, rsync, WebDAV, Amazon S3 …; + flexibility; – setup) • Lock-in
    14. Profiling: Episodes
    15. Profiling: Episodes • Measures “episodes” during page loading
    16. Profiling: Episodes • Measures “episodes” during page loading • Real measurements: JS in browser, for each visitor • No simulation!
    17. Episodes module
    18. Episodes module • Drupal module that offers Episodes integration
    19. Episodes module • Drupal module that offers Episodes integration
    20. Episodes Server module
    21. Episodes Server module • Drupal module that visualizes the collected measurements
    22. Daemon: File Conveyor
    23. Daemon: File Conveyor 1. Configuration: simple XML file
    24. Daemon: File Conveyor 1. Configuration: simple XML file 2. Detection: instantaneous
    25. Daemon: File Conveyor 1. Configuration: simple XML file 2. Detection: instantaneous 3. Processing: store image more efficiently … — extensible!
    26. Daemon: File Conveyor 1. Configuration: simple XML file 2. Detection: instantaneous 3. Processing: store image more efficiently … — extensible! 4. Syncing: supports many protocols (FTP, Amazon S3 …) — extensible!
    27. Daemon: File Conveyor 1. Configuration: simple XML file 2. Detection: instantaneous 3. Processing: store image more efficiently … — extensible! 4. Syncing: supports many protocols (FTP, Amazon S3 …) — extensible! 5. Result: SQLite DB with CDN URLs
    28. Daemon: File Conveyor 1. Configuration: simple XML file 2. Detection: instantaneous 3. Processing: store image more efficiently … — extensible! 4. Syncing: supports many protocols (FTP, Amazon S3 …) — extensible! 5. Result: SQLite DB with CDN URLs Project homepage: http://fileconveyor.org/ — code on GitHub
    29. Demo
    30. Daemon’s capabilities: scenario 1 Company wants to switch from CDN provider X to Amazon S3. - CDN X: FTP - Amazon S3: custom protocol
    31. Daemon’s capabilities: scenario 1 Company wants to switch from CDN provider X to Amazon S3. - CDN X: FTP - Amazon S3: custom protocol • Setup: • CDN X → Amazon S3 • File Conveyor
    32. Daemon’s capabilities: scenario 1 Company wants to switch from CDN provider X to Amazon S3. - CDN X: FTP - Amazon S3: custom protocol • Setup: • CDN X → Amazon S3 • File Conveyor • Alternative: write a lot of code
    33. Daemon’s capabilities: scenario 2 U.S. company expands to South Korea
    34. Daemon’s capabilities: scenario 2 U.S. company expands to South Korea • Setup: • North American CDN • Static file server in South Korea • File Conveyor + language-/subdomain-based logic to pick CDN/server
    35. Daemon’s capabilities: scenario 2 U.S. company expands to South Korea • Setup: • North American CDN • Static file server in South Korea • File Conveyor + language-/subdomain-based logic to pick CDN/server • Alternative: global CDN or slower web site in South Korea
    36. CDN integration module
    37. CDN integration module • Drupal core patch for Drupal 6, committed to Drupal 7 & 6
    38. CDN integration module • Drupal core patch for Drupal 6, committed to Drupal 7 & 6 • Basic mode: Origin Pull CDN
    39. CDN integration module • Drupal core patch for Drupal 6, committed to Drupal 7 & 6 • Basic mode: Origin Pull CDN • Advanced mode: any CDN (File Conveyor)
    40. Test case
    41. Test case
    42. Test case
    43. Bachelor thesis: conclusion
    44. Bachelor thesis: conclusion 1. Speed of web site very important
    45. Bachelor thesis: conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance
    46. Bachelor thesis: conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance 3. CDN integration module → best method to integrate Drupal with a CDN
    47. Bachelor thesis: conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance 3. CDN integration module → best method to integrate Drupal with a CDN 4. File Conveyor → flexible CDN integration
    48. It doesn’t end here!
    49. What else exists …
    50. What else exists … High-traffic Drupal sites:
    51. What else exists … High-traffic Drupal sites:
    52. What else exists … High-traffic Drupal sites: “Pressflow is a distribution of Drupal with integrated performance, scalability, availability, and testing enhancements.”
    53. What else is coming …
    54. What else is coming … 1. CDN integration 2
    55. What else is coming … 1. CDN integration 2 2. ESI
    56. What else is coming … 1. CDN integration 2 2. ESI 3. Master thesis
    57. CDN integration 2 Goals
    58. CDN integration 2 Goals 1. Significant expansion of functionality
    59. CDN integration 2 Goals 1. Significant expansion of functionality (obsoleting modules: Parallel & Simple CDN; new mode: Media Mover)
    60. CDN integration 2 Goals 1. Significant expansion of functionality (obsoleting modules: Parallel & Simple CDN; new mode: Media Mover) 2. Awesome UX
    61. CDN integration 2 Goals 1. Significant expansion of functionality (obsoleting modules: Parallel & Simple CDN; new mode: Media Mover) 2. Awesome UX (Bojhan Somers)
    62. CDN integration 2 Goals 1. Significant expansion of functionality (obsoleting modules: Parallel & Simple CDN; new mode: Media Mover) 2. Awesome UX (Bojhan Somers) 3. Excellent docs + high-quality screencasts
    63. CDN integration 2 Goals 1. Significant expansion of functionality (obsoleting modules: Parallel & Simple CDN; new mode: Media Mover) 2. Awesome UX (Bojhan Somers) 3. Excellent docs + high-quality screencasts 4. Streamlined CDN-specific UI
    64. CDN integration 2 Goals 1. Significant expansion of functionality (obsoleting modules: Parallel & Simple CDN; new mode: Media Mover) 2. Awesome UX (Bojhan Somers) 3. Excellent docs + high-quality screencasts 4. Streamlined CDN-specific UI
    65. CDN integration 2
    66. ESI Source: http://www.trygve-lie.com/blog/entry/esi_explained_simple
    67. ESI Source: http://www.trygve-lie.com/blog/entry/esi_explained_simple
    68. ESI Based on: http://www.trygve-lie.com/blog/entry/esi_explained_simple
    69. ESI
    70. ESI • ESI module: http://drupal.org/project/esi
    71. ESI • ESI module: http://drupal.org/project/esi • In development
    72. ESI • ESI module: http://drupal.org/project/esi • In development • developers are actively participating
    73. ESI • ESI module: http://drupal.org/project/esi • In development • developers are actively participating • David Strauss (Four Kitchens)
    74. ESI • ESI module: http://drupal.org/project/esi • In development • developers are actively participating • David Strauss (Four Kitchens) • Josh Koenig (Chapter Three)
    75. ESI • ESI module: http://drupal.org/project/esi • In development • developers are actively participating • David Strauss (Four Kitchens) • Josh Koenig (Chapter Three) • Note: not every CDN supports ESI!
    76. Master thesis “Web Performance Optimization: Analytics” • Promotor: Prof. dr. Jan van den Bussche
    77. Master thesis
    78. Master thesis • An application that can automatically extract conclusions out of Episodes logs and visualize them.
    79. Master thesis • An application that can automatically extract conclusions out of Episodes logs and visualize them. • e.g.:
    80. Master thesis • An application that can automatically extract conclusions out of Episodes logs and visualize them. • e.g.: • “http://d.o/ is slow in Belgium, for users of the ISP Telenet”
    81. Master thesis • An application that can automatically extract conclusions out of Episodes logs and visualize them. • e.g.: • “http://d.o/ is slow in Belgium, for users of the ISP Telenet” • “http://d.o/project/esi has slowly loading CSS”
    82. Master thesis • An application that can automatically extract conclusions out of Episodes logs and visualize them. • e.g.: • “http://d.o/ is slow in Belgium, for users of the ISP Telenet” • “http://d.o/project/esi has slowly loading CSS” • “http://d.o/project/cdn has slowly loading JS for visitors that use the browser Internet Explorer 6 or 7”
    83. Master thesis • An application that can automatically extract conclusions out of Episodes logs and visualize them. • e.g.: • “http://d.o/ is slow in Belgium, for users of the ISP Telenet” • “http://d.o/project/esi has slowly loading CSS” • “http://d.o/project/cdn has slowly loading JS for visitors that use the browser Internet Explorer 6 or 7” • Challenges: complex data mining, distributed computing, storage, performance
    84. Master thesis • An application that can automatically extract conclusions out of Episodes logs and visualize them. • e.g.: • “http://d.o/ is slow in Belgium, for users of the ISP Telenet” • “http://d.o/project/esi has slowly loading CSS” • “http://d.o/project/cdn has slowly loading JS for visitors that use the browser Internet Explorer 6 or 7” • Challenges: complex data mining, distributed computing, storage, performance See http://wimleers.com/tags/master-thesis
    85. Conclusion
    86. Conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance 3. CDN integration module → best method to integrate Drupal with a CDN 4. File Conveyor → flexible CDN integration
    87. Conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance 3. CDN integration module → best method to integrate Drupal with a CDN 4. File Conveyor → flexible CDN integration 5. CDN integration module 2 → more awesome!
    88. Conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance 3. CDN integration module → best method to integrate Drupal with a CDN 4. File Conveyor → flexible CDN integration 5. CDN integration module 2 → more awesome! 6. ESI → big potential
    89. Conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance 3. CDN integration module → best method to integrate Drupal with a CDN 4. File Conveyor → flexible CDN integration 5. CDN integration module 2 → more awesome! 6. ESI → big potential 7. Master thesis → more insight
    90. Conclusion 1. Speed of web site very important 2. Episodes → monitor page loading performance 3. CDN integration module → best method to integrate Drupal with a CDN 4. File Conveyor → flexible CDN integration 5. CDN integration module 2 → more awesome! 6. ESI → big potential 7. Master thesis → more insight This talk will be available at http://wimleers.com/talk/fosdem-2010
    91. Need Drupal page loading performance help? Contact me: http://wimleers.com/contact This talk will be available at http://wimleers.com/talk/fosdem-2010
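
The File Conveyor slides above end with "Result: SQLite DB with CDN URLs". A consumer of that database, such as a CDN integration layer, then only has to look up the URL under which a local file was synced. The Python sketch below shows such a lookup. The table and column names are assumptions made for illustration and are not taken from File Conveyor's actual schema; the demo database exists only to make the example self-contained.

```python
import sqlite3


def cdn_url(conn, local_path):
    """Return the synced URL for a local file, or None if it was not synced.

    Assumes a hypothetical table synced_files(input_file, url).
    """
    row = conn.execute(
        "SELECT url FROM synced_files WHERE input_file = ?",
        (local_path,),
    ).fetchone()
    return row[0] if row else None


def demo_db():
    """Build an in-memory database with one made-up entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE synced_files (input_file TEXT PRIMARY KEY, url TEXT)"
    )
    conn.execute(
        "INSERT INTO synced_files VALUES (?, ?)",
        ("/var/www/logo.png", "http://cdn.example.com/logo.png"),
    )
    conn.commit()
    return conn
```

When the lookup returns None, the caller can fall back to the original URL, so files that have not been synced yet are still served by the web server itself.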
