Nick Santamaria's performance and scalability presentation from DrupalSouth 2015.
https://melbourne2015.drupal.org.au/session/performance-not-afterthought
5. Performance & Scalability
Performance
The speed with which a single request can be executed.
Scalability
The ability of a request to maintain its performance under increasing load.
6. What is Performance?
Back-end Performance Components
• PHP
• Amount of code being executed (i.e. number of modules)
• Efficiency of code
• Database
• Schema design
• Query execution time
7. What is Performance?
Back-end Performance Components
• API Requests
• PHP will wait until the request returns a result or times out
• Caching
• Drupal database
• Memcached / Redis / MongoDB
• Varnish
8. What is Performance?
Front-end Performance Components
• Network Overhead
• Local vs offshore datacenters
• Number of requests
• Payload Size
• Image optimisation
• CSS / JS Minification
• Markup size & compression
9. What is Performance?
Front-end Performance Components
• Javascript
• Number of scripts being included
• Synchronous vs asynchronous execution
• Code efficiency
10. What is Scalability?
“Why is scalability so hard? Because scalability cannot be an after-thought.”
- Werner Vogels, Amazon CTO
11. What is Scalability?
A system is said to be scalable if adding resources results
in proportionally increased performance.
9 women cannot make a baby in 1 month.
Will doubling your site’s server resources double the traffic it can handle?
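One way to answer that question is to compare load-test throughput before and after adding resources. The sketch below is only an illustration: the function name and the request-per-second figures are hypothetical, not results from the presentation. A value of 1.0 means the site scaled proportionally; anything lower means part of the added capacity was lost to a bottleneck elsewhere.

```python
def scaling_efficiency(base_servers, base_rps, scaled_servers, scaled_rps):
    """Return the fraction of the ideal (proportional) throughput gain achieved."""
    resource_ratio = scaled_servers / base_servers
    throughput_ratio = scaled_rps / base_rps
    return throughput_ratio / resource_ratio


# Hypothetical load-test results: requests per second before and after
# doubling the number of web servers.
efficiency = scaling_efficiency(
    base_servers=2, base_rps=400, scaled_servers=4, scaled_rps=520
)
print(f"Scaling efficiency: {efficiency:.0%}")
```

With these made-up numbers, doubling the servers raised throughput by only 30%, which suggests a shared component such as the database is the limiting factor.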
12. What is Scalability?
Scalability Components
• Caching
• Block cache
• Page cache
• Reverse proxy cache
• Opcode caching
• Infrastructure
• Web server load balancing
• Database clustering
• Caching backends - Redis, Memcached etc.
14. Common Problems
Too many modules - AKA “Open Buffet Syndrome”
Real life example
• 365 enabled modules
• 24 core modules
• 51 custom modules
• 72 exported features
• 750 files loaded on every request.
• 10 - 20% of PHP execution time was loading files, even with APC.
• CPU cycles wasted - 25,000+ calls to module_implements() per request.
15. Common Problems
Anonymous users with sessions
Seems innocent, but this one line has consequences.
• Pages with product/* paths are NEVER cached.
• Anonymous users who visit this page bypass page cache on all subsequent pages.
• … AND those visitors write to the database on every subsequent page view.
17. Strategies for Success
Complicated entity & field architecture
• How many INSERT queries per save?
• node
• node_revision
• field_collection_item
• field_collection_item_revision
• field_data_field_collection_b
• field_revision_field_collection_b
• field_data_field_taxonomy_ref
• field_revision_field_taxonomy_ref
• field_data_field_collection_c
• field_revision_field_collection_c
• field_data_field_text
• field_revision_field_text
• file_managed
• field_data_field_media
• field_revision_field_media
Real world field collection implementations are
FAR more complicated than this example!
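The count above can be reproduced with a small sketch. It relies on Drupal 7's default SQL field storage, which keeps each field in a field_data_* table and a field_revision_* table. The grouping into base tables and fields is taken from the list on this slide; the function name is hypothetical. The result is a lower bound, since multi-value fields write one row per value.

```python
def tables_written(base_tables, field_names):
    """List the tables that receive at least one row when the content is saved."""
    tables = list(base_tables)
    for name in field_names:
        # Each field stores its current values and its revision values separately.
        tables.append(f"field_data_{name}")
        tables.append(f"field_revision_{name}")
    return tables


# Base tables and fields as listed on the slide.
BASE_TABLES = [
    "node",
    "node_revision",
    "field_collection_item",
    "field_collection_item_revision",
    "file_managed",
]
FIELDS = [
    "field_collection_b",
    "field_taxonomy_ref",
    "field_collection_c",
    "field_text",
    "field_media",
]

tables = tables_written(BASE_TABLES, FIELDS)
print(f"At least {len(tables)} INSERT queries per save")
```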
18. Common Problems
Others
• Never use views_php module - create custom views handlers and plugins.
• Complex faceted search using Drupal database - use Solr.
• dblog module enabled on production - use syslog.
• Carefully consider use of modules with node access functionality - they disable
block caching.
21. Strategies for Success
On-Demand Cache Purging
• Planning
• Divide the site into page “types”.
• For each type, build a list of events which would require a page
to be cleared from cache.
• Considerations
• No relative dates, e.g. “time ago”.
• Some page types may be more suited to periodic caching.
• Create a spidering script to warm the caches!
• Extend to other caches using CacheTags - drupal.org/project/cachetags
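The planning steps and the warming script can be sketched roughly as follows. Everything named here (the events, page types, paths and function names) is hypothetical and only illustrates the shape of the plan. The sketch works out which paths an event affects and then re-requests them; the actual purge call depends on the cache in use and is left out.

```python
from urllib.request import urlopen

# Hypothetical plan: which events invalidate which page types.
PURGE_PLAN = {
    "article_updated": ["article", "listing", "home"],
    "comment_posted": ["article"],
}

# Hypothetical paths belonging to each page type.
PATHS_BY_TYPE = {
    "home": ["/"],
    "listing": ["/news"],
    "article": ["/news/example-article"],
}


def paths_to_purge(event):
    """Return every path whose page type is affected by the given event."""
    paths = []
    for page_type in PURGE_PLAN.get(event, []):
        paths.extend(PATHS_BY_TYPE.get(page_type, []))
    return paths


def http_get(url):
    """Fetch a URL and return its HTTP status code."""
    with urlopen(url, timeout=10) as response:
        return response.status


def warm_cache(base_url, paths, fetch=http_get):
    """Request each path once so the caches are repopulated before visitors arrive."""
    return {path: fetch(base_url + path) for path in paths}
```

A run after an article edit might call `warm_cache("https://www.example.com", paths_to_purge("article_updated"))` once the purge itself has completed. The `fetch` parameter exists so the logic can be exercised without network access.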
22. Strategies for Success
Authcache (2.x branch)
• Replaces Drupal’s default page caching allowing you to cache
authenticated pages.
• Huge scalability improvements for sites with a large proportion of
authenticated visitors.
• But also much, much more.
• Personalisation - authcache_p13n
• Form token magic - authcache_form
• Store page cache in Varnish - authcache_varnish
• Integrates with Cache Expiration
23. Strategies for Success
Authcache
• Planning
• Define which page types are cacheable.
• Design how you will segment your visitors (from a cache
perspective).
• Identify all personalised information which must be displayed.
• Considerations
• Forms can be tricky - ensure you test thoroughly.
• Ensure your analytics / marketing / tracking services are
compatible.
• See Commerce Kickstart for great out-of-the-box implementation.
24. Strategies for Success
Consuming Feeds & Web Services
• Regularly importing data into Drupal can be resource intensive.
• Feeds, Migrate, custom PHP etc. all share the same fundamental
problems:
• Fetching large datasets, which hog i/o, memory, and CPU
cycles.
• Lots of slow INSERT and UPDATE operations on the database.
• New data will not display immediately unless caches cleared.
• The solution? Move to the front end!
25. Strategies for Success
Consuming Feeds & Web Services
• PaRSS - drupal.org/project/parss
• Integrates simple jQuery RSS parser with link fields.
• AngularJS - angularjs.org
• Very powerful front-end MVC framework.
• Usual implementation may not be suitable for this problem.
• Angular Blocks - drupal.org/node/2445795
• Allows other modules to expose AngularJS apps as blocks!
• Used successfully on recent intranet project, some pages
having 6 angular apps on a single page.
26. Strategies for Success
Load Testing
• Make it part of your development process.
• Don't leave it to the last minute or post-launch.
• Tools
• Apache JMeter
• github.com/jacobSingh/Drupal-Performance-Testing-Suite
• Blazemeter - blazemeter.com
• Blitz - blitz.io
• Web Page Test - webpagetest.org
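For a quick check between full test runs, a few lines of script can already show how response times change under concurrency. This is a minimal sketch using only the Python standard library, not a substitute for the tools listed above; the function names and parameters are hypothetical.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def timed_request(url):
    """Fetch the URL once and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urlopen(url, timeout=30) as response:
        response.read()
    return time.perf_counter() - start


def run_load_test(url, total_requests, concurrency, request=timed_request):
    """Issue total_requests calls at the given concurrency and summarise the timings."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(request, [url] * total_requests))
    return {
        "requests": len(timings),
        "median": statistics.median(timings),
        "slowest": max(timings),
    }
```

Running `run_load_test("http://localhost/", total_requests=200, concurrency=10)` against a development site before and after a change gives a rough comparison. Only point it at systems you are allowed to load.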
27. Strategies for Success
Queues
• Use queues when dealing with:
• Batch processing large datasets.
• Performing complex calculations.
• Sequential processing of tasks.
• Modules / Tools
• Advanced Queue - drupal.org/project/advancedqueue
• Advanced Queue Runner - github.com/nvahalik/advancedqueue-runner
• Drupal Core Queues - system.queue.inc
28. Strategies for Success
Queues
• Improves reliability.
• If not using queues
• There is no guarantee the process will be completed.
• If the process fails, there is no easy way to repeat it.
• If using queues
• Each item is executed at least once.
• If the process fails, the queue remains intact.
• System load is stabilised because processing of complex or
heavy operations is delayed.
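The reliability point can be illustrated with a small in-memory sketch. It is not Drupal's queue implementation: the class and function names are hypothetical, and a real queue would be persisted in a database or Redis. The key idea is that an item is only deleted after the worker succeeds, so a failure leaves it in place for the next run.

```python
from collections import deque


class SimpleQueue:
    """In-memory stand-in for a persistent queue."""

    def __init__(self):
        self._items = deque()

    def create_item(self, data):
        """Add an item to the end of the queue."""
        self._items.append(data)

    def claim_item(self):
        """Return the next item without removing it, or None when the queue is empty."""
        return self._items[0] if self._items else None

    def delete_item(self):
        """Remove the item at the head of the queue once it has been processed."""
        self._items.popleft()

    def __len__(self):
        return len(self._items)


def run_queue(queue, worker):
    """Process items in order; stop at the first failure and leave the rest intact."""
    processed = 0
    while True:
        item = queue.claim_item()
        if item is None:
            break
        try:
            worker(item)
        except Exception:
            # The failed item stays queued, so a later run will retry it.
            break
        queue.delete_item()
        processed += 1
    return processed
```

Because a failed item is retried, a worker may see the same item more than once, so it should be safe to run twice on the same data.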
29. Strategies for Success
Optimised Front-end
• Image Sprites
• Minimises the number of HTTP requests.
• CSS
• Think about what your Sass / Less becomes once compiled.
• How complex and specific do the selectors become?
• Consider architecting your CSS for conditional inclusion.
• Does the site have “sections”?
• CSS rendering is a blocking process.
30. Strategies for Success
Optimised Front-end
• Asynchronous Javascript - drupal.org/project/async_js
• Defers javascript execution.
• Can improve responsiveness of “sluggish” JS-heavy sites.
• Advanced Aggregation - drupal.org/project/advagg
• Use CDN version of jQuery.
• On-demand generation of aggregated assets.
31. Strategies for Success
Other Recommendations
• Elysia Cron - drupal.org/project/elysia_cron
• Configure scheduling and frequency of specific cron tasks.
• Run heavy cron tasks during low traffic periods.
• Entity Cache - drupal.org/project/entitycache
• Stores complete entity objects in your caching backend.
• Enable appropriate dependent modules such as
commerce_entitycache, bean_entitycache etc.
• Apache Solr for search
• drupal.org/project/search_api_solr
• drupal.org/project/apachesolr
33. Infrastructure
Caching Backends
• Memcached - drupal.org/project/memcache
• Battle tested.
• Widely deployed.
• Volatile storage - not suitable for persistent data.
• Redis - drupal.org/project/redis
• Less “mature” than Memcached.
• 1:1 featureset with Memcached.
• Benchmarks slightly better than Memcached.
• Commits data to disk by default, can be used for persistent data
• Use PHP extension - github.com/phpredis/phpredis (not Predis class)
34. Infrastructure
Caching Backends
• I recommend Redis
• Store sessions in Redis rather than the database
• Session Proxy - drupal.org/project/session_proxy
• Form cache can go straight into Redis - no more need for this line:
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
35. Infrastructure
Simplest Approach
• Single server with all components
• PHP
• Web Server (Apache)
• Database (MySQL)
• Varnish (... sometimes)
[Diagram: a single server, Instance #1, running Varnish, Apache, PHP and MySQL]
36. Infrastructure
Scaling Vertically
• Increase instance size.
• Change instance types:
• CPU optimised
• Memory optimised
• I/O optimised
• Will hit a ceiling eventually.
“We’re going to need a bigger box”
42. Debugging Performance and Scalability Issues
Tools
• New Relic APM, browser & server monitoring
• MySQL slow query log
• Add the following lines to my.cnf and restart MySQL
• slow_query_log=1
• slow_query_log_file=/var/log/mysql/slow-query.log
• long_query_time=20
• The older log_slow_queries option was removed in MySQL 5.6.
• XHProf - PHP profiler
• Great slides for getting set up here - http://msonnabaum.github.io/xhprof-presentation/
• Browser Developer Tools
• Javascript profiler
• Network Monitor
43. Debugging Performance and Scalability Issues
General Tips
• Look beyond the symptoms to find the underlying cause.
• Change one thing at a time.
• Measure, change, measure.
• Sometimes you just have to throw more RAM at the problem.