Drupal Distribution Optimised for performance and scale
Support for database replication
Support for Squid or Varnish as reverse proxy caches
Optimised for MySQL
Optimised for PHP
Supported by Acquia
Varnish
Varnish stores web pages in memory so the web servers don't have to create the same web page over and over again. The web server only recreates a page when it has changed. Reporting from Varnish can provide some good data to help with capacity planning.
Good candidates to remove: mod_cgi, mod_dav, mod_ldap
How do I see if that made a difference?
ab -n 100 -c 5 http://www.domain.com/test.html
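To compare a before and after run, the figure to watch in ab's report is the "Requests per second" line. A minimal Python sketch of that comparison follows; the function names and the two report fragments are made up for illustration and are not real measurements.

```python
import re


def requests_per_second(ab_output: str) -> float:
    """Extract the mean requests-per-second figure from ab's report text."""
    match = re.search(r"Requests per second:\s+([\d.]+)", ab_output)
    if match is None:
        raise ValueError("no 'Requests per second' line found")
    return float(match.group(1))


def percent_change(before: float, after: float) -> float:
    """Relative change in throughput between two runs, in percent."""
    return (after - before) / before * 100.0


# Illustrative report fragments (assumed values, not real benchmark data).
before_run = "Requests per second:    40.00 [#/sec] (mean)"
after_run = "Requests per second:    50.00 [#/sec] (mean)"

change = percent_change(requests_per_second(before_run),
                        requests_per_second(after_run))
print(f"Throughput change: {change:+.1f}%")
```

Run ab with the same -n and -c values both times, otherwise the two figures are not comparable.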
Is the swap file being used?
vmstat 1 60
Runs vmstat every 1 second 60 times
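The columns to watch in vmstat's output are si and so (memory swapped in and out). Anything other than zero means the swap file is in use. As a rough sketch, assuming the output has been captured to a string, the check could be scripted in Python; the function name and the sample output are hypothetical.

```python
from typing import Tuple


def swap_activity(vmstat_output: str) -> Tuple[int, int]:
    """Return the summed (si, so) values across all samples in vmstat output."""
    si_total = so_total = 0
    si_col = so_col = None
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "si" in fields and "so" in fields:
            # Column header row: remember where the swap columns sit.
            si_col, so_col = fields.index("si"), fields.index("so")
        elif si_col is not None and fields and fields[0].isdigit():
            # Data row: add this sample's swap-in and swap-out figures.
            si_total += int(fields[si_col])
            so_total += int(fields[so_col])
    return si_total, so_total


# Illustrative capture (assumed values, not from a real server).
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 812340  10240 204800    0    0     5    10  100  200  1  1 98  0  0
 2  1  20480  10240   2048  51200   12   48    90   300  400  900 20 10 60 10  0
"""

swapped_in, swapped_out = swap_activity(SAMPLE)
print("Swap in use" if swapped_in or swapped_out else "No swap activity")
```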
Decreasing Apache Timeouts
The default Timeout is normally 5 minutes (300 seconds). How about reducing it to 20 seconds?
You can set the Timeout directive in your virtual host or server config.
MaxClients Setting in Apache
What's a good way to find the MaxClients number?
MaxClients ≈ (RAM - size of all other processes) / (size of an Apache process)
We are making an educated guess by dividing the system memory (our physical RAM, less what everything else needs) by the maximum size of an Apache process, with enough wiggle room for the operating system to run smoothly.
To find the size of the running Apache processes:
ps -ylC apache2 --sort:rss
Divide the RSS figure (reported in kilobytes) by 1024 to get the process size in megabytes.
You can also use pmap. Use top to find the PID, then run:
pmap -x <pid>
Another handy way of seeing how your memory is doing:
free -m
And good old vmstat to see if your memory is being paged:
vmstat 5 60
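The MaxClients arithmetic can be sketched in a few lines of Python. The function name and the example figures (4 GB of RAM, 1 GB for everything else, a 30720 KB Apache process) are assumptions chosen for illustration; plug in the numbers from your own ps and free output.

```python
def estimate_max_clients(total_ram_mb: float,
                         other_processes_mb: float,
                         apache_process_mb: float) -> int:
    """Estimate MaxClients as (RAM - other processes) / Apache process size."""
    if apache_process_mb <= 0:
        raise ValueError("Apache process size must be positive")
    available_mb = total_ram_mb - other_processes_mb
    # Round down: a partial process slot is no use, and never go below zero.
    return max(int(available_mb // apache_process_mb), 0)


# ps reports RSS in kilobytes, so divide by 1024 to get megabytes.
apache_rss_kb = 30720                     # assumed example value
apache_mb = apache_rss_kb / 1024          # 30 MB per process

estimate = estimate_max_clients(total_ram_mb=4096,
                                other_processes_mb=1024,
                                apache_process_mb=apache_mb)
print("MaxClients estimate:", estimate)
```

Treat the result as a starting point and watch free -m and vmstat under load before settling on a value.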
mod_expires
Allows Drupal to send HTTP Expires headers so that files are cached in users' browsers for about 2 weeks, or until a new version is made available.
Drupal is pre-configured to use mod_expires if it's available. Configure its use in your .htaccess.
ExpiresActive On
# Cache files for 2 weeks after access
ExpiresDefault A1209600
# Don't cache dynamic pages
ExpiresByType text/html A1
HTML is not given a long expiry because Drupal's pages are dynamic rather than static. This is why Drupal uses its own cache.
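The A1 value is mod_expires shorthand: "A" means "counted from the time of access", followed by a number of seconds, so A1 expires HTML after one second. A small Python sketch shows how a longer lifetime such as two weeks works out; the helper name is hypothetical.

```python
def expires_after_access(days: int) -> str:
    """Build a mod_expires value: 'A' (since access) plus the lifetime in seconds."""
    seconds = days * 24 * 60 * 60
    return f"A{seconds}"


# Two weeks, the lifetime used for static files.
two_weeks = expires_after_access(14)
print("ExpiresDefault", two_weeks)
```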
Reduce DNS Lookups
You can turn DNS lookups off globally and enable them only for requests that match particular file extensions.
HostnameLookups Off
<Files ~ "\.(html|cgi)$">
HostnameLookups On
</Files>
File Negotiation
Be specific when listing filenames rather than relying on wildcards (when possible).
Instead of:
DirectoryIndex index
Use:
DirectoryIndex index.cgi index.php
htaccess
If you have access to your VirtualHosts in Apache, move your directives out of .htaccess and into the VirtualHost for your website. The reason is that Apache loads your virtual host configuration once at startup, but searches for .htaccess files in multiple directories on every request.
Disable .htaccess lookups with:
<Directory />
AllowOverride None
</Directory>
Apache supports pluggable concurrency modules, called Multi-Processing Modules (MPMs). So which is a good fit for my website?
Worker
Uses multiple child processes with many threads each. Each thread handles one connection at a time.
Prefork
Uses multiple child processes, each handling one connection at a time. On many systems it's comparable to worker in speed, but it uses more memory. Generally recommended for Drupal due to its threading model.
Alternatives to Apache
Nginx (engine-X)
Faster than Apache and has more predictable memory usage. Not as straightforward to set up (rewrite rules, for example).
Lighttpd
Good performance, although there has been discussion on the Drupal forums as to its ability to cleanly run Drupal 7. Not for the faint of heart.
Microsoft WebMatrix
Runs Drupal under IIS with PHP; good for Microsoft shops.
http://www.microsoft.com/web/drupal/
Drupal does a lot of work in the database, especially for authenticated users and modules, so how can we get the best out of our database?
Database Optimisations
Enable Query Cache
This feature is generally disabled by default. To enable it, assign a value to query_cache_size in your MySQL configuration file.
[mysqld]
query_cache_size = 64M
You can check whether the query cache is available, as it's exposed as a variable.
SHOW VARIABLES LIKE 'have_query_cache';
You may have to do some testing to find the best value to use.
Logging Slow Queries
You can instruct MySQL to log all queries that take too long to run, for later analysis.
[mysqld]
log_slow_queries=/var/log/slow-queries.log
long_query_time=5
Leaving off long_query_time defaults it to 10 seconds.
Analysing Slow Queries
Prepend your query with EXPLAIN and run it for more information, or analyse your query with Maatkit. EXPLAIN shows which indices are being used; sometimes just adding an index to the table can be a good fix.
Table locking can be a good indicator of problems in your database.
More Database Tips
MyISAM and InnoDB
Performance-wise, they both stand up well, but what's the difference for Drupal?
Table Locking
MyISAM = table-level locking
InnoDB = row-level locking
How do I know if I need to make a change? Take a look at:
SHOW STATUS LIKE 'Table%';
Table_locks_immediate 1151552
Table_locks_waited 15324
How do I change a table type?
ALTER TABLE accesslog ENGINE=InnoDB;
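One way to read the two Table_locks counters is as a share: of all table lock requests, how many had to wait? The Python sketch below works that out for the figures shown. The function name is hypothetical, and no particular threshold is implied; the point is to watch whether the share grows over time.

```python
def lock_wait_percentage(immediate: int, waited: int) -> float:
    """Share of table lock requests that had to wait, as a percentage."""
    total = immediate + waited
    if total == 0:
        return 0.0
    return waited / total * 100.0


# Counters taken from the SHOW STATUS output above.
pct = lock_wait_percentage(immediate=1151552, waited=15324)
print(f"{pct:.2f}% of table lock requests had to wait")
```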
Good Candidates for InnoDB
Which tables would benefit from changing to row-level locking?
mysqlreport can be automated to show any wait times for transactions.
Keep your cron interval short so that tables are pruned regularly before they get too big.
You can use the Devel module to identify query-expensive pages in Drupal.
Caching in Drupal can be enabled through the Performance module in Admin and settings.php.
Drupal Ships with 6 Cache tables
When you write your own modules and need to cache data, think about using your own tables. This reduces write contention with Drupal over its cache tables and doesn't bloat them with your data.
Most functions also have a $reset parameter, which instructs the function to clear its internal cache.
Drupal Performance Module Options
Normal: Drupal bootstraps in phases. When Normal is selected, it runs just enough phases to load a page from cache, keeping database queries to a minimum.
Aggressive: Completely bypasses loading of all modules. Boot and exit hooks are never called for cached pages. This means less PHP is parsed, since no modules are loaded. There are also fewer database calls.
Fastpath: Not enabled from the admin panel; this option is enabled from settings.php. The idea is that a call to the file system is faster, as there's no ramping up for a database query. This may not scale across load-balanced hosting.
Drupal doesn't store session information for the first anonymous visitor. This is so webcrawlers and spiders don't fill your session tables up. However, these tables can get very large.
Garbage Collection
The default session lifetime before garbage collection is a little over 2 days. You can have sessions collected sooner by lowering:
session.gc_maxlifetime (seconds)
session.cache_expire (minutes)
Note: When you adjust gc_maxlifetime, adjust cache_expire to match.
Tip: Drupal can serve cached pages to anonymous users, and anonymous users don't normally use the interactive features of Drupal. How about reducing the time they stay logged in, or logging them out when they close their browser? For example, in settings.php:
# 86400 seconds = 24 hours
ini_set('session.cookie_lifetime', 86400);
# Logout on browser close
ini_set('session.cookie_lifetime', 0);
Pruning Sessions
Drupal controls when sessions start by turning off PHP's session autostart functionality in .htaccess.
php_value session.auto_start 0
The session table is cleared out when PHP's garbage collection runs. The lifetime of a session record is determined by session.gc_maxlifetime (seconds).
Other settings you can experiment with:
session.cache_expire
session.cache_limiter
session.cookie_lifetime
session.save_handler
session.use_only_cookies
session.use_trans_sid
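Because session.gc_maxlifetime is in seconds and session.cache_expire is in minutes, keeping the two in step means a unit conversion. A tiny Python sketch of that bookkeeping follows; the helper name and the 24 hour example are assumptions for illustration.

```python
def matching_session_settings(lifetime_seconds: int) -> dict:
    """Pair session.gc_maxlifetime (seconds) with session.cache_expire (minutes)."""
    return {
        "session.gc_maxlifetime": lifetime_seconds,
        # cache_expire is expressed in minutes, so convert from seconds.
        "session.cache_expire": lifetime_seconds // 60,
    }


# Example: a 24 hour session lifetime.
session_settings = matching_session_settings(86400)
for name, value in session_settings.items():
    print(f"{name} = {value}")
```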
Alternatives to Search
Acquia Search
Built upon Lucene and Solr from Apache; a hosted service with easy integration with Drupal. All administration tasks are built in to the admin panel. A very powerful alternative to Drupal's built-in search.
http://acquia.com/products-services/acquia-search
Apache Solr
Java-based open source enterprise search platform from the Apache Lucene project.
http://lucene.apache.org/solr/
Google's custom search service has some impressive features, including branding support and on-the-fly indexing support. It has a well-documented API, but requires some work to get set up initially.
Throttling and Block Caching
Enabling throttling allows you to turn off modules & blocks when the system starts to get sluggish. You can set the threshold in the admin panel, and determine which modules and blocks to turn off from their respective admin pages.
Cacherouter / Boost / Devel / Authcache
Content Delivery Networks
The capacity sum of strategically placed servers can result in an impressive boost in the number of concurrent users.
Further Reading
Web-based companies live or die by the ability to scale their infrastructure to accommodate increasing demand. This book is a hands-on and practical guide to planning for such growth.
O'Reilly Media : Amazon
High Performance MySQL
High Performance MySQL is the definitive guide to building fast, reliable systems with MySQL. This book covers every aspect of MySQL performance in detail and focuses on robustness, security and data integrity.
O'Reilly Media : Amazon
Even Faster Websites
Steve Souders works on the performance team at Google and has written a couple of great books on performance.
O'Reilly Media : Bio