Scale Apache with Nginx

The need to scale is in high demand in an age where everything is moving to the cloud. The standard Apache configuration can handle a website with moderate traffic, but the minute the site gets Slashdotted or tweeted about widely, it could come to an embarrassing crash landing. If you are the administrator of such a website, good luck finding another job! If, on the other hand, you value high availability in the midst of popularity, read on. In this one-day workshop, we will show you how to scale your websites and web apps to handle thousands of simultaneous sessions the right way. The topics covered will include:
- Setting up Apache and Nginx
- Setting up a sample LAMP web app
- Benchmarking Apache performance
- Fine tuning Apache to improve performance
- Fine tuning Nginx to improve performance
- Discussion about code level improvements when developing custom webapps using PHP

Published in: Technology

  1. Scaling Apache for LAMP
     Buddhika Siddhisena (Co-Founder & CTO of THINKCube Systems)
     bud@thinkcube.com | twitter @geekaholic
  2. What is Apache
     The folk story is that Apache was named after "A-patchy-server", the result of the NCSA httpd server being patched a lot. The project was started by Brian Behlendorf.
     Today, Apache is still the most popular web server out there, running more than half of the websites on the net. It is actively developed by the Apache Software Foundation along with many other software projects.
     Besides its primary function as a web server, Apache can also be configured as a reverse proxy for load balancing.
  3. Installing Apache
     The easiest method of installing Apache along with PHP and MySQL (aka LAMP) is to use the tasksel command.
         tasksel
     Alternatively, install each package manually:
         apt-get install apache2 libapache2-mod-php5 mysql-server
     Installing a sample LAMP app - Drupal
     In order to test Apache performance as we tune it, it is good to set up a real-world, full-fledged CMS such as Drupal.
     - Download the latest version of Drupal from drupal.org
     - Follow the Drupal setup guide
     - Install the Devel module into the Drupal modules directory
     - Log in to Drupal as admin and, using the Devel module, populate Drupal with sample data for testing (Configuration -> Development -> Generate Content)
  4. Setting up Benchmarking tools
     Set up Autobench
     Autobench is a handy script to stress test a web server by sending an increasing number of requests. It works by calling the httperf tool iteratively with increasing parameters.
     Download autobench and follow the directions to compile it.
     In order to plot graphs, you need to install gnuplot via apt. As of this writing, the script used to plot the graph has a bug when calling the current version of gnuplot and requires the following minor modification.
         $ sudo vi $(which bench2graph)
     Line ~78 should be:
         echo set style data linespoints >> gnuplot.cmd
  5. Baseline benchmark with Autobench
     Let's benchmark our standard Apache setup to get an idea of default performance.
         autobench --single_host --host1 localhost --uri1 /drupal --quiet --low_rate 20 --high_rate 200 --rate_step 20 --num_call 10 --num_conn 5000 --timeout 5 --file results.tsv
     The above tests a single host, localhost/drupal, starting at 20 connections per second with 10 requests per connection, and increasing in steps of 20 up to 200 connections per second. The total number of connections is capped at 5000, and any request that takes more than 5 seconds to respond is considered unsuccessful.
     Plotting the results
     Using the results.tsv file and the included bench2graph utility, you can plot a graph into a PostScript file.
         bench2graph results.tsv results.ps
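To make the flag arithmetic concrete, here is a minimal sketch that prints the schedule implied by the autobench options above. The function name bench_plan is made up for illustration, and the run durations are rough estimates (connections divided by rate), not figures reported by autobench.

```shell
#!/bin/sh
# Print the test schedule implied by the autobench flags.
# Arguments: low_rate high_rate rate_step num_conn num_call
# (bench_plan is a hypothetical helper, not part of autobench.)
bench_plan() {
    rate=$1
    while [ "$rate" -le "$2" ]; do
        # Each run opens num_conn connections at `rate` per second,
        # so it lasts roughly num_conn / rate seconds.
        printf '%d conn/s\t%d req/s\t%d requests\t~%d s\n' \
            "$rate" "$((rate * $5))" "$(($4 * $5))" "$(($4 / rate))"
        rate=$((rate + $3))
    done
}

bench_plan 20 200 20 5000 10
```

With the values used above this gives ten runs, each issuing 50,000 requests, with the demanded load climbing from 200 to 2,000 requests per second.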
  6. Tuning Apache - Enable GZip
     Compressing pages with gzip decreases network overhead and makes pages load faster, thereby reducing the amount of time a client stays connected. All modern browsers support compressed responses.
     In order to benchmark its effect, you can install a tool such as Firebug on the client side.
  7. Tuning Apache - Enable GZip
     Enable the mod_deflate module. On Ubuntu:
         a2enmod deflate && a2enmod headers
     Then we'll configure deflate to compress everything except images.
         sudo vi /etc/apache2/mods-enabled/deflate.conf
         <Location />
             # Insert filter
             SetOutputFilter DEFLATE

             # Don't compress images
             SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary

             # Make sure proxies don't deliver the wrong content
             Header append Vary User-Agent env=!dont-vary
         </Location>
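Besides a browser tool, compression can be checked from the command line. The sketch below assumes curl is installed; is_gzipped is a hypothetical helper name, and the URL in the usage comment is an assumption based on the Drupal install used earlier.

```shell
#!/bin/sh
# Succeeds if the HTTP response headers on stdin announce gzip encoding.
# (is_gzipped is a hypothetical helper name.)
is_gzipped() {
    grep -qi '^content-encoding:.*gzip'
}

# Typical use against a live server (the URL is an assumption):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://localhost/drupal/ \
#       | is_gzipped && echo "compressed" || echo "not compressed"
```

Requesting an image the same way should report "not compressed" if the no-gzip rule is taking effect.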
  8. Apache configuration tuning
     There are a few key parameters that can be tuned:
     - KeepAlive - By default it's set to On, which is good. Clients can then send multiple requests over a single HTTP/1.1 connection.
     - KeepAliveTimeout - Better to keep it low. Defaults to 15 sec. Make sure that's enough: the rule is 1.5 to 2 times your page load time.
     - TimeOut - The default is 5 minutes, which might be too long to tie up one process. Adjust accordingly.
     - StartServers, MinSpareServers, MaxSpareServers - Generally, even on a busy site, you may not need to tweak these. Apache can self-regulate.
     - MaxClients - The maximum number of clients (threads) Apache will handle simultaneously.
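The KeepAliveTimeout rule of thumb can be turned into numbers with a small calculation. This is only a sketch: keepalive_range is a hypothetical helper, the page load time has to be measured separately, and rounding up to whole seconds is an assumption.

```shell
#!/bin/sh
# Print the low and high KeepAliveTimeout suggested by the
# "1.5 to 2 times the page load time" rule, rounded up to whole seconds.
# $1: measured page load time in seconds (may be fractional).
keepalive_range() {
    awk -v t="$1" '
        function ceil(x) { return (x == int(x)) ? x : int(x) + 1 }
        BEGIN { printf "%d %d\n", ceil(t * 1.5), ceil(t * 2) }'
}

keepalive_range 2    # a page that loads in about 2 seconds
```

For a page that takes about 2 seconds to load, the rule suggests a timeout between 3 and 4 seconds.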
  9. Calculating MaxClients
     List the memory used by each Apache process (the RSS column, in KB):
         ps -eafly | grep apache2 | awk '{print $8}' | sort -n
     Use free to figure out how much memory is available. Cache is also considered free memory, but you might want to leave some and not assume all cache will be used.
         free
     By dividing free memory by the average memory used by an Apache thread, you can estimate the number of MaxClients.
     e.g.: Assuming Apache memory usage and free memory are as follows
         $ ps -eafly | grep apache2 | awk '{print $8}' | sort -n
         816
         3896
         3896
         3896
         3896
         20844
         $ free
                      total       used       free     shared    buffers     cached
         Mem:        508904     447344      61560          0     141136     213468
         -/+ buffers/cache:      92740     416164
         Swap:       407544       4364     403180
     Memory available ~= 60000 (free) + 100000 (cached) ~= 160 MB, and memory per thread ~= 4 MB. Then a safe value for MaxClients = 160 / 4 = 40.
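The same estimate can be scripted. The sketch below is one way to do it, with hypothetical helper names (average, estimate_max_clients); how much of the cache to count as usable remains a judgment call, as noted above.

```shell
#!/bin/sh
# Average of the numbers on stdin (one per line), e.g. RSS values from ps.
average() {
    awk '{ sum += $1; n++ } END { if (n) printf "%d\n", sum / n }'
}

# Estimate MaxClients. All arguments are in KB, as printed by ps and free:
#   $1 free memory, $2 cache you are willing to count as usable,
#   $3 memory per Apache process.
estimate_max_clients() {
    echo $(( ($1 + $2) / $3 ))
}

# Figures from the example above: ~60000 KB free, ~100000 KB of cache,
# ~4000 KB per Apache process.
estimate_max_clients 60000 100000 4000
```

The per-process figure can come from piping the RSS values through average, for example after filtering out the grep process and any unusually large parent process.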
  10. Improving PHP performance
      We can improve PHP performance by
      1. Caching pages (useful if dynamic content doesn't change often)
      2. PHP opcode optimizations (pre-compile PHP)
      Fortunately we can get the benefit of both using PHP APC, which is a PHP accelerator!
          apt-get install php-apc
      You can verify the installation by loading a PHP page containing phpinfo(); and searching for apc. Or, if you have php5-cli installed:
          php -r "phpinfo();" | grep apc
      Using memcached
      Memcached is a distributed cache for storing key-value pairs in memory, giving faster access with fewer trips to the database. Some popular PHP apps can use memcache if available. memcached does not instantly accelerate PHP: the application has to be written to use it!
          apt-get install memcached php5-memcache
          service memcached start
  11. More Tips for improving performance
      - Keep the DirectoryIndex file list as short as possible.
      - Whenever possible, disable .htaccess via AllowOverride None.
      - Use Options FollowSymLinks to simplify the file access process in Apache.
      - Minimize the use of mod_rewrite, or at least of complex regexes.
      - If logs are unnecessary, disable them or log to another server via syslog.
      - For Deny/Allow rules, use IPs rather than domains (prevents superfluous DNS lookups).
      - Do not enable HostnameLookups (DNS is slow).
      - For dynamic sites, see if you can separate dynamic and static content onto two servers.
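Two of these tips are easy to check mechanically. The following is a rough sketch, not a complete audit: audit_apache_conf is a hypothetical helper that scans a single config file, does not follow Include directives, and only looks at AllowOverride and HostnameLookups.

```shell
#!/bin/sh
# Flag lines in an Apache config file that enable .htaccess lookups
# or DNS lookups. $1: path to the config file.
# (audit_apache_conf is a hypothetical helper name.)
audit_apache_conf() {
    awk '
        { line = tolower($0) }
        # Any AllowOverride value other than "None" makes Apache look for .htaccess files.
        line ~ /^[ \t]*allowoverride[ \t]+/ && line !~ /^[ \t]*allowoverride[ \t]+none[ \t]*$/ {
            print FILENAME ":" FNR ": .htaccess lookups enabled: " $0
        }
        # HostnameLookups On or Double triggers a DNS query per request.
        line ~ /^[ \t]*hostnamelookups[ \t]+(on|double)/ {
            print FILENAME ":" FNR ": DNS lookups enabled: " $0
        }
    ' "$1"
}

# Example (the path is an assumption for a Debian/Ubuntu layout):
#   audit_apache_conf /etc/apache2/apache2.conf
```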
  12. Scaling Architectures
  13. Architecture overview
      In terms of scaling the web server, there are a few options.
      1. Single machine (scale vertically)
         The easiest to set up. Scaling is a matter of buying a better server or upgrading the existing one!
      2. App-DB machines (2-tier)
         Separate the DB from the app; as a result, each can be scaled separately.
  14. Architecture overview contd...
      3. Load balancer + App-DB machines (3-tier)
         A load balancer (aka reverse proxy) routes requests between multiple backend HTTP servers while caching results.
  15. Data Independence
      It is good to isolate the data from the app by hosting it on a separate server. This way the two aspects can be scaled independently. Some methods to consider:
      - Store DB data on MySQL running on a separate server
      - Share data files between servers using NFS or rsync
      - Cluster MySQL across multiple servers using MySQL Cluster
      - Use a clustered file system via DRBD or GFS2, or, as Facebook does, distribute files using BitTorrent
  16. Setting up an HTTP accelerator using Apache
  17. Apache as a reverse proxy
      In this setup, the reverse proxy server is what the user will contact, while the real web server can be hidden behind a private network.
      1. On the reverse proxy server, enable the required modules for a caching reverse proxy.
          a2enmod proxy
          a2enmod proxy_connect
          a2enmod proxy_http
          a2enmod cache
      2. Configure the proxy module
          vi /etc/apache2/mods-enabled/proxy.conf
          <Proxy *>
              AddDefaultCharset off
              Order deny,allow
              Deny from all
              Allow from all
          </Proxy>
          ProxyVia On
  18. Apache as a reverse proxy contd...
      3. Set up the (public) virtual host
      Next we configure an otherwise empty virtual host for the public site. Instead of serving a document root, it reverse proxies to the private server.
          vi /etc/apache2/sites-available/public-domain.com
          <VirtualHost *:80>

              ServerName your-public-domain.com

              <Proxy *>
                  Order deny,allow
                  Allow from all
              </Proxy>

              ProxyPass / http://your-private-domain.com/
              ProxyPassReverse / http://your-private-domain.com/

          </VirtualHost>
      Then enable the site and reload Apache:
          a2ensite public-domain.com
          service apache2 reload
  19. Using Nginx
  20. What is Nginx?
      - Nginx was designed as a reverse proxy first, and an HTTP server second
      - Unlike Apache, Nginx uses a non-blocking process model
      Two modes of operation for Nginx:
      1. Use Nginx for the static content and Apache for PHP
      2. Use FastCGI to run PHP
  21. Nginx process model in a nutshell
      - Receive a request, trigger events in a process
      - The process handles all the events and returns the output
      - The process handles events in parallel
      - The limitation is that PHP can no longer be embedded (mod_php) inside the process, as PHP is not asynchronous
      - Unlike Apache, Nginx does not have an .htaccess equivalent. You need to reload the server after making any change, making it difficult to use for shared hosting
  22. Using Nginx and Apache side-by-side
      In this setup we put Nginx as the frontend HTTP accelerator and Apache as the backend app server. If you want to run both on the same physical server, you'll need to either change the Apache port from 80 to another value, or bind Apache and Nginx to their own IP addresses on the same server.
          Listen 8080
      or, using the IP address:
          Listen 127.0.0.1:8080
      Now we're ready to install Nginx:
          sudo apt-get install nginx
  23. Apache style virtual host in Nginx
      Nginx uses a different format for defining virtual hosts than Apache.
          <VirtualHost>
              DocumentRoot "/usr/local/www/mydomain.com"
              ServerName mydomain.com
              ServerAlias www.mydomain.com
              CustomLog /var/log/httpd/mydomain_access.log common
              ErrorLog /var/log/httpd/mydomain_error.log
              ...
          </VirtualHost>
      becomes...
          server {
              root /usr/local/www/mydomain.com;
              server_name mydomain.com www.mydomain.com;

              # by default logs are stored in nginx's log folder
              # it can be changed to a full path such as /var/log/...
              access_log logs/mydomain_access.log;
              error_log logs/mydomain_error.log;
              ...
          }
  24. Redirecting all PHP requests to Apache
      The following example will serve all static content via Nginx while passing dynamic content (PHP) to Apache.
          server {
              listen 80 default;
              server_name localhost;

              access_log /var/log/nginx/localhost.access.log;

              location / {
                  root /var/www;
                  index index.html index.htm;
              }

              ## Pass all .php files in the /var/www directory to Apache
              location ~ \.php$ {
                  # these two lines tell Apache the actual IP of the client being forwarded
                  proxy_set_header X-Real-IP $remote_addr;
                  proxy_set_header X-Forwarded-For $remote_addr;

                  # this next line adds the Host header so that Apache knows which vhost to serve
                  proxy_set_header Host $host;

                  # and now we pass the request on to Apache
                  proxy_pass http://127.0.0.1:8080;
              }
          }
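To see which requests the PHP location would catch, the regular expression can be tried outside Nginx. This is only an illustration: backend_for is a hypothetical helper that applies the pattern \.php$ to a request path with grep, which approximates but does not reproduce Nginx's own location matching.

```shell
#!/bin/sh
# Report which server would handle a request path under the config above:
# paths ending in ".php" go to Apache, everything else stays with Nginx.
# (backend_for is a hypothetical helper, not an Nginx tool.)
backend_for() {
    if printf '%s\n' "$1" | grep -Eq '\.php$'; then
        echo apache
    else
        echo nginx
    fi
}

backend_for /index.php          # dynamic page
backend_for /images/logo.png    # static file
backend_for /notphp             # no literal ".php" suffix
```

The third path shows why the dot is escaped: without the backslash, the pattern would match any character followed by "php", so /notphp would be sent to Apache as well.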
  25. References
      - How To Improve Website Performance (With Drupal, LAMP)
      - PHP Performance tuning
      - HipHop compiler by Facebook
      - How the code you write affects PHP benchmarks
      - Scaling Facebook with OpenSource tools
      - Scaling drupal
      - DRBD for HA distributed file system
  26. Thank you
      bud@thinkcube.com | twitter @geekaholic
