phptek13 - Caching and tuning fun tutorial

phptek13 Caching and tuning fun for high scalability tutorial

  • Caching serves 3 purposes : - Firstly, to reduce the number of requests or the load at the source of information, which can be a database server, content repository, or anything else.
  • Secondly, you want to improve the response time of each request. If you request a page that takes 50ms to load without caching and you get 10 hits/second, you won't be able to serve those requests with 5 Apache processes. If you could cache some of the data used on the page you might be able to return the page in 20ms. That doesn't just improve user experience, but reduces the load on the webserver, since the number of concurrent connections is a lot lower. → connections closed faster → handle more connections and, as a result, more hits on the same machine. → If you don't : more Apache processes needed → will eat memory, will eat system resources and as a result will also cause context switching.
  • More tuning → Reduce the amount of data that needs to be sent over the network or the Internet - benefits you, as application provider, because you have less traffic on your upstream connection. - also better for the end user because there will be less data to download. → Of course, that's part of the frontend side, which we'll discuss near the end of the tutorial.
  • The first way to cache in a web application, or actually more commonly a website, is to cache entire pages. This used to be a very popular caching technique, back in the days when pages were very static. It's still popular for things like company websites, blogs, and any site that has pages that don't change a lot. Basically, you just render the page once, put it in a cache, and from that moment onwards, you just retrieve it from the cache.
  • Store part of a page. Probably the most common + best way to cache data. - Basically what you do is take a piece of data : - data from the database - result of a calculation - an aggregation of two feeds - parsed data from a CSV-file from an NFS share located on the other side of the world - could be data that was stored on a USB stick your kid is now chewing on. What I mean is : it doesn't matter where the data came from. It's part of a page, usually a block on a page, and you want to save time by not having to get that data from its original source every time. This is instead of saving the entire page, which can have multiple dynamic parts, some of which can't be cached because they are really dynamic, like the current time. So you store a small block, so that when we render the page, all we do is get the small block from the cache, place it in the dynamic page and output it.
  • Store the output of SQL queries. → Now, who of you know what SQL query caching is, MySQL query cache for example ? → Basically, the MySQL query cache is a cache which stores the output of recently run SQL queries. It's built into MySQL, it's... not enabled by default everywhere, it depends on your distribution. → And it speeds up queries by a huge margin. Disabling it is something I never do, because you gain a lot by having it enabled. → However, there's a few limitations : - First of all, the query cache is limited in size.
  • But, basically one of the big drawbacks of MySQL query cache is that every time you do an insert, update or delete on a table, the entire query cache for queries referencing that table is erased. → Another drawback is that you still need to connect to the MySQL server and you still need to go through a lot of the core of MySQL to get your results. → So, storing the output of SQL queries in a separate cache, being Memcache or one of the other tools we're going to see in a moment, is actually not a bad idea. Also because of the fact that, if you have a big website, you will still get quite a bit of load on your MySQL database. So anything that takes the load off the database and takes it to where you have more resources available is a good idea. → Better : store returned object or group of objects
  • Another caching technique I want to mention is storing the result of complex PHP processes. - You might think about some kind of calculation, but when I mention calculation, people tend to think about getting data from here and there and then summing them. - That's not what I mean. By complex PHP processes I mean things like parsing configuration files, parsing XML files, loading CSV-data in an array, converting multiple XML-files into an object structure, and so on. - The end result of those complex PHP processes can be cached, especially if the data from which we started doesn't change a lot. That way you can save a lot of system resources, which can be used for other things.
  • There's plenty of other types of data to store in cache. The only limit there is your imagination. All you need to think of is : - I have this data - how long did it take me to create it - how often does it change - how often will it be retrieved ? That last bit can be a difficult thing to balance out, but we'll get back to that later.
  • OK, let's talk about where cached data can be stored. I already mentioned MySQL query cache. Turn it on, but don't rely on it too heavily, especially if you have data that changes often.
  • I said I was going to discuss some do's and don'ts... → This one falls under the category don't → There's a second database mechanism for "caching", at least some people use it for that purpose. It's called database memory tables. → MySQL has such a storage type : it's called a memory or a heap table. And basically it allows you to store data in tables that are stored in memory. → Don't confuse it with a temporary table, which is only valid for your connection. → This is actually a persistent table, well persistent meaning that it will survive after you disconnect, but it won't survive a server reboot, because it's in-memory only. → Advantages of this storage type are that it's faster than disk-based tables and you can join it with disk-based tables. However, there's a default limit of 16MByte per table and it can be troublesome getting it to work on a master-slave setup. → So my advice is : don't use it.
  • Alright, next. Opcode caching... this is definitely a DO. → There's a few opcode caches out there. → Now what is opcode caching ? Basically, when you run a PHP file, the PHP is converted by the PHP compiler to what is called bytecode. This code is then executed by the PHP engine and that produces the output. → Now, if your PHP code doesn't change a lot, which normally it shouldn't while your application is live, there's no reason for the PHP compiler to convert your source code to bytecode over and over again, because basically it's just doing the same thing, every time. → So an opcode cache caches the bytecode, the compiled version of your source code. That way, it doesn't have to compile the code, unless your source code changes. This can have a huge performance impact.
  • APC is the most popular one and will probably be included in one of the next few releases. Might be 5.4, but there's still a lot of discussion about that. I'm guessing we probably won't see it before 5.5 or who knows 6.0, if that ever comes out. To enable APC, all you have to do is install the module, which can be done using PECL or through your distribution's package management system. Then make sure apc is enabled in php.ini and you're good to go. → The other opcode caches are eAccelerator, which is sort of slightly outdated now, although it does in some cases produce a better performance. But since APC will be included in the PHP core, I'm not sure if it's going to survive for very long anymore. → Then there's Zend Accelerator, which is built into Zend Server. Basically, it's similar to APC in terms of opcode caching functionality, but it's just bundled with the Zend products.
  • Slightly better than using local disk is using a local memory disk or a ramdisk. → Advantage : slightly faster, on the other hand if you're using Linux the standard file caching system will cache recently accessed files anyway, so there might not be a big performance impact when comparing to standard disk caching.
  • See slide >> replication!<<
  • See slide
  • See slide
  • - Key names must be unique - Prefix/namespace your keys ! → might seem overkill at first, but it's usually necessary after a while, at least for large systems. → Oh, and don't share the same Memcache with multiple projects. Start separate instances for each ! - Be careful with characters. Use only letters, numbers and underscore ! - Sometimes MD5() is your friend → but : harder to debug - Use clear names. Remember you can't make a list of data in the cache, so you'll need to document them. I know you don't like to write documentation, but you'll simply have to in this case.
  • OK, that sort of covers the basics of how we can use Memcache to cache data for your site. So purely in terms of caching in the code, we've done a lot. → There's still things that you can always add. If you're using Zend Framework or any other major framework, you can cache things like the initialization of the configuration file, creation of the route object (which is a very heavy process if you have a lot of routes). → Things like translation and locale can be cached in Zend Framework using 1 command, so do that ! → But as I said before, the only limit is your imagination... → and your common sense ! → Don't overdo it... make sure that the cache has enough space left for the things you really need to cache.
  • So, why don't we switch everything from Apache to Nginx ? → Well, it's not THAT easy. There's a lot of Apache modules that Nginx doesn't have, like WebDAV support and many of the authentication modules. → The basic modules are there and they're built into Nginx, which again makes them faster than Apache, because they don't go through some module API which causes overhead. → But there are some specific solutions that you cannot build using Nginx, although there are some third-party modules out there now, but keep in mind you have to add these by recompiling Nginx. → Now, since we're talking mostly about scaling public websites, chances are we're not going to need any of those modules, so we'll have no trouble at all putting the N in LANMMP.
  • → see slide → And that's all there is to it : it's running. Well, not exactly, we still need to configure it of course.
  • Now, as I mentioned Nginx is very fast and as a first step to using it to scale our website, we're going to place it in front of Apache. So, we're going to run it on the same server, but we're going to move Apache to a different port, preferably one that's not accessible from the outside, and we're going to have Nginx forward requests to Apache. → Of course we're not going to send all requests to Apache, 'cause that would be quite stupid, adding overhead. → We're only going to send all dynamic content requests to Apache and serve all static files directly from Nginx.
  • So, we're serving all those extensions directly from disk and forwarding all the rest to the Apache running on port 8080. We're also forwarding the Set-Cookie headers and adding a few headers so Apache can log the original IP if it wants to. → Something to keep in mind here : you will have 2 logfiles now : 1 from Nginx and 1 from Apache. → What you should notice once you start using this type of setup is that your performance from an end-user perspective will remain somewhat the same if your server was not overloaded yet. If it was having issues because of memory problems or too many Apache workers, ... → However, you will suddenly need far fewer Apache workers, which will save you quite a lot of memory. That memory can be used for... Memcache maybe ?
  • OK, what we just did is very nice. → But if you're really not relying on any of the special Apache modules, why would you keep Apache anyway ? → Why not just replace it altogether ? Well, it depends on what your bottleneck is. → If you're looking for a way to lower your memory usage and you don't mind losing some processing power, this is certainly the way to go. → So let's go for a LNMMP stack. We're going to kick out Apache.
  • If one of the backend webservers goes down, you want all traffic to go to the other one of course. That's where health checks come in.
  • >> platforms ! <<
  • >> thing on list <<
  • Indicates how long the file should not be retrieved
  • Split requests across subdomains : - HTTP/1.1 spec advises 2 connections per hostname - To get around that, use multiple subdomains. - Especially put your statics separately → helps when you grow and put them on a CDN - Be careful : don't use too many subdomains → DNS penalty
  • >> in Varnish <<

Transcript

  • 1. Caching and tuning fun for high scalability - Wim Godden, Cu.be Solutions
  • 2. Who am I ? Wim Godden (@wimgtr); Founder of Cu.be Solutions (http://cu.be); Open Source developer since 1997; Developer of OpenX, PHPCompatibility, Nginx extensions, ...; Zend / ZF Certified Engineer; MySQL Certified Developer; Speaker at PHP and Open Source conferences
  • 3. Who are you ? Developers ? System/network engineers ? Managers ? Caching experience ?
  • 4. Goals of this tutorial: Everything about caching and tuning; A few techniques; How-to; How-NOT-to; → Increase reliability, performance and scalability; 5 visitors/day → 5 million visitors/day; (Don't expect miracle cure !)
  • 5. LAMP
  • 6. Architecture
  • 7. Test page: 3 DB-queries; page just outputs result
        select firstname, lastname, email from user where user_id = 5;
        select title, createddate, body from article order by createddate desc limit 5;
        select title, createddate, body from article order by score desc limit 5;
  • 8. Our base benchmark: Apachebench = useful enough. Result ?
                          Single webserver      Proxy
                          Static    PHP         Static    PHP
        Apache + PHP      3900      17.5        6700      17.5
    Limit : CPU, network or disk; Limit : database
  • 9. Caching
  • 10. What is caching ? (diagram: CACHE)
  • 11. What is caching ? x = 5, y = 2, n = 50 → same result → CACHE; select * from article join user on article.user_id = user.id order by created desc limit 10 → doesn't change all the time → CACHE
  • 12. Theory of caching: GET /page → Page : $data = get(key) from Cache → false; if ($data == false) : select data from table (DB) → $data = returned result → set(key, $data)
  • 13. Theory of caching (diagram: DB, Cache → HIT)
  • 14. Caching goals - 1st goal: Reduce # of concurrent requests; Reduce the load
  • 15. Caching goals - 2nd goal
  • 16. Some figures: Pageviews : 5000 (4000 on 10 pages); Avg. loading time : 200ms; Cache 10 pages → avg. loading time : 20ms; → Total avg. loading time : 56ms; Worth it ?
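The 56ms on slide 16 is a weighted average of the cached and uncached page views. A minimal sketch of that arithmetic, using only the figures from the slide (the variable names are mine):

```php
<?php
// Figures from slide 16: 5000 page views, 4000 of them on the 10 cached pages.
$totalViews  = 5000;
$cachedViews = 4000;
$uncachedMs  = 200; // average loading time without cache
$cachedMs    = 20;  // average loading time when served from cache

// Weighted average over cached and uncached views.
$avgMs = ($cachedViews * $cachedMs + ($totalViews - $cachedViews) * $uncachedMs) / $totalViews;

echo $avgMs . " ms\n"; // 56 ms
```

So caching just 10 pages cuts the average loading time from 200ms to 56ms in this scenario.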
  • 17. Caching goals - 3rd goal: Send less data across the network / Internet; You benefit → lower bill from upstream provider; Users benefit → faster page load; Wait a second... that's mostly frontend stuff !
  • 18. Caching techniques #1 : Store entire pages: Company websites; Blogs; Full pages that don't change; Render → Store in cache → Retrieve from cache
  • 19. Caching techniques #2 : Store parts of a page: Most common technique; Usually a small block in a page; Best effect : reused on lots of pages; Can be inserted on dynamic pages
  • 20. Caching techniques #3 : Store SQL queries: ↔ SQL query cache : limited in size
  • 21. Caching techniques #3 : Store SQL queries: ↔ SQL query cache : limited in size, resets on every insert/update/delete, server and connection overhead; Goal : not to get rid of DB, free up DB resources for more hits !; Better : store returned object, store group of objects
  • 22. Caching techniques #4 : Store complex PHP results: Not just calculations; CPU intensive tasks : config file parsing, XML file parsing, loading CSV in an array; Save resources → more resources available
  • 23. Caching techniques #xx : Your call: Only limited by your imagination !; When you have data, think : Creating time ? Modification frequency ? Retrieval frequency ?
  • 24. How to find cacheable data: New projects : start from "cache everything"; Existing projects : look at MySQL slow query log; make a complete query log (don't forget to turn it off !) → use Percona Toolkit (pt-query-digest); check page loading times
  • 25. Caching storage - MySQL query cache: Use it; Don't rely on it; Good if you have : lots of reads, few different queries; Bad if you have : lots of insert/update/delete, lots of different queries
  • 26. Caching storage - Database memory tables: Tables stored in memory; In MySQL : memory/heap table; ↔ temporary table : memory tables are persistent, temporary tables are session-specific; Faster than disk-based tables; Can be joined with disk-based tables; But : default 16MByte limit, master-slave = trouble, if you don't need join → overhead of DB software; So : don't use it unless you need to join
  • 27. Caching storage - Opcode caching: DO !
  • 28. Caching storage - Opcode caching: APC; De-facto standard; Will be in PHP core in 5.4 ? 5.5 ? 6.0 ?; PECL or packages
  • 29. Caching storage - Opcode caching: Zend Optimizer+ (will be in PHP core in 5.5); APC; eAccelerator
  • 30. Caching storage - Opcode caching: Zend Optimizer+ (will be in PHP core in 5.5); APC; eAccelerator; Benchmark : PHP : 42.18 req/sec; PHP + Zend O+ : 206.20 req/sec; PHP + APC : 211.49 req/sec
  • 31. Caching storage - Disk: Data with few updates : good; Caching SQL queries : preferably not; DON'T use NFS or other network file systems : high latency, possible problem for sessions : locking issues !
  • 32. Caching storage - Memory disk (ramdisk): Usually faster than physical disk; But : OS file caching makes difference minimal (if you have enough memory)
  • 33. Caching storage - Disk / ramdisk: Overhead : filesystem; Limited number of files per directory → subdirectories; Local : 5 webservers → 5 local caches; How will you keep them synchronized ? → Don't say NFS or rsync !
  • 34. Caching storage - Memcache(d): Facebook, Twitter, YouTube, … → need we say more ?; Distributed memory caching system; Multiple machines ↔ 1 big memory-based hash-table; Key-value storage system; Keys - max. 250 bytes; Values - max. 1 MByte
  • 35. Caching storage - Memcache(d): Facebook, Twitter, YouTube, … → need we say more ?; Distributed memory caching system; Multiple machines ↔ 1 big memory-based hash-table; Key-value storage system; Keys - max. 250 bytes; Values - max. 1 MByte; Extremely fast... non-blocking, UDP (!)
  • 36. Memcache - where to install
  • 37. Memcache - where to install
  • 38. Memcache - installation & running it: Installation : distribution package, PECL, Windows : binaries; Running : no config-files
        memcached -d -m <mem> -l <ip> -p <port>
        ex. : memcached -d -m 2048 -l 172.16.1.91 -p 11211
  • 39. Caching storage - Memcache - some notes: Not fault-tolerant; It's a cache !; Lose session data; Lose shopping cart data; ...
  • 40. Caching storage - Memcache - some notes: Not fault-tolerant; It's a cache !; Lose session data; Lose shopping cart data; …; Different libraries : Original : libmemcache; New : libmemcached (consistent hashing, UDP, binary protocol, …); Firewall your Memcache port !
  • 41. Memcache in code
        <?php
        $memcache = new Memcache();
        $memcache->addServer('172.16.0.1', 11211);
        $memcache->addServer('172.16.0.2', 11211);
        $myData = $memcache->get('myKey');
        if ($myData === false) {
            $myData = GetMyDataFromDB();
            // Put it in Memcache as 'myKey', without compression (flag 0), with no expiration
            $memcache->set('myKey', $myData, 0, 0);
        }
        echo $myData;
    Careful : false is a valid value !
  • 42. Memcache in code
        <?php
        // The Memcached extension reports a miss through the result code,
        // so a stored false can be told apart from "not found".
        $memcached = new Memcached();
        $memcached->addServer('172.16.0.1', 11211);
        $memcached->addServer('172.16.0.2', 11211);
        $myData = $memcached->get('myKey');
        if ($memcached->getResultCode() === Memcached::RES_NOTFOUND) {
            $myData = GetMyDataFromDB();
            // Put it in Memcached as 'myKey', with no expiration
            $memcached->set('myKey', $myData, 0);
        }
        echo $myData;
  • 43. Benchmark with Memcache
                               Single webserver      Proxy
                               Static    PHP         Static    PHP
        Apache + PHP           3900      17.5        6700      17.5
        Apache + PHP + MC      3900      55          6700      108
  • 44. Where's the data ? Memcache client decides (!); 2 hashing algorithms : Traditional : server failure → all data must be rehashed; Consistent : server failure → 1/x of data must be rehashed (x = # of servers); No replication !
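The difference between the two hashing algorithms on slide 44 can be sketched in plain PHP, without a Memcache server. This is an illustration only, not how any particular client library implements it: the function names, the 100 points per server and the IP addresses are all assumptions, and it assumes 64-bit PHP so that crc32() is never negative.

```php
<?php
// Traditional: server index = hash(key) modulo the number of servers.
function moduloServer(string $key, array $servers): string
{
    return $servers[crc32($key) % count($servers)];
}

// Consistent: every server owns several points on a ring (hypothetical: 100 each).
function buildRing(array $servers, int $pointsPerServer = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $pointsPerServer; $i++) {
            $ring[crc32($server . '#' . $i)] = $server;
        }
    }
    ksort($ring); // order the points along the ring
    return $ring;
}

// A key belongs to the first server point at or after its own hash.
function ringServer(string $key, array $ring): string
{
    $hash = crc32($key);
    foreach ($ring as $point => $server) {
        if ($point >= $hash) {
            return $server;
        }
    }
    return reset($ring); // past the last point: wrap around to the first one
}

$before = ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4'];
$after  = ['10.0.0.1', '10.0.0.2', '10.0.0.3']; // 10.0.0.4 has failed
$ringBefore = buildRing($before);
$ringAfter  = buildRing($after);

$total = 2000;
$movedModulo = 0;
$movedRing = 0;
for ($i = 0; $i < $total; $i++) {
    $key = 'key_' . $i;
    if (moduloServer($key, $before) !== moduloServer($key, $after)) {
        $movedModulo++;
    }
    if (ringServer($key, $ringBefore) !== ringServer($key, $ringAfter)) {
        $movedRing++;
    }
}
printf("modulo: %d of %d keys moved, ring: %d of %d keys moved\n",
    $movedModulo, $total, $movedRing, $total);
```

With the ring, only keys that lived on the failed server change location; keys on the surviving servers stay where they are. With modulo hashing and a reasonably uniform hash, going from 4 to 3 servers relocates most keys, which is what the slide means by "all data must be rehashed".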
  • 45. Memcached ? Couchbase ? Redis ? … ?
                              Memcached          Couchbase                       Redis
        Purpose               Key-value store    Document store                  Key-value store
        Additional indexes    No                 Yes                             No
        Replication           No                 Master-master, master-slave     Master-slave
        Persistency           No                 Yes                             Yes
    Interesting read : http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis (Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison)
  • 46. Memcache slabs (or why Memcache says it's full when it's not): Multiple slabs of different sizes : Slab 1 : 400 bytes; Slab 2 : 480 bytes (400 * 1.2); Slab 3 : 576 bytes (480 * 1.2) (and so on...); Multiplier (1.2 here) can be configured; Store a lot of very large objects → large slabs : full → rest : free → eviction of data !
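The slab sizes on slide 46 follow a simple geometric progression. A small sketch that reproduces the slide's numbers and picks the slab an item would land in; the helper names are hypothetical, and a real memcached server may round or align the sizes differently:

```php
<?php
// Slab item sizes as on the slide: start at 400 bytes, multiply by 1.2 each step.
function slabSizes(int $first, float $factor, int $count): array
{
    $sizes = [];
    $size = $first;
    for ($i = 0; $i < $count; $i++) {
        $sizes[] = (int) round($size);
        $size *= $factor;
    }
    return $sizes;
}

// An item goes into the smallest slab whose item size can hold it.
// Slabs are numbered from 1, as on the slide; null means "too large for any slab".
function slabFor(int $itemBytes, array $sizes): ?int
{
    foreach ($sizes as $index => $size) {
        if ($itemBytes <= $size) {
            return $index + 1;
        }
    }
    return null;
}

$sizes = slabSizes(400, 1.2, 3); // 400, 480, 576
echo implode(', ', $sizes) . "\n";
echo "a 450-byte item goes to slab " . slabFor(450, $sizes) . "\n";
```

Because each item can only use memory from its own slab, a workload with many large objects can fill the large slabs and trigger evictions there while the small slabs still have free space.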
  • 47. Memcache - Is it working ? Connect to it using telnet; "stats" command →; Use Cacti or other monitoring tools
        STAT pid 2941
        STAT uptime 10878
        STAT time 1296074240
        STAT version 1.4.5
        STAT pointer_size 64
        STAT rusage_user 20.089945
        STAT rusage_system 58.499106
        STAT curr_connections 16
        STAT total_connections 276950
        STAT connection_structures 96
        STAT cmd_get 276931
        STAT cmd_set 584148
        STAT cmd_flush 0
        STAT get_hits 211106
        STAT get_misses 65825
        STAT delete_misses 101
        STAT delete_hits 276829
        STAT incr_misses 0
        STAT incr_hits 0
        STAT decr_misses 0
        STAT decr_hits 0
        STAT cas_misses 0
        STAT cas_hits 0
        STAT cas_badval 0
        STAT auth_cmds 0
        STAT auth_errors 0
        STAT bytes_read 613193860
        STAT bytes_written 553991373
        STAT limit_maxbytes 268435456
        STAT accepting_conns 1
        STAT listen_disabled_num 0
        STAT threads 4
        STAT conn_yields 0
        STAT bytes 20418140
        STAT curr_items 65826
        STAT total_items 553856
        STAT evictions 0
        STAT reclaimed 0
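The two numbers most people look at first in that output are the hit ratio and the eviction count. A minimal sketch that parses "STAT name value" lines and computes the hit ratio from the slide's figures; the parseStats() helper is my own, and the raw text is pasted in as a string rather than read from a live server:

```php
<?php
// Turn "STAT name value" lines into an associative array.
function parseStats(string $raw): array
{
    $stats = [];
    foreach (preg_split('/\R/', trim($raw)) as $line) {
        if (preg_match('/^STAT (\S+) (.+)$/', $line, $m)) {
            $stats[$m[1]] = $m[2];
        }
    }
    return $stats;
}

// A few lines copied from the slide.
$raw = "STAT cmd_get 276931\nSTAT get_hits 211106\nSTAT get_misses 65825\nSTAT evictions 0";
$stats = parseStats($raw);

// Hit ratio = hits / (hits + misses).
$hitRatio = $stats['get_hits'] / ($stats['get_hits'] + $stats['get_misses']);
printf("hit ratio: %.1f%%, evictions: %s\n", $hitRatio * 100, $stats['evictions']);
```

For the server on the slide this works out to roughly 76% hits with no evictions, so the cache is being used and is not running out of memory.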
  • 48. Memcache - backing up
  • 49. Memcache - deleting
        <?php
        $memcache = new Memcache();
        $memcache->delete('myKey');
  • 50. Memcache - caching a page
        <?php
        $output = $memcache->get('page_' . $page_id);
        if ($output === false) {
            // Not in cache : render the page into a buffer and store the result
            ob_start();
            GetMyPageInRegularWay($page_id);
            $output = ob_get_contents();
            ob_end_clean();
            $memcache->set('page_' . $page_id, $output, false, 600); // Cache 10 mins
        }
        echo $output;
  • 51. Memcache - tip: Page with multiple blocks ? → use Memcached::getMulti(); But : what if you get some hits and some misses ? (diagram: getMulti($array) → hashing algorithm)
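One way to handle the "some hits and some misses" case from slide 51 is to fetch everything in one call, then rebuild and store only the blocks that were missing. The sketch below uses a plain array as a stand-in for the cache so it runs without a server. cacheGetMulti(), renderBlock() and getBlocks() are hypothetical names; with the real extension the lookup would be Memcached::getMulti(), which returns only the keys it found.

```php
<?php
// Stand-in for Memcached::getMulti(): return only the keys present in the cache.
function cacheGetMulti(array $cache, array $keys): array
{
    return array_intersect_key($cache, array_flip($keys));
}

// Hypothetical renderer for a block that was not in the cache.
function renderBlock(string $key): string
{
    return '<div>fresh ' . $key . '</div>';
}

function getBlocks(array &$cache, array $keys): array
{
    $found = cacheGetMulti($cache, $keys);

    // Rebuild only the misses and store them for the next request.
    foreach (array_diff($keys, array_keys($found)) as $missingKey) {
        $found[$missingKey] = renderBlock($missingKey);
        $cache[$missingKey] = $found[$missingKey];
    }

    // Return the blocks in the order they were requested.
    $ordered = [];
    foreach ($keys as $key) {
        $ordered[$key] = $found[$key];
    }
    return $ordered;
}

$cache  = ['block_menu' => '<div>cached menu</div>'];
$blocks = getBlocks($cache, ['block_menu', 'block_news']);
echo implode("\n", $blocks) . "\n";
```

The page still costs one cache round trip in the best case, and only the missing blocks hit the original data source.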
  • 52. Naming your keys: Key names must be unique; Prefix / namespace your keys !; Only letters, numbers and underscore (why ? → change caching layer); md5() is useful → BUT : harder to debug; Use clear names; Document your key names !
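These naming rules are easy to enforce with a single helper that every cache call goes through. A possible sketch: cacheKey() is a hypothetical function, and the choice to hash only keys that exceed the 250-byte limit mentioned on slide 34 is my own, so that normal keys stay readable for debugging.

```php
<?php
function cacheKey(string $namespace, string $name): string
{
    // Keep only letters, numbers and underscore; everything else becomes "_".
    $clean = preg_replace('/[^A-Za-z0-9_]/', '_', $namespace . '__' . $name);

    // Keys are limited to 250 bytes: keep a readable prefix and hash the full key.
    if (strlen($clean) > 250) {
        $clean = substr($clean, 0, 200) . '_' . md5($clean);
    }
    return $clean;
}

echo cacheKey('ArticleDetails', 'Toshiba 32C100U 32-Inch') . "\n";
// ArticleDetails__Toshiba_32C100U_32_Inch
```

Keeping key construction in one place also gives you one place to document the key names.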
  • 53. Updating data
  • 54. Updating data: LCD_Popular_Product_List
  • 55. Adding/updating data
        $memcache->delete('ArticleDetails__Toshiba_32C100U_32_Inch');
        $memcache->delete('LCD_Popular_Product_List');
  • 56. Adding/updating data
  • 57. Adding/updating data - Why it crashed
  • 58. Adding/updating data - Why it crashed
  • 59. Adding/updating data - Why it crashed
  • 60. Cache stampeding
  • 61. Cache stampeding
  • 62. Memcache code ? (diagram: DB, Visitor interface, Admin interface, Memcache code)
  • 63. Cache warmup scripts: Used to fill your cache when it's empty; Run it before starting webserver !; 2 ways : visit all URLs (error-prone, hard to maintain); call all cache-updating methods; Make sure you have a warmup script !
  • 64. Cache stampeding - what about locking ? Seems like a nice idea, but...; While lock in place; What if the process that created the lock fails ?
  • 65. Quick word about expiration: General rule : don't let things expire; Exception to the rule : things that have an end date (calendar items)
  • 66. So... DON'T DELETE FROM CACHE & DON'T EXPIRE FROM CACHE (unless you know you'll never store it again)
  • 67. Time for... a break (15 min). After the break : Byebye Apache; Reverse proxying; The importance of frontend; lots of hints & tips...
  • 68. LAMP... → LAMMP → LANMMP
  • 69. Nginx: Web server; Reverse proxy; Lightweight, fast; 12.81% of all Websites
  • 70. Nginx: No threads, event-driven; Uses epoll / kqueue; Low memory footprint; 10000 active connections = normal
  • 71. Nginx - a true alternative to Apache ? Not all Apache modules : mod_auth_*, mod_dav*, …; Basic modules are available; Lots of 3rd party modules (needs recompilation !)
  • 72. Nginx - Installation: Packages; Win32 binaries → not for production !; Build from source (./configure; make; make install)
  • 73. Nginx - Configuration
        server {
            listen 80;
            server_name www.domain.ext *.domain.ext;
            index index.html;
            root /home/domain.ext/www;
        }
        server {
            listen 80;
            server_name photo.domain.ext;
            index index.html;
            root /home/domain.ext/photo;
        }
  • 74. Nginx - phase 1: Move Apache to a different port (8080); Put Nginx at port 80; Nginx serves all statics (images, css, js, …); Forward dynamic requests to Apache
  • 75. Nginx for static files only
        server {
            listen 80;
            server_name www.domain.ext;
            # Static files : served directly by Nginx, cached by the browser for 30 days
            location ~* ^.*\.(jpg|jpeg|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|pdf|ppt|txt|tar|rtf|js)$ {
                expires 30d;
                root /home/www.domain.ext;
            }
            # Everything else : forwarded to Apache on port 8080
            location / {
                proxy_pass http://www.domain.ext:8080;
                proxy_pass_header Set-Cookie;
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header Host $host;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            }
        }
  • 76. Nginx for PHP ? Bottleneck = PHP ? Keep it in Apache; Bottleneck = memory ? Go for it !; LANMMP to... LNMPP (ok, this is getting ridiculous)
  • 77. Nginx with PHP-FPM: Since PHP 5.3.3; Runs on port 9000; Nginx connects using fastcgi method
        location / {
            fastcgi_pass 127.0.0.1:9000;
            fastcgi_index index.php;
            include fastcgi_params;
            fastcgi_param SCRIPT_NAME $fastcgi_script_name;
            fastcgi_param SCRIPT_FILENAME /home/www.domain.ext/$fastcgi_script_name;
            fastcgi_param SERVER_NAME $host;
            fastcgi_intercept_errors on;
        }
  • 78. Nginx + PHP-FPM features: Graceful upgrade; Spawn new processes under high load; Chroot; Slow request log !
  • 79. Nginx + PHP-FPM features: Graceful upgrade; Spawn new processes under high load; Chroot; Slow request log !; fastcgi_finish_request() → offline processing
  • 80. Nginx + PHP-FPM - performance ?
                                  Single webserver      Proxy
                                  Static    PHP         Static    PHP
        Apache + PHP              3900      17.5        6700      17.5
        Apache + PHP + MC         3900      55          6700      108
        Nginx + PHP-FPM + MC      11700     57          11200     112
    Limit : single-threaded Apachebench
  • 81. Nginx + PHP-FPM - performance ?
                                  Single webserver      Proxy
                                  Static    PHP         Static    PHP
        Apache + PHP              3900      17.5        6700      17.5
        Apache + PHP + MC         3900      55          6700      108
        Nginx + PHP-FPM + MC      11700     57          11200     112
        Apache (tuned) + PHP/MC   10600     55          11400     108
    Limit : single-threaded Apachebench
  • 82. Reverse proxy time...
  • 83. Varnish: Not just a load balancer; Reverse proxy cache / http accelerator / …; Caches (parts of) pages in memory; Careful : uses threads (like Apache); Nginx usually scales better (but doesn't have VCL)
  • 84. Varnish - Installation & configuration: Installation : packages; source : ./configure && make && make install; Configuration : /etc/default/varnish; /etc/varnish/*.vcl
  • 85. Varnish - backends + load balancing
        backend server1 {
            .host = "192.168.0.10";
        }
        backend server2 {
            .host = "192.168.0.11";
        }
        director example_director round-robin {
            { .backend = server1; }
            { .backend = server2; }
        }
  • 86. Varnish - backends + load balancing
        backend server1 {
            .host = "192.168.0.10";
            .probe = {
                .url = "/";
                .interval = 5s;
                .timeout = 1s;
                .window = 5;
                .threshold = 3;
            }
        }
  • 87. Varnish - VCL: Varnish Configuration Language; DSL (Domain Specific Language) → compiled to C; Hooks into each request; Defines : backends (web servers), ACLs, load balancing strategy; Can be reloaded while running
  • 88. Varnish - whatever you want: Real-time statistics (varnishtop, varnishhist, ...); ESI
  • 89. Varnish - ESI: Perfect for caching pages
    Page /page/732 is built from : Header (TTL : 60 min) /top; Navigation (TTL : 60 min) /nav; Latest news (TTL : 2 min) /news; Article content (TTL : 15 min) /article/732
    In your /page/732 output :
        <esi:include src="/top"/>
        <esi:include src="/nav"/>
        <esi:include src="/news"/>
        <esi:include src="/article/732"/>
    In your Varnish config :
        sub vcl_fetch {
            if (req.url == "/news") {
                esi; /* Do ESI processing */
                set obj.ttl = 2m;
            } elseif (req.url == "/nav") {
                esi;
                set obj.ttl = 1m;
            } elseif ….
            ….
        }
  • 90. Varnish with ESI - hold on tight !
                                  Single webserver      Proxy
                                  Static    PHP         Static    PHP
        Apache + PHP              3900      17.5        6700      17.5
        Apache + PHP + MC         3900      55          6700      108
        Nginx + PHP-FPM + MC      11700     57          11200     112
        Varnish                   -         -           11200     4200
  • 91. Varnish - what can/can't be cached ? Can : static pages; images, js, css; pages or parts of pages that don't change often (ESI); Can't : POST requests; very large files (it's not a file server !); requests with Set-Cookie; user-specific content
  • 92. ESI → no caching on user-specific content ? (page mock-up: Logged in as : Wim Godden; 5 messages; TTL = 5min; TTL = 1h; TTL = 0s ?)
  • 93. Coming soon...: Based on Nginx; Reduces load by 50 – 95%; Requires code changes !; Well-built project → few changes; Effect on webservers and database servers
  • 94. Requesting /page (1sttime)NginxShared memory1234/page/page
  • 95. Requesting /page ESI subrequests (1sttime)Nginx123/menu/news/top (in ESI session)
  • 96. Requesting /page (next time)NginxShared memory12/page/menu/news/top (in ESI session)/page
  • 97. New message is sent...POST /sendDBinsert into...set(...)top (in ESI session)
  • 98. AdvantagesNo repeated GET hits to webserver anymore !At login : POST → warm up the cache !No repeated hits for user-specific contentNot even for non-specific content
  • 99. News addedaddnews() methodDBinsert into...set(...)Memcache key /news
  • 100. How many Memcache requests ?Logged in as : Wim Godden5 messages<scl:include key="news" src="/news" ttl="5m" /><scl:includekey="menu"src="/menu"ttl="1h" /><scl:include key="top" src="/top" session="true" ttl="1h" />
  • 101. First release : ESIPart of the ESI 1.0 specOnly relevant features implementedExtension for dynamic session supportBut : unavailable for copyright reasons
  • 102. Rebuilt from scratch : SCLSession-specific Caching LanguageLanguage details :Control structures : if/else, switch/case, foreachVariable handlingStrings : concatenation, substring, ...
  • 103. Whats the result ?
  • 104. FiguresSecond customer (already using Nginx + Memcache) :No. of web servers : 72 → 8No. of db servers : 15 → 4Total : 87 → 12 (86% reduction !)Latest customer :Total no. of servers : 1350 → 38072% reduction → €1.5 million / year
  • 105. A real example : vBulletin [chart : DB Server Load, Web Server Load and Max Requests/sec (1 = 282) for Standard install, With Memcached, Nginx + SCL + memcached]
  • 106. Availability
      - Good news :
        - It will become Open Source
        - It's solid : stable at 3 customers, being installed at 1 more
      - Bad news :
        - First customer holds copyrights
        - Total rebuild → Open Source release
        - Beta : Sep 2013
        - Final : End 2013 (on Github !)
  • 107. Time to tune...
  • 108. PHP speed - some tips
      - Upgrade PHP - every minor release has 5-15% speed gain !
      - Use an opcode cache (Zend O+, APC, eAccelerator, XCache)
      - Profile your code
        - XHProf
        - Xdebug
      - But : turn off profilers on acceptance/production platforms !
  • 109. KCachegrind is your friend
  • 110. PHP speed - some tips
      - Most performance issues are in DB queries → look there first !
      - Log PHP errors and review those logs !
      - Shorter code != faster code → keep your code readable !
      - Hardware cost < Manpower cost → 1 more server < 30 mandays of labor
      - Keep micro-optimizations in code = last thing on list
  • 111. Apache - tuning tips
      - Disable unused modules → fixes 10% of performance issues
      - Set AllowOverride to None. Enable only where needed !
      - Disable SymLinksIfOwnerMatch. Enable only where needed !
      - MinSpareServers, MaxSpareServers, StartServers, MaxClients, MPM selection → a whole session of its own ;-)
      - Don't mod_proxy → use Nginx or Varnish !
      - High load on an SSL-site ? → put SSL on a reverse proxy
  • 112. DB speed - some tips
      - Avoid dynamic functions
        Ex. : select id from calendar where startDate > curdate()
        Better : select id from calendar where startDate > "2013-05-14"
      - Use same types for joins, i.e. don't join decimal with int, bigint with int, etc.
      - RAND() is evil !
      - count(*) is evil in InnoDB without a where clause !
      - Persistent connect is sort-of evil
      - Index, index, index ! → But only on fields that are used in where, order by, group by !
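For the "avoid dynamic functions" tip, one way to get the "Better" query is to compute the date in the application and send it as a literal. A minimal sketch, assuming a hypothetical helper `calendar_query`; a commonly cited reason for the tip is that MySQL's query cache skips statements containing functions such as CURDATE(), while identical literal query text can be reused for the whole day.

```python
from datetime import date


def calendar_query(today=None):
    """Build the calendar query with the date as a literal, not curdate()."""
    if today is None:
        today = date.today()
    # isoformat() gives YYYY-MM-DD, the format used in the "Better" example.
    return 'select id from calendar where startDate > "%s"' % today.isoformat()


print(calendar_query(date(2013, 5, 14)))
```

The value comes from `date.isoformat()` and not from user input; anything user-supplied should go through bound parameters instead of string formatting.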
  • 113. Caching & Tuning @ frontend
      http://www.websiteoptimization.com/speed/tweak/average-web-page/
  • 114. Caching in the browser
      - HTTP 304 (Not modified)
      - Expires/Cache-Control header
      - 2 notes :
        - Don't use POST if you want to cache
        - Don't cache user-specific pages in browser (security !)
  • 115. HTTP 304
      - First request : browser sends no header ; server replies with Last-Modified: Fri, 28 Jan 2011 08:31:01 GMT
      - Next requests : browser sends If-Modified-Since: Fri, 28 Jan 2011 08:31:01 GMT ; server replies 200 OK / 304 Not Modified
  • 116. HTTP 304 with ETag
      - First request : browser sends no header ; server replies with ETag: 8a53321-4b-43f0b6dd972c0
      - Next requests : browser sends If-None-Match: 8a53321-4b-43f0b6dd972c0 ; server replies 200 OK / 304 Not Modified
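The two 304 exchanges above can be sketched as one server-side check. This is a simplified illustration and not a complete HTTP implementation: `make_etag` and `respond` are hypothetical names, the ETag is just a hash of the body, and the If-Modified-Since comparison is an exact string match, where a real server would parse the dates.

```python
import hashlib


def make_etag(body):
    # Any stable fingerprint of the content will do; a short hash is used here.
    return '"%s"' % hashlib.sha256(body).hexdigest()[:16]


def respond(request_headers, body, last_modified):
    """Return (status, headers, body) for a GET, honouring validators."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Last-Modified": last_modified}
    if "If-None-Match" in request_headers:
        # ETag validator takes precedence when the browser sends it.
        if request_headers["If-None-Match"] == etag:
            return 304, headers, b""
    elif request_headers.get("If-Modified-Since") == last_modified:
        return 304, headers, b""
    return 200, headers, body


stamp = "Fri, 28 Jan 2011 08:31:01 GMT"
status, headers, _ = respond({}, b"<html>page</html>", stamp)
print(status)
status, _, _ = respond({"If-None-Match": headers["ETag"]}, b"<html>page</html>", stamp)
print(status)
```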
  • 117. Expires/Cache-Control header
      - Cache-Control
        - HTTP/1.1
        - Seconds to expiry
        - Used by browsers
      - Expires
        - HTTP/1.0
        - Date to expire on
        - Used by old proxies
        - Requires clock to be accurate !
      - First request : browser sends no header ; server replies with Expires: Fri 29 Nov 2011 12:11:08 GMT and Cache-Control: max-age=86400
      - Next requests : no requests until item expires
  • 118. Pragma: no-cache = evil
      - "Pragma: no-cache" doesn't make it uncacheable
      - Don't want caching on a page ?
        - HTTP/1.0 : "Expires : Fri, 30 Oct 1998 00:00:00 GMT" (in the past)
        - HTTP/1.1 : "Cache-Control: no-store"
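A small helper can emit both header generations at once: Cache-Control for HTTP/1.1 clients and Expires for old HTTP/1.0 proxies. This is a sketch under assumptions; the function name `cache_headers` and the "zero or negative max-age means no caching" convention are made up for the example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Return Expires and Cache-Control headers for a response."""
    if now is None:
        now = datetime.now(timezone.utc)
    if max_age_seconds <= 0:
        # Not cacheable: no-store for HTTP/1.1, a date in the past for HTTP/1.0.
        return {
            "Cache-Control": "no-store",
            "Expires": "Fri, 30 Oct 1998 00:00:00 GMT",
        }
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        "Cache-Control": "max-age=%d" % max_age_seconds,
        # usegmt=True renders the zone as "GMT", as HTTP expects.
        "Expires": format_datetime(expires, usegmt=True),
    }


print(cache_headers(86400))
print(cache_headers(0))
```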
  • 119. Frontend tuning
      1. You optimize backend
      2. Frontend engineer messes up → havoc on backend
      3. Don't forget : frontend sends requests to backend !
      SO...
      - Care about frontend
      - Test frontend
      - Check what requests frontend sends to backend
  • 120. Tuning frontend
      - Minimize requests
        - Combine CSS/JavaScript files
        - Use inline images in CSS/XHTML (not supported on all browsers yet)
  • 121. Frontend tuning - inline CSS/XHTML images#navbar span {width: 31px;height: 31px;display: inline;float: left;margin-right: 4px;}.home { background-image:url(data:image/gif;base64,R0lGODlhHwAfAPcAAAAAAIxKAKVjCLW1tb29tcbGvc7OxtZ7ANbWztbW1tbe1t7e1uelMefn1ufn3ufn5+fv3u+MAO/v5+/v7/fGCPf35/f37//nY////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////........MEl0nGVUC6tObNnPceSFBaQVMJAxC4lo3gNOrUaFnTHoAxNm3XVxPfRq139e8BEGAjWD5bgIALw287T8AcAXLly2kjOACdc17higXSIKDO/Lpv7Qq4bw7APgBq8eOzX69InrZ6xe3dbxZffyTGkb8tdx8F+b0Xn2sFsCSBAgTM5lp63RHYnoHUudZgRgkGOGCB+43nGk4OGcQTabKx5dyJKJ7ImoUNCaRRAZYN1ppsrT3Y2gIwyjSQBAtUpABml/0IJGYd6VjQUDH9uBFkGxGm5I8dPQaRUAQUMBdhhBV25ZYUJZBcSAtSJBddWZZ5UAGPOTXlgkNVOSZdBxEwIkYu7VhYnAol5GaadRqF0Uaz0TgXnX2umVFyGakJUUAAADs=); margin-left: 4px; }<img border=0src="data:image/gif;base64,R0lGODlhHwAfAPcAAAAAAIxKAKVjCLW1tb29tcbGvc7OxtZ7ANbWztbW1tbe1t7e1uelMefn1ufn3ufn5+fv3u+MAO/v5+/v7/fGCPf35/f37//nY/......Uaz0TgXnX2umVFyGakJUUAAADs=">
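Inline images like the ones on this slide are just the image bytes, base64-encoded, behind a `data:` prefix. A minimal sketch of generating them at build time; `data_uri` and `css_rule` are hypothetical helper names, and the six bytes used below are only the GIF header, not a complete image.

```python
import base64


def data_uri(image_bytes, mime_type="image/gif"):
    """Encode raw image bytes as a data: URI usable in CSS or an img src."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


def css_rule(selector, image_bytes, mime_type="image/gif"):
    # Builds a rule of the same shape as the .home rule on the slide.
    return "%s { background-image: url(%s); }" % (
        selector, data_uri(image_bytes, mime_type))


# Every GIF starts with these bytes, which is why the slide's data begins with R0lGODlh.
print(data_uri(b"GIF89a"))
```

Base64 grows the payload by roughly a third, so this trades bytes for saved HTTP requests and is mostly worth it for small icons.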
  • 122. Tuning frontend
      - Minimize requests
        - Combine CSS/JavaScript files
        - Use inline images in CSS/XHTML (not supported on all browsers yet)
        - Use CSS Sprites
  • 123. CSS Sprites
  • 124. Tuning content - CSS sprites
  • 125. Tuning content - CSS sprites
      - 11 images : 11 HTTP requests, 24 KByte
      - 1 image : 1 HTTP request, 14 KByte
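With a sprite, each icon is the same image shown at a different background offset. The sketch below generates the CSS for a horizontal strip of equally sized icons; the function name `sprite_css`, the file name and the icon names are all made up for the example.

```python
def sprite_css(sprite_url, icon_names, icon_width, icon_height):
    """Generate one CSS rule per icon for a horizontal sprite strip."""
    rules = []
    for index, name in enumerate(icon_names):
        # Shift the strip left by one icon width per position.
        offset = -index * icon_width
        rules.append(
            ".%s { background: url(%s) %dpx 0 no-repeat; "
            "width: %dpx; height: %dpx; }"
            % (name, sprite_url, offset, icon_width, icon_height)
        )
    return "\n".join(rules)


print(sprite_css("sprite.png", ["home", "news", "contact"], 31, 31))
```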
  • 126. Tuning frontend
      - Minimize requests
        - Combine CSS/JavaScript files
        - Use inline images in CSS/XHTML (not supported on all browsers yet)
        - Use CSS Sprites (horizontally if possible)
      - Put CSS at top
      - Put JavaScript at bottom
        - Max. no. of connections
        - Especially if JavaScript does Ajax (advertising-scripts, …) !
      - Avoid iFrames
        - Again : max no. of connections
      - Don't scale images in HTML
      - Have a favicon.ico (don't 404 it !) → see my blog
  • 127. Tuning frontend
      - Don't use inline CSS/JavaScript
        - CSS/JavaScript need to be external files (minified, merged)
        - Why ? → Cacheable by browser / reverse proxy
      - Use GET for Ajax retrieval requests (and cache them !)
      - Optimize images (average 50-60% !)
      - Split requests across subdomains
        - Put statics on a separate subdomain (without cookies !)
      [diagram : www.domain.ext alone → max. 2 requests ; www.domain.ext + images.domain.ext → max. 2 requests each]
  • 128. Tuning miscellaneous
      - Avoid DNS lookups
        - Frontend : don't use too many subdomains (2 = ideal)
        - Backend :
          - Turn off DNS resolution in Apache : HostnameLookups Off
          - If your app uses external data
            - Run a local DNS cache (timeout danger !)
            - Make sure you can trust DNS servers (preferably run your own)
      - Compress non-binary content (GZIP)
        - mod_deflate in Apache
        - HttpGzipModule in Nginx (HttpGzipStaticModule for pre-zipped statics !)
        - No native support in Varnish
  • 129. What else can kill your site ?
      - Redirect loops
        - Multiple requests
        - More load on Webserver
        - More PHP to process
        - Additional latency for visitor
        - Try to avoid redirects anyway → In ZF : use $this->_forward instead of $this->_redirect
      - Watch your logs, but equally important...
      - Watch the logging process → Logging = disk I/O → can kill your server !
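What mod_deflate or the Nginx gzip module do for non-binary content can be approximated in a few lines. This sketch is an illustration, not the logic of either module: `compress_if_text` is a hypothetical name and the list of compressible content types is an assumption.

```python
import gzip

# Assumed set of text-like types; images and archives are already compressed.
COMPRESSIBLE_TYPES = ("application/javascript", "application/json")


def compress_if_text(body, content_type, accept_encoding):
    """Gzip text-like bodies when the client advertises gzip support."""
    compressible = (content_type.startswith("text/")
                    or content_type in COMPRESSIBLE_TYPES)
    if compressible and "gzip" in accept_encoding:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}


html = b"<li>menu item</li>" * 200
compressed, headers = compress_if_text(html, "text/html", "gzip, deflate")
print(len(html), len(compressed), headers)
```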
  • 130. Above all else... be prepared !
      - Have a monitoring system
      - Use a cache abstraction layer (disk → Memcache)
      - Don't install for the worst → prepare for the worst
      - Have a test-setup
      - Have fallbacks → Turn off non-critical functionality
  • 131. So...
      - Cache
        - But : never delete, always push !
        - Have a warmup script
        - Monitor your cache
        - Have an abstraction layer
      - Apache = fine, Nginx = better
      - Static pages ? Use Varnish
      - Tune your frontend → impact on backend !
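A cache abstraction layer means the application talks to one small interface and the backend (disk today, Memcache later) can be swapped behind it. A minimal sketch with hypothetical class names; `DictBackend` is an in-memory stand-in for a Memcache client, so no real Memcache library is used here.

```python
import hashlib
import os
import tempfile


class DictBackend:
    """In-memory stand-in for a Memcache client (illustration only)."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value


class DiskBackend:
    """Stores each value as a text file named after a hash of the key."""

    def __init__(self, directory=None):
        self._directory = directory or tempfile.mkdtemp()

    def _path(self, key):
        name = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return os.path.join(self._directory, name)

    def get(self, key):
        try:
            with open(self._path(key), "r", encoding="utf-8") as handle:
                return handle.read()
        except FileNotFoundError:
            return None

    def set(self, key, value):
        with open(self._path(key), "w", encoding="utf-8") as handle:
            handle.write(value)


class Cache:
    """The only class application code needs to know about."""

    def __init__(self, backend):
        self._backend = backend

    def get_or_build(self, key, build):
        value = self._backend.get(key)
        if value is None:
            value = build()              # cache miss: build once, then store
            self._backend.set(key, value)
        return value


cache = Cache(DiskBackend())             # later: Cache(DictBackend()) or a real client
print(cache.get_or_build("/menu", lambda: "<ul><li>Home</li></ul>"))
```

Moving from disk to Memcache then means changing the one line that constructs the backend.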
  • 132. Questions ?
  • 134. Contact
      - Twitter : @wimgtr
      - Web : http://techblog.wimgodden.be
      - Slides : http://www.slideshare.net/wimg
      - E-mail : wim.godden@cu.be
      - Please... Rate my talk : http://joind.in/8195
  • 135. Thanks ! Please... Rate my talk : http://joind.in/8195