Caching and tuning fun for high scalability @ PHPTour

Caching has been a 'hot' topic for a few years. But caching takes more than merely taking data and putting it in a cache: the right caching techniques can improve performance and reduce load significantly. We'll also look at some major pitfalls, showing that caching the wrong way can bring down your site. If you're looking for a clear explanation of various caching techniques and tools like Memcached, Nginx and Varnish, as well as ways to deploy them efficiently, this talk is for you. In this tutorial, we'll start from a Zend Framework based site. We'll add caching, begin to add servers and replace the standard LAMP stack, all while performing live benchmarks.

Presentation Transcript

  • Caching and tuning fun for high scalability Wim Godden Cu.be Solutions
  • Who am I ?
    • Wim Godden (@wimgtr)
    • Owner of Cu.be Solutions (http://cu.be)
    • PHP developer since 1997
    • Developer of OpenX
    • Zend Certified Engineer
    • Zend Framework Certified Engineer
    • MySQL Certified Developer
  • Who are you ?
    • Developers ?
    • System/network engineers ?
    • Managers ?
    • Caching experience ?
  • Goals of this tutorial
    • Everything about caching and tuning
    • A few techniques
      • How-to
      • How-NOT-to
    • -> Increase reliability, performance and scalability
    • 5 visitors/day -> 5 million visitors/day
    • (Don't expect miracle cure !)
  • LAMP
  • Architecture
  • Our base benchmark
    • Apachebench = useful enough
    • Result ?
  • Caching
  • What is caching ?
      SELECT * FROM article JOIN user ON article.user_id = user.id ORDER BY created DESC LIMIT 10
  • Theory of caching
      [Diagram: the application checks the cache first; if ($data == false), it fetches the data from the DB]
  • Caching techniques
      #1 : Store entire pages
      #2 : Store part of a page (block)
      #3 : Store data retrieval (SQL ?)
      #4 : Store complex processing result
      #? : Your call !
      When you have data, think :
    • Creation time ?
    • Modification frequency ?
    • Retrieval frequency ?
  • How to find cacheable data
    • New projects : start from 'cache everything'
    • Existing projects :
      • Look at MySQL slow query log
      • Make a complete query log (don't forget to turn it off !)
      • Check page loading times
  • Caching storage - MySQL query cache
    • Use it
    • Don't rely on it
    • Bad if you have :
      • lots of insert/update/delete
      • lots of different queries
  • Caching storage - Disk
    • Data with few updates : good
    • Caching SQL queries : preferably not
    • DON'T use NFS or other network file systems
      • especially for sessions
      • locking issues !
      • high latency
  • Caching storage - Disk / ramdisk
    • Local
      • 5 Webservers -> 5 local caches
      • -> Hard to scale
      • How will you keep them synchronized ?
        • -> Don't say NFS or rsync !
  • Caching storage - Memcache
    • Facebook, Twitter, Slashdot, … -> need we say more ?
    • Distributed memory caching system
    • Multiple machines ↔ 1 big memory-based hash-table
    • Key-value storage system
      • Keys - max. 250 bytes
      • Values - max. 1 MB
    • Extremely fast... non-blocking, UDP (!)
  • Memcache - where to install
  • Memcache - installation & running it
    • Installation
      • Distribution package
      • PECL
      • Windows : binaries
    • Running
      • No config-files
      • memcached -d -m <mem> -l <ip> -p <port>
      • ex. : memcached -d -m 2048 -l 172.16.1.91 -p 11211
  • Caching storage - Memcache - some notes
    • Not fault-tolerant
      • It's a cache !
      • Lose session data
      • Lose shopping cart data
    • Firewall your Memcache port !
  • Memcache in code
      <?php
      $memcache = new Memcache();
      $memcache->addServer('172.16.0.1', 11211);
      $memcache->addServer('172.16.0.2', 11211);
      $myData = $memcache->get('myKey');
      if ($myData === false) {
          $myData = GetMyDataFromDB();
          // Put it in Memcache as 'myKey', without compression, with no expiration
          $memcache->set('myKey', $myData, false, 0);
      }
      echo $myData;
  • Where's the data ?
    • Memcache client decides (!)
    • 2 hashing algorithms :
      • Traditional
        • Server failure -> all data must be rehashed
      • Consistent
        • Server failure -> 1/x of data must be rehashed (x = # of servers)
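The difference between the two hashing strategies can be made visible with a small simulation. The sketch below is not how any particular Memcache client implements it: the helper names, the use of md5 as the hash and the 100 ring points per server are all assumptions made for illustration. It counts how many keys end up on a different server when one of four servers disappears.

```php
<?php
// Illustrative sketch only: function names, the md5-based hash and the
// number of ring points per server are assumptions, not a client's real code.

/** Map a string to a 32-bit integer (first 8 hex digits of its md5). */
function hash32(string $s): int
{
    return (int) hexdec(substr(md5($s), 0, 8));
}

/** Traditional hashing: server index = hash(key) modulo the server count. */
function moduloServer(string $key, array $servers): string
{
    return $servers[hash32($key) % count($servers)];
}

/** Build a sorted hash ring with several points per server. */
function buildRing(array $servers, int $pointsPerServer = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $pointsPerServer; $i++) {
            $ring[hash32($server . '#' . $i)] = $server;
        }
    }
    ksort($ring);
    return $ring;
}

/** Consistent hashing: first ring point at or after hash(key), wrapping around. */
function ringServer(string $key, array $ring): string
{
    $hash = hash32($key);
    foreach ($ring as $point => $server) {
        if ($point >= $hash) {
            return $server;
        }
    }
    return reset($ring); // past the last point: wrap to the first one
}

/** Count the keys that map to a different server before and after a change. */
function countMoved(array $keys, callable $before, callable $after): int
{
    $moved = 0;
    foreach ($keys as $key) {
        if ($before($key) !== $after($key)) {
            $moved++;
        }
    }
    return $moved;
}

$all  = ['172.16.0.1', '172.16.0.2', '172.16.0.3', '172.16.0.4'];
$less = array_slice($all, 0, 3); // simulate the failure of the last server

$keys = [];
for ($i = 0; $i < 5000; $i++) {
    $keys[] = 'key' . $i;
}

$ringAll  = buildRing($all);
$ringLess = buildRing($less);

$movedModulo = countMoved($keys, fn($k) => moduloServer($k, $all), fn($k) => moduloServer($k, $less));
$movedRing   = countMoved($keys, fn($k) => ringServer($k, $ringAll), fn($k) => ringServer($k, $ringLess));

printf("traditional: %d of %d keys moved\n", $movedModulo, count($keys));
printf("consistent:  %d of %d keys moved\n", $movedRing, count($keys));
```

With the ring, only the keys that lived on the failed server move, which is the "1/x" from the slide; with modulo hashing most keys change server and the cache is effectively cold again.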
  • Benchmark with Memcache
  • Memcache slabs
      (or why Memcache says it's full when it's not)
    • Multiple slabs of different sizes :
      • Slab 1 : 400 bytes
      • Slab 2 : 480 bytes (400 * 1.2)
      • Slab 3 : 576 bytes (480 * 1.2) (and so on...)
    • Multiplier (1.2 here) can be configured
    • Store a lot of very large objects
    • -> Large slabs : full
    • -> Rest : free
    • -> Eviction of data !
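The slab sizes on the slide follow from the start size and the multiplier. As a rough sketch (the function names are hypothetical, and the rounding and alignment that the real allocator applies are ignored), the chunk sizes and the slab an item would land in can be computed like this:

```php
<?php
/**
 * Compute the first $count slab chunk sizes from a start size and a multiplier.
 * Simplified sketch: the real allocator's alignment rules are not modelled.
 */
function slabSizes(int $firstSize, float $factor, int $count): array
{
    $sizes = [];
    $size = $firstSize;
    for ($i = 0; $i < $count; $i++) {
        $sizes[] = $size;
        $size = (int) round($size * $factor);
    }
    return $sizes;
}

/** Return the smallest chunk size that can hold the item, or null if none fits. */
function slabFor(int $itemBytes, array $sizes): ?int
{
    foreach ($sizes as $size) {
        if ($itemBytes <= $size) {
            return $size;
        }
    }
    return null;
}

$sizes = slabSizes(400, 1.2, 5);
print_r($sizes);

// A 450-byte item does not fit in a 400-byte chunk, so it uses a 480-byte one.
echo slabFor(450, $sizes), "\n";
```

Because every item can only go into the slab whose chunk size fits it, a workload of mostly large objects fills the large slabs and triggers evictions there while the small slabs sit empty.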
  • Memcache - Is it working ?
    • Connect to it using telnet
      • "stats" command ->
          STAT pid 2941
          STAT uptime 10878
          STAT time 1296074240
          STAT version 1.4.5
          STAT pointer_size 64
          STAT rusage_user 20.089945
          STAT rusage_system 58.499106
          STAT curr_connections 16
          STAT total_connections 276950
          STAT connection_structures 96
          STAT cmd_get 276931
          STAT cmd_set 584148
          STAT cmd_flush 0
          STAT get_hits 211106
          STAT get_misses 65825
          STAT delete_misses 101
          STAT delete_hits 276829
          STAT incr_misses 0
          STAT incr_hits 0
          STAT decr_misses 0
          STAT decr_hits 0
          STAT cas_misses 0
          STAT cas_hits 0
          STAT cas_badval 0
          STAT auth_cmds 0
          STAT auth_errors 0
          STAT bytes_read 613193860
          STAT bytes_written 553991373
          STAT limit_maxbytes 268435456
          STAT accepting_conns 1
          STAT listen_disabled_num 0
          STAT threads 4
          STAT conn_yields 0
          STAT bytes 20418140
          STAT curr_items 65826
          STAT total_items 553856
          STAT evictions 0
          STAT reclaimed 0
    • Use Cacti or other monitoring tools
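The numbers worth watching in the stats output are the hit ratio, the evictions and the memory in use. A minimal sketch of deriving them, assuming the raw "STAT name value" text has already been fetched (the function names are hypothetical; only a few lines of the output above are repeated here):

```php
<?php
/** Parse "STAT name value" lines into an associative array. */
function parseStats(string $raw): array
{
    $stats = [];
    foreach (preg_split('/\R/', trim($raw)) as $line) {
        $parts = explode(' ', trim($line), 3);
        if (count($parts) === 3 && $parts[0] === 'STAT') {
            $stats[$parts[1]] = $parts[2];
        }
    }
    return $stats;
}

/** Hit ratio = get_hits / (get_hits + get_misses); null if there were no gets. */
function hitRatio(array $stats): ?float
{
    $hits   = (int) ($stats['get_hits'] ?? 0);
    $misses = (int) ($stats['get_misses'] ?? 0);
    $total  = $hits + $misses;
    return $total > 0 ? $hits / $total : null;
}

// A few lines taken from the stats output shown in the slide.
$raw = "STAT get_hits 211106\n"
     . "STAT get_misses 65825\n"
     . "STAT evictions 0\n"
     . "STAT bytes 20418140\n"
     . "STAT limit_maxbytes 268435456";

$stats = parseStats($raw);
printf("hit ratio:   %.1f%%\n", 100 * hitRatio($stats));
printf("memory used: %.1f%%\n", 100 * (int) $stats['bytes'] / (int) $stats['limit_maxbytes']);
printf("evictions:   %d\n", (int) $stats['evictions']);
```

A falling hit ratio or a rising eviction count is usually the first sign that the cache is too small or that the slab distribution does not match the data.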
  • Memcache - backing up
  • Memcache - tip
      Page with multiple blocks ? -> use Memcached::getMulti()
      Warning : what if you get some hits and some misses ?
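One way to deal with a mix of hits and misses is to rebuild only the blocks that are missing. The sketch below does not talk to a real server: the $hits array simulates the key => value array of found items that Memcached::getMulti() returns, and fillMisses() and the rebuild callback are hypothetical names.

```php
<?php
/**
 * Combine cache hits with freshly rebuilt values for the misses.
 * $hits has the shape returned by Memcached::getMulti(): found keys only.
 * $rebuild is a hypothetical callback that regenerates a single block.
 */
function fillMisses(array $keys, array $hits, callable $rebuild): array
{
    $blocks = [];
    foreach ($keys as $key) {
        if (array_key_exists($key, $hits)) {
            $blocks[$key] = $hits[$key];    // hit: use the cached block
        } else {
            $blocks[$key] = $rebuild($key); // miss: regenerate this block only
        }
    }
    return $blocks;
}

$keys = ['header', 'news', 'footer'];

// Simulated getMulti() result: 'news' was not in the cache.
$hits = ['header' => '<div>header</div>', 'footer' => '<div>footer</div>'];

$rebuilt = [];
$blocks = fillMisses($keys, $hits, function (string $key) use (&$rebuilt) {
    $rebuilt[] = $key; // real code would also write the block back to the cache
    return "<div>fresh $key</div>";
});

echo implode("\n", $blocks), "\n";
```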
  • Updating data
  • Updating data LCD_Popular_Product_List
  • Adding/updating data
      $memcache->delete('LCD_Popular_Product_List');
  • Adding/updating data - Why it crashed
  • Cache stampeding
  • Memcache code ? DB
  • Cache warmup scripts
    • Used to fill your cache when it's empty
    • Run it before starting Webserver !
    • 2 ways :
      • Visit all URLs
        • Error-prone
        • Hard to maintain
      • Call all cache-updating methods
    • Make sure you have a warmup script !
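The second approach, calling all cache-updating methods, could look roughly like the sketch below. The updater names and the plain array standing in for the cache are assumptions; in a real project each callable would be one of the application's own cache-updating methods.

```php
<?php
/**
 * Run every registered cache-updating callable and collect the failures,
 * so that one broken updater does not stop the rest of the warmup.
 */
function warmUp(array $updaters): array
{
    $failed = [];
    foreach ($updaters as $name => $update) {
        try {
            $update();
        } catch (Throwable $e) {
            $failed[$name] = $e->getMessage();
        }
    }
    return $failed;
}

$cache = []; // plain array standing in for the real cache

$failed = warmUp([
    'popular_products' => function () use (&$cache) {
        $cache['LCD_Popular_Product_List'] = ['tv1', 'tv2']; // hypothetical data
    },
    'broken_example' => function () {
        throw new RuntimeException('DB unavailable');
    },
]);

foreach ($failed as $name => $message) {
    echo "warmup failed for $name: $message\n";
}
```

Running such a script before the web server starts accepting traffic means the first visitors hit a filled cache instead of the database.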
  • Cache stampeding - what about locking ?
      Seems like a nice idea, but...
    • While lock in place
    • What if the process that created the lock fails ?
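A common answer to the failed lock holder problem is to give the lock an expiry time, so that it disappears on its own. The sketch below illustrates that idea and is not the approach recommended in the talk. ArrayCache is a hypothetical in-memory stand-in whose method names mirror Memcached's add/get/set/delete; the extra $now argument replaces the real clock so that expiry can be simulated.

```php
<?php
/** Hypothetical in-memory cache; $now is passed in to simulate the clock. */
class ArrayCache
{
    private array $items = [];

    public function get(string $key, int $now)
    {
        if (!isset($this->items[$key])) {
            return false;
        }
        [$value, $expiresAt] = $this->items[$key];
        if ($expiresAt !== 0 && $expiresAt <= $now) {
            unset($this->items[$key]); // entry has expired
            return false;
        }
        return $value;
    }

    public function set(string $key, $value, int $ttl, int $now): bool
    {
        $this->items[$key] = [$value, $ttl > 0 ? $now + $ttl : 0];
        return true;
    }

    /** Store only if the key is absent: the "first one wins" step a lock needs. */
    public function add(string $key, $value, int $ttl, int $now): bool
    {
        if ($this->get($key, $now) !== false) {
            return false;
        }
        return $this->set($key, $value, $ttl, $now);
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

/** Return the cached value, rebuild it under a lock, or null if someone else is rebuilding. */
function getWithLock(ArrayCache $cache, string $key, callable $rebuild, int $now, int $lockTtl = 30)
{
    $value = $cache->get($key, $now);
    if ($value !== false) {
        return $value; // cache hit
    }
    if (!$cache->add('lock:' . $key, 1, $lockTtl, $now)) {
        return null; // lock is held: caller decides whether to wait or serve stale data
    }
    $value = $rebuild();
    $cache->set($key, $value, 0, $now);
    $cache->delete('lock:' . $key);
    return $value;
}

$cache = new ArrayCache();
$calls = 0;
$rebuild = function () use (&$calls) {
    $calls++;
    return ['tv1', 'tv2'];
};

// A process takes the lock at t=1000 and then crashes without releasing it.
$cache->add('lock:LCD_Popular_Product_List', 1, 30, 1000);

$during = getWithLock($cache, 'LCD_Popular_Product_List', $rebuild, 1010); // lock still held
$after  = getWithLock($cache, 'LCD_Popular_Product_List', $rebuild, 1031); // lock has expired
```

The expiry bounds how long a dead process can block rebuilding, but requests arriving while the lock is held still need something to serve, which is the other open question on the slide.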
  • LAMP...
      -> LAMMP -> LNMMP
  • Nginx
    • Web server
    • Reverse proxy
    • Lightweight, fast
    • 8.3% of all Websites
  • Nginx
    • No threads, event-driven
    • Uses epoll / kqueue
    • Low memory footprint
    • 10000 active connections = normal
  • Nginx - a true alternative to Apache ?
    • Not all Apache modules
      • mod_auth_*
      • mod_dav*
    • Basic modules are available
    • Some 3rd-party modules (need recompilation !)
  • Nginx - Configuration
      server {
          listen 80;
          server_name www.domain.ext *.domain.ext;
          index index.html;
          root /home/domain.ext/www;
      }
      server {
          listen 80;
          server_name photo.domain.ext;
          index index.html;
          root /home/domain.ext/photo;
      }
  • Nginx + PHP-FPM - performance ?
  • Reverse proxy time...
  • Varnish
    • Not just a load balancer
    • Reverse proxy cache / http accelerator / …
    • Caches (parts of) pages in memory
    • Careful :
      • uses threads (like Apache)
      • Nginx usually scales better (but doesn't have VCL)
  • Varnish - backends + load balancing
      backend server1 {
          .host = "192.168.0.10";
      }
      backend server2 {
          .host = "192.168.0.11";
      }
      director example_director round-robin {
          { .backend = server1; }
          { .backend = server2; }
      }
  • Varnish - VCL
    • Varnish Configuration Language
    • DSL (Domain Specific Language)
      • -> compiled to C
    • Hooks into each request
    • Defines :
      • Backends (web servers)
      • ACLs
      • Load balancing strategy
    • Can be reloaded while running
  • Varnish - whatever you want
    • Real-time statistics (varnishtop, varnishhist, ...)
    • ESI
  • Varnish - ESI
      Perfect for caching pages
      In your article page output :
          <esi:include src="/news"/>
      In your Varnish config :
          sub vcl_fetch {
              if (req.url == "/news") {
                  esi; /* Do ESI processing */
                  set obj.ttl = 2m;
              } elseif (req.url == "/nav") {
                  esi;
                  set obj.ttl = 1m;
              } elseif …
                  …
          }
  • Varnish with ESI - hold on tight !
  • Varnish - what can/can't be cached ?
    • Can :
      • Static pages
      • Images, js, css
      • Pages or parts of pages that don't change often (ESI)
    • Can't :
      • POST requests
      • Very large files (it's not a file server !)
      • Requests with Set-Cookie
      • User-specific content
  • ESI -> no caching on user-specific content ?
      [Page mock-up: blocks with TTL = 5min and TTL = 1h, plus a user-specific block "Logged in as : Wim Godden, 5 messages" with TTL = 0s ?]
  • Under development
    • Release date
      • Beta : Dec 2011
      • Stable : Feb 2012
  • Tuning
  • Caching storage - Opcode caching
  • KCachegrind is your friend
  • PHP speed - some tips
    • Upgrade PHP - every minor release has 5-15% speed gain !
    • Use an opcode cache
    • Profile your code
      • XHProf
      • Xdebug
    • But : turn off profilers on production platforms !
  • DB speed - some tips
    • Use same types for joins
      • i.e. don't join decimal with int
    • RAND() is evil !
    • count(*) is evil in InnoDB without a where clause !
      • (and there are other examples of specific things to avoid)
    • Persistent connect is not always good !
  • Caching & Tuning @ frontend http://www.websiteoptimization.com/speed/tweak/average-web-page/
  • Frontend tuning
      1. You optimize the backend
      2. Frontend engineers mess up -> havoc on backend
      3. Don't forget : frontend sends requests to backend !
      SO...
    • Care about frontend
    • Test frontend
    • Check what requests frontend sends to backend
  • CSS Sprites
  • Tuning content - CSS sprites
    • 11 images : 11 HTTP requests, 24 KB
    • 1 image : 1 HTTP request, 14 KB
  • Tuning frontend
    • Minimize requests
      • Combine CSS/JavaScript files
      • Use CSS Sprites (horizontally if possible)
    • Put CSS at top
    • Put JavaScript at bottom
      • Max. no. of connections
      • Especially if JavaScript does Ajax (advertising-scripts, …) !
    • Avoid iFrames
      • Again : max no. of connections
    • Don't scale images in HTML
    • Have a favicon.ico (don't 404 it !)
      • -> see my blog
  • What else can kill your site ?
    • Redirect loops
      • Multiple requests
        • More load on Webserver
        • More PHP to process
      • Additional latency for visitor
      • Try to avoid redirects anyway
      • -> In ZF : use $this->_forward instead of $this->_redirect
    • Watch your logs, but equally important...
    • Watch the logging process ->
    • Logging = disk I/O -> can kill your server !
    • Slashdot effect
  • Above all else... be prepared !
    • Have a monitoring system
    • Use a cache abstraction layer (disk -> Memcache)
    • Don't install for the worst -> prepare for the worst
    • Have a test-setup
    • Have fallbacks
      • -> Turn off non-critical functionality
    • Questions ?
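The "cache abstraction layer (disk -> Memcache)" advice can be sketched as a small interface with interchangeable backends. Everything below is illustrative: the interface, class and function names are hypothetical, and ArrayBackend merely stands in for a Memcache-backed class so that the example runs without a server.

```php
<?php
/** Minimal cache interface (hypothetical); application code depends only on this. */
interface CacheBackend
{
    /** @return mixed the cached value, or false on a miss */
    public function get(string $key);

    public function set(string $key, $value): bool;
}

/** Disk-backed implementation: one serialized file per key. */
class FileBackend implements CacheBackend
{
    private string $dir;

    public function __construct(string $dir)
    {
        $this->dir = $dir;
        if (!is_dir($dir)) {
            mkdir($dir, 0700, true);
        }
    }

    private function path(string $key): string
    {
        return $this->dir . '/' . md5($key) . '.cache';
    }

    public function get(string $key)
    {
        $path = $this->path($key);
        return is_file($path) ? unserialize(file_get_contents($path)) : false;
    }

    public function set(string $key, $value): bool
    {
        return file_put_contents($this->path($key), serialize($value), LOCK_EX) !== false;
    }
}

/** In-process implementation, standing in here for a Memcache-backed class. */
class ArrayBackend implements CacheBackend
{
    private array $items = [];

    public function get(string $key)
    {
        return $this->items[$key] ?? false;
    }

    public function set(string $key, $value): bool
    {
        $this->items[$key] = $value;
        return true;
    }
}

/** Read-through helper: the calling code never knows which backend is in use. */
function cached(CacheBackend $cache, string $key, callable $load)
{
    $value = $cache->get($key);
    if ($value === false) {
        $value = $load();
        $cache->set($key, $value);
    }
    return $value;
}

$loads = 0;
$load = function () use (&$loads) {
    $loads++; // counts how often the "database" is hit
    return ['tv1', 'tv2'];
};

$backends = [
    new FileBackend(sys_get_temp_dir() . '/cache_demo_' . uniqid()),
    new ArrayBackend(),
];

$results = [];
foreach ($backends as $backend) {
    $results[] = cached($backend, 'LCD_Popular_Product_List', $load); // miss: loads the data
    $results[] = cached($backend, 'LCD_Popular_Product_List', $load); // hit: served from cache
}
```

Switching from disk to Memcache then means writing one more class that implements the interface, without touching the code that calls cached().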
  • Contact
    • Twitter @wimgtr
    • Web http://techblog.wimgodden.be
    • Slides http://www.slideshare.net/wimg
    • E-mail [email_address]
    • Please...
    • Rate my talk : http://joind.in/4361
    • Thanks !