Caching and tuning fun for high scalability Wim Godden Cu.be Solutions
Who am I ? Wim Godden (@wimgtr)
Owner of Cu.be Solutions (http://cu.be)
PHP developer since 1997
Developer of OpenX
Zend Certified Engineer
Zend Framework Certified Engineer
MySQL Certified Developer
Who are you ? Developers ?
System/network engineers ?
Managers ?
Caching experience ?
Goals of this tutorial
Everything about caching and tuning
A few techniques
How-to
How-NOT-to
-> Increase reliability, performance and scalability
5 visitors/day -> 5 million visitors/day
(Don't expect miracle cure !)
LAMP
Architecture
Our base benchmark Apachebench = useful enough
Result ?
Caching
What is caching ?
select * from article join user on article.user_id = user.id order by created desc limit 10
Theory of caching
[Diagram: if ($data == false) -> fetch from DB]
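The check-then-fetch flow on this slide can be sketched in plain PHP. This is only an illustration under assumptions: `ArrayCache` and `cached()` are hypothetical names, the array-based cache stands in for a real backend, and `$loader` represents whatever slow call produces the data.

```php
<?php
// Hypothetical in-process cache; a real setup would use Memcache, disk, ...
class ArrayCache
{
    private $items = [];

    // Returns the cached value, or false on a miss or an expired entry.
    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return false;
        }
        [$value, $expiresAt] = $this->items[$key];
        if ($expiresAt !== 0 && $expiresAt < time()) {
            unset($this->items[$key]);
            return false;
        }
        return $value;
    }

    // $ttl is in seconds; 0 means "never expire".
    public function set(string $key, $value, int $ttl = 0): void
    {
        $this->items[$key] = [$value, $ttl === 0 ? 0 : time() + $ttl];
    }
}

// Look in the cache first; only call the slow loader on a miss.
function cached(ArrayCache $cache, string $key, int $ttl, callable $loader)
{
    $data = $cache->get($key);
    if ($data === false) {
        $data = $loader();
        $cache->set($key, $data, $ttl);
    }
    return $data;
}

$cache = new ArrayCache();
$dbCalls = 0;
$loader = function () use (&$dbCalls) {
    $dbCalls++;                        // stands in for the expensive query
    return ['Article 1', 'Article 2'];
};

$first  = cached($cache, 'latest_articles', 60, $loader); // miss: calls the loader
$second = cached($cache, 'latest_articles', 60, $loader); // hit: served from cache
echo "DB calls: $dbCalls\n";
```

Note the strict `=== false` check: with a loose `== false`, a cached empty array or `0` would look like a miss and trigger the loader every time.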
Caching techniques
#1 : Store entire pages
#2 : Store part of a page (block)
#3 : Store data retrieval (SQL ?)
#4 : Store complex processing result
#? : Your call !
When you have data, think :
Creation time ?
Modification frequency ?
Retrieval frequency ?
How to find cacheable data New projects : start from 'cache everything'
Existing projects : Look at MySQL slow query log
Make a complete query log (don't forget to turn it off !)
Check page loading times
Caching storage - MySQL query cache Use it
Don't rely on it
Bad if you have : lots of insert/update/delete
lots of different queries
Caching storage - Disk Data with few updates : good
Caching SQL queries : preferably not
DON'T use NFS or other network file systems, especially for sessions
locking issues !
high latency
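For data with few updates, a local file cache could look like the sketch below. The function names and the use of `sys_get_temp_dir()` as cache location are assumptions made for illustration. Writing to a temporary file and renaming it into place is one common way to keep readers from seeing half-written files on a local filesystem.

```php
<?php
// Hypothetical helpers; cache files live in the system temp directory here.
function cachePath(string $key): string
{
    return sys_get_temp_dir() . '/cache_' . md5($key) . '.ser';
}

// Returns the cached value, or null if missing or older than $maxAge seconds.
function fileCacheGet(string $key, int $maxAge)
{
    $path = cachePath($key);
    if (!is_file($path) || filemtime($path) < time() - $maxAge) {
        return null;
    }
    $raw = file_get_contents($path);
    return $raw === false ? null : unserialize($raw);
}

// Write to a temp file first, then rename it into place.
function fileCacheSet(string $key, $value): bool
{
    $tmp = tempnam(sys_get_temp_dir(), 'cache_tmp_');
    if ($tmp === false || file_put_contents($tmp, serialize($value)) === false) {
        return false;
    }
    return rename($tmp, cachePath($key));
}

fileCacheSet('sidebar_block', '<ul><li>Top story</li></ul>');
echo fileCacheGet('sidebar_block', 300), "\n";
```

Each webserver would still hold its own copy, so this fits block or page fragments that may differ briefly between servers, not sessions.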
Caching storage - Disk / ramdisk
Local
5 webservers -> 5 local caches
-> Hard to scale
How will you keep them synchronized ? -> Don't say NFS or rsync !
Caching storage - Memcache Facebook, Twitter, Slashdot, … -> need we say more ?
Distributed memory caching system
Multiple machines ↔ 1 big memory-based hash-table
Key-value storage system Keys - max. 250bytes
Values - max. 1Mbyte
Extremely fast: non-blocking I/O, UDP supported (!)
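The key and value limits matter once keys are built from long strings such as SQL statements. A minimal sketch of a guard, assuming hypothetical helper names `safeKey()` and `fitsInCache()` (neither is part of any Memcache client):

```php
<?php
const MAX_KEY_BYTES = 250;        // key limit from the slide
const MAX_VALUE_BYTES = 1048576;  // 1 Mbyte value limit from the slide

// Hypothetical helper: keep short, clean keys as-is; hash anything else.
// Whitespace and control characters are not allowed in Memcache keys.
function safeKey(string $key): string
{
    $hasBadChars = preg_match('/[\s\x00-\x1f\x7f]/', $key) === 1;
    if (strlen($key) <= MAX_KEY_BYTES && !$hasBadChars) {
        return $key;
    }
    return 'h_' . sha1($key);     // 42 bytes, always within the limit
}

// Rough pre-check only: the real limit also counts key and item overhead.
function fitsInCache($value): bool
{
    return strlen(serialize($value)) <= MAX_VALUE_BYTES;
}

$sql = 'select * from article join user on article.user_id = user.id '
     . 'order by created desc limit 10';

echo safeKey('user_42'), "\n";   // unchanged
echo safeKey($sql), "\n";        // hashed, because it contains spaces
```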
Memcache - where to install
Memcache - installation & running it
Installation
Distribution package
PECL
Windows : binaries
Running
No config-files
memcached -d -m <mem> -l <ip> -p <port>
ex. : memcached -d -m 2048 -l 172.16.1.91 -p 11211
Caching storage - Memcache - some notes Not fault-tolerant It's a cache !
Lose session data
Lose shopping cart data
...
Firewall your Memcache port !
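Since the cache can lose its content at any moment, data that must survive belongs in a durable store, with the cache holding only a copy. A small sketch of that idea, where plain arrays are hypothetical stand-ins for the database and for Memcache:

```php
<?php
// Hypothetical stand-ins: arrays play the role of the database and of Memcache.
$database = [];
$cache = [];

// Write to the durable store first, then refresh the cached copy.
function saveCart(array &$database, array &$cache, string $userId, array $cart): void
{
    $database[$userId] = $cart;
    $cache["cart_$userId"] = $cart;
}

// Read from the cache; on a miss, fall back to the database and re-fill.
function loadCart(array &$database, array &$cache, string $userId): array
{
    $key = "cart_$userId";
    if (!isset($cache[$key])) {
        $cache[$key] = $database[$userId] ?? [];
    }
    return $cache[$key];
}

saveCart($database, $cache, '42', ['book' => 1, 'pen' => 3]);
$cache = [];                                // simulate a restart or an eviction
$cart = loadCart($database, $cache, '42');  // still there, thanks to the database
echo count($cart), " items in cart\n";
```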
Memcache in code
<?php
$memcache = new Memcache();
$memcache->addServer('172.16.0.1', 11211);
$memcache->addServer('172.16.0.2', 11211);

$myData = $memcache->get('myKey');
if ($myData === false) {
    // Cache miss: fetch the data from the database
    $myData = GetMyDataFromDB();
    // Put it in Memcache as 'myKey', without compression, with no expiration
    $memcache->set('myKey', $myData, 0, 0);
}
echo $myData;
Where's the data ? Memcache client decides (!)
2 hashing algorithms :
Traditional : server failure -> all data must be rehashed
Consistent : server failure -> 1/x of data must be rehashed (x = # of servers)
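The difference between the two algorithms can be made visible with a small simulation. This is a simplified sketch, not the code of any Memcache client: the md5-based `hashPoint()`, the 100 virtual points per server, the server addresses and the key names are all assumptions, and it expects 64-bit PHP.

```php
<?php
// Map a string to a 32-bit integer position (md5-based, purely illustrative).
function hashPoint(string $s): int
{
    return hexdec(substr(md5($s), 0, 8));
}

// Traditional: server index = hash modulo number of servers.
function moduloServer(string $key, array $servers): string
{
    return $servers[hashPoint($key) % count($servers)];
}

// Consistent: place several virtual points per server on a sorted ring.
function buildRing(array $servers, int $points = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $points; $i++) {
            $ring[hashPoint("$server#$i")] = $server;
        }
    }
    ksort($ring);
    return $ring;
}

// A key belongs to the first ring point at or after its own position.
function ringServer(string $key, array $ring): string
{
    $h = hashPoint($key);
    foreach ($ring as $point => $server) {
        if ($point >= $h) {
            return $server;
        }
    }
    return reset($ring); // past the last point: wrap around to the first
}

$servers   = ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.4']; // hypothetical
$survivors = array_slice($servers, 0, 3);                       // one server fails
$ringBefore = buildRing($servers);
$ringAfter  = buildRing($survivors);

$movedModulo = 0;
$movedRing = 0;
for ($i = 0; $i < 1000; $i++) {
    $key = "article_$i";
    if (moduloServer($key, $servers) !== moduloServer($key, $survivors)) {
        $movedModulo++;
    }
    if (ringServer($key, $ringBefore) !== ringServer($key, $ringAfter)) {
        $movedRing++;
    }
}
printf("Traditional: %d of 1000 keys moved\n", $movedModulo);
printf("Consistent : %d of 1000 keys moved\n", $movedRing);
```

With the ring, only keys that lived on the failed server change place. With the PECL memcache extension the strategy is selected through the `memcache.hash_strategy` ini setting (`standard` or `consistent`).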
Benchmark with Memcache
Memcache slabs (or why Memcache says it's full when it's not)
Multiple slabs of different sizes :
Slab 1 : 400 bytes
Slab 2 : 480 bytes (400 * 1.2)
Slab 3 : 576 bytes (480 * 1.2)
(and so on...)
Multiplier (1.2 here) can be configured
Store a lot of very large objects
-> Large slabs : full
-> Rest : free
-> Eviction of data !
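The slab arithmetic from the slide can be reproduced with a few lines of PHP. This is a simplified model under assumptions: `slabSizes()` and `slabFor()` are hypothetical helpers, the 400-byte starting size and the 1.2 multiplier are the slide's example values, and real Memcache also counts key and item overhead.

```php
<?php
// Build slab sizes: each is the previous one times $num/$den, rounded up.
// Integer math avoids floating-point surprises (1.2 is passed as 12/10).
function slabSizes(int $first, int $num, int $den, int $max): array
{
    $sizes = [];
    for ($size = $first; $size <= $max; $size = intdiv($size * $num + $den - 1, $den)) {
        $sizes[] = $size;
    }
    return $sizes;
}

// An item is stored in the smallest slab that can hold it.
function slabFor(int $itemBytes, array $sizes): ?int
{
    foreach ($sizes as $size) {
        if ($itemBytes <= $size) {
            return $size;
        }
    }
    return null; // larger than the biggest slab
}

$sizes = slabSizes(400, 12, 10, 1048576);
echo implode(', ', array_slice($sizes, 0, 3)), "\n"; // 400, 480, 576

$slab = slabFor(450, $sizes);
echo "450-byte item -> $slab-byte slab, ", $slab - 450, " bytes unused\n";
```

Memory assigned to one slab size is not handed back to the others, which is why many large objects can fill the large slabs and trigger evictions while small slabs sit free. In memcached itself the multiplier is set with the `-f` option.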

Caching and tuning fun for high scalability @ PHPTour