Cache is King!
(Or How I Learned To Stop Worrying
and Love the RAM)
Cache is King
• What is Memcaching?
• How Does it Work?
• Setup Overview.
• How To Namespace the Right Way.
• Avoiding Cache Rushing.
• What Should I Cache?
Agenda
Cache is King
• a big ol' hash table
• RAM, not disk
• alleviates database (and other I/O) load to
speed up page load times
• stores data objects as key->value pairs
• open source (BSD license)
• distributed
What is Memcaching?
Cache is King
• A database
• Redundant
• Locking
• A backup
• Highly available
• Limitless in size
• Namespaced (more later)
What Memcaching is NOT?
Cache is King
• 2 hashes
• client hashes key against server list to find
server
• client then sends request to server
• server does hash key lookup against slabs of
keys
• Server returns the object
How Does it Work?
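The first of the two hashes (client picks a server) can be sketched in a few lines. This is a minimal Python illustration under stated assumptions, not how any particular client library does it: the server addresses, the `pick_server` name, and the CRC32-modulo scheme are all made up for the example. Real clients often use consistent hashing instead, so that adding a server remaps fewer keys.

```python
import zlib

# Hypothetical server list; the addresses are placeholders for illustration.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]

def pick_server(key, servers=SERVERS):
    """Hash the key and map it onto one entry of the server list."""
    digest = zlib.crc32(key.encode("utf-8"))  # unsigned 32-bit int in Python 3
    return servers[digest % len(servers)]

# The same key always maps to the same server, so every client that shares
# the same server list agrees on where a value lives.
print(pick_server("user:42:profile"))
```

The second hash (key lookup inside the chosen server) happens on the server side and is not modelled here.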
Cache is King
• When setting a value you can avoid the 250
byte limit imposed for keys by md5-hashing
them.
• If your max page size is 1MB are you limited to
1 MB size chunks? Kinda – but there is
compression available.
• You can increase or decrease page size and/or
growth factor, but there is a performance hit.
Storage
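The md5 trick for long keys might look like the sketch below. The `cache_key` helper and the `MAX_KEY_BYTES` constant are hypothetical names for this example; only the 250-byte limit comes from the slide.

```python
import hashlib

MAX_KEY_BYTES = 250  # key length limit quoted on the slide

def cache_key(raw_key):
    """Return a fixed-length key that always fits under the limit."""
    return hashlib.md5(raw_key.encode("utf-8")).hexdigest()

long_key = "report:" + "x" * 500   # far over 250 bytes
hashed = cache_key(long_key)
print(len(long_key), len(hashed))  # 507 32
```

The same helper has to be used for both set and get, since the original key cannot be recovered from the hash.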
Cache is King
• Default minimum chunk size is 48 bytes
• Default growth factor is 1.25
• 48 bytes required for storage overhead
• Each slab is a 1 MB page containing same-size
chunks
• Non-contiguous – there is space wasted
Chunks and Slabs and Growth Factors
Cache is King
• Need to store a 1001-byte piece (953 bytes of data + 48 bytes of overhead).
• Multiply by the growth factor until the chunk size is
greater than 1001 bytes
88 x 1.25^11 ≈ 1024.45, rounded up to 1025
• Look for the slab of 1025 byte pieces.
• Create a new slab if it does not exist
• Store the value in a chunk (note 24 bytes wasted)
• 250 byte limit to key (so use md5 to set and get)
Example
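The arithmetic on this slide can be checked with a short sketch. It follows the simplified model used in the example (start at 88 bytes, multiply by 1.25 until the item fits, round up at the end). The `chunk_size_for` name is hypothetical, and a real memcached server also aligns each chunk size as it builds its slab classes, so the sizes it reports may differ from this model.

```python
import math

def chunk_size_for(item_bytes, smallest_chunk=88, growth_factor=1.25):
    """Grow the chunk size by the growth factor until the item fits.

    Returns (chunk size rounded up to whole bytes, number of growth steps).
    """
    size = float(smallest_chunk)
    steps = 0
    while size < item_bytes:
        size *= growth_factor
        steps += 1
    return math.ceil(size), steps

item = 953 + 48                      # data plus 48 bytes of overhead = 1001
chunk, steps = chunk_size_for(item)
print(chunk, steps, chunk - item)    # 1025 11 24
```

The last number is the 24 bytes wasted inside the chunk.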
Cache is King
• There is no room left for our 1025-byte chunk
AND we have initialized enough slabs already
to fill the memcached space – what now?
• LRU (least recently used) within the 1025-byte slabs
• LRU is not global.
• Evict the LRU 1025-byte chunk
Eviction
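To make "LRU is not global" concrete, here is a toy model of a single slab class. It is only a sketch: the `SlabClass` name and the tiny `max_chunks` value are assumptions for the example, and the real server tracks this very differently internally. The point is that eviction only ever looks at items in the same size class.

```python
from collections import OrderedDict

class SlabClass:
    """A toy slab class holding a fixed number of same-size chunks."""

    def __init__(self, chunk_size, max_chunks):
        self.chunk_size = chunk_size
        self.max_chunks = max_chunks
        self.items = OrderedDict()  # least recently used item first

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # a hit makes the item most recently used
        return self.items[key]

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        elif len(self.items) >= self.max_chunks:
            self.items.popitem(last=False)  # evict this class's LRU item only
        self.items[key] = value

slab_1025 = SlabClass(chunk_size=1025, max_chunks=2)
slab_1025.set("a", "A")
slab_1025.set("b", "B")
slab_1025.get("a")            # "a" is now more recently used than "b"
slab_1025.set("c", "C")       # class is full, so "b" is evicted
print(list(slab_1025.items))  # ['a', 'c']
```

An older item sitting in a different slab class would survive, even if it had not been touched for much longer than "b".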
Cache is King
• Memcached is not language specific
• Memcached runs as a service on the OS
• PHP has a library to connect to it
• Grab an API (or write your own) to wrap the
functionality in PHP for your needs.
In Practice
Cache is King
get
add
replace
delete
What functions do I need to wrap?
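A wrapper around those four calls could start out like the sketch below. It is written in Python against a plain dict so it runs on its own; the `CacheWrapper` class is hypothetical, expiry times are left out, and in PHP the dict would be replaced by calls into the memcache library. The one thing worth noticing is the difference between add (only stores when the key is absent) and replace (only stores when the key is present).

```python
class CacheWrapper:
    """Thin wrapper exposing the four operations, backed by a plain dict."""

    def __init__(self):
        self._store = {}  # stand-in for a real memcached connection

    def get(self, key):
        return self._store.get(key)  # None signals a miss

    def add(self, key, value):
        if key in self._store:       # add never overwrites
            return False
        self._store[key] = value
        return True

    def replace(self, key, value):
        if key not in self._store:   # replace never creates
            return False
        self._store[key] = value
        return True

    def delete(self, key):
        if key in self._store:
            del self._store[key]
            return True
        return False

cache = CacheWrapper()
print(cache.add("greeting", "hello"))   # True
print(cache.add("greeting", "hi"))      # False
print(cache.replace("greeting", "hi"))  # True
print(cache.get("greeting"))            # hi
print(cache.delete("greeting"))         # True
print(cache.get("greeting"))            # None
```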
Cache is King
• NOT natively supported
• BUT completely doable
Namespaces
Cache is King
• Do not store an array of keys as a
representation of a namespace.
• Does not scale well
• You MUST iterate through all in order
to expire namespace
Namespaces the Wrong Way
Cache is King
• Requires at least 1 additional round trip to
the hash table (using namespace, key and
value)
– store a key->value pair of namespace and key
– then store a key->value pair of key and value
• Assign an ID for a namespace (or retrieve
the existing one if it already exists)
Namespaces the Right Way
Cache is King
• Setting involves:
– Get namespace->id (generate and set with no expire if
new namespace)
– Set key->id (no expire)
– then set the key->value (with expire)
• Getting involves:
– Get id by namespace (namespace->id)
– If it does not exist then "miss"
– If success then get the key->id
– Then get the key->value (may still "miss")
Namespaces the Right Way
Cache is King
• Deleting (expiring) involves:
– "Increment" the value of namespace->id
– Since we have stored the data as COMPRESSED
rather than INT it will invalidate this chunk
– Now simply trying to get something by key and
namespace will "miss" because the first get for
namespace fails
– Note the subsequent key->value pairs are not
deleted – they merely become inaccessible, will
turn LRU and eventually get evicted.
Namespaces the Right Way
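The set / get / expire flow can be sketched as follows. Note that this is a swapped-in variant, not the exact bookkeeping from the slides: instead of a separate key->id entry, the namespace id is embedded directly in the stored key, and expiring simply bumps the id to a new value. The effect is the same (old entries become unreachable and age out), but the mechanism is an assumption, as are the `ns_set`, `ns_get` and `ns_expire` names. A dict stands in for memcached and expiry times are ignored.

```python
store = {}  # stand-in for memcached; expiry times are ignored in this sketch

def _namespace_id(namespace, create=False):
    """Look up namespace->id, optionally creating it for a new namespace."""
    ns_key = "ns:" + namespace
    if ns_key not in store and create:
        store[ns_key] = 1
    return store.get(ns_key)

def _real_key(namespace, ns_id, key):
    return "%s:%s:%s" % (namespace, ns_id, key)

def ns_set(namespace, key, value):
    ns_id = _namespace_id(namespace, create=True)
    store[_real_key(namespace, ns_id, key)] = value

def ns_get(namespace, key):
    ns_id = _namespace_id(namespace)
    if ns_id is None:
        return None  # unknown namespace: miss
    return store.get(_real_key(namespace, ns_id, key))  # may still miss

def ns_expire(namespace):
    ns_key = "ns:" + namespace
    if ns_key in store:
        store[ns_key] += 1  # old entries are now unreachable

ns_set("users", "42", {"name": "Ann"})
print(ns_get("users", "42"))   # {'name': 'Ann'}
ns_expire("users")
print(ns_get("users", "42"))   # None
print("users:1:42" in store)   # True: not deleted, just inaccessible
```

Expiring the namespace costs one write no matter how many keys it holds, which is what makes this scale better than iterating over an array of keys.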
Cache is King
• 2 processes (A and B) want the same thing, which has expired
• A "misses" so it goes off to get it from the database
• B then "misses" so it also goes off to get it (A isn't done)
• A and B are both hitting the DB for the same thing
• A and B both intend to "replace" to cache when they return
• If you are doing millions of page hits a day = big troubles!
• NOT optimal! - indeed can be a bottleneck on common
and/or expensive values
Cache Rush
Cache is King
• Store your value as an array or object with a "real" expire
included, then cache with extra time added on.
• The item will be retrieved regardless 'cuz you stored it with
extra time
• A can add time to the internal expiry and do a cache
replace before going to process data
• B checks cache - hey it's not expired ('cuz A added time!)
• A comes back and does another replace with
– new data
– new internal expiry
– new cache expiry
Cache Rush Solution
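Here is one way the A-and-B story above might be coded. Treat it as a sketch under assumptions: a dict stands in for memcached, the longer cache-level expiry is not modelled (the item is simply always retrievable), `GRACE_SECONDS` and the function names are hypothetical, and the check-then-replace step is not atomic here, so a real implementation would still need to guard that small window. To keep the demo single-threaded, process B is simulated by a lookup made while A is in the middle of its rebuild.

```python
import time

store = {}          # stand-in for memcached; cache-level expiry is not modelled
GRACE_SECONDS = 30  # hypothetical extra time bought by the first process

def rush_set(key, value, ttl, now=None):
    now = time.time() if now is None else now
    store[key] = {"value": value, "soft_expiry": now + ttl}

def rush_get(key, rebuild, ttl, now=None):
    """Return a cached value, letting only the first caller rebuild a stale item."""
    now = time.time() if now is None else now
    item = store.get(key)
    if item is not None and now < item["soft_expiry"]:
        return item["value"]  # fresh, or someone else is already rebuilding
    if item is not None:
        # A adds time to the internal expiry and "replaces" before doing the work.
        item["soft_expiry"] = now + GRACE_SECONDS
        store[key] = item
    value = rebuild()  # the expensive trip to the database
    rush_set(key, value, ttl, now)
    return value

calls = []

def expensive():
    calls.append("db")
    # While A is rebuilding, B asks for the same key and is served the old value.
    calls.append(rush_get("report", expensive, ttl=60, now=101))
    return "new"

rush_set("report", "old", ttl=60, now=0)
print(rush_get("report", expensive, ttl=60, now=100))  # new
print(calls)                                           # ['db', 'old']
```

Only one "db" entry shows up in `calls`: B never reached the database.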
Cache is King
• start with low-hanging fruit
– Big static HTML pages
– objects that change seldom (long expiry dates –
more likely to update before expired)
– objects that change more often (careful with
expiration dates now – much more important)
What Should I Cache?
Cache is King
• Now the big stuff
– dynamic HTML (i.e. with user content) cached as
static with placeholders for dynamic content
– static HTML fragments cached short term for the
dynamic content
– finally - user specific data that is expensive, but
used often (store as fragments or objects then
replace placeholders in cached html)
What Should I Cache?
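The placeholder idea can be shown in miniature. The `{{name}}` placeholder syntax and the `render` helper are assumptions for this sketch; any marker that cannot appear in the real page content would do. The page shell would come from the cache as static HTML, and the fragments from their own, shorter-lived cache entries.

```python
# Cached once as static HTML, shared by every user.
cached_page = "<h1>Welcome, {{username}}</h1><div>{{cart}}</div>"

def render(page, fragments):
    """Swap each placeholder in the cached page for its per-user fragment."""
    for name, html in fragments.items():
        page = page.replace("{{" + name + "}}", html)
    return page

print(render(cached_page, {"username": "Ann", "cart": "3 items"}))
# <h1>Welcome, Ann</h1><div>3 items</div>
```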
Cache is King
• David Engel
• davidengel.dev@gmail.com
• http://winnipegphp.com
• http://www.meetup.com/Winnipeg-PHP/
• http://www.linkedin.com/groups/PHP-Winnipeg-3874131
Closing
Cache is King
# yum install memcached
# chkconfig memcached on
# /etc/init.d/memcached start
OR
# service memcached start
# setenforce 0
# setsebool -P httpd_can_network_memcache 1
# setenforce 1
CentOS Notes
Cache is King
# yum install php-pecl-memcache
# vi /etc/sysconfig/memcached
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="512"
OPTIONS=""
CentOS Notes
Cache is King
# pkg_add -i memcached-1.4.13
# pkg_add -i pecl-memcache-3.0.6p1
# ln -sf /etc/php-5.3.sample/memcache.ini /etc/php-5.3/memcache.ini
# vi /etc/rc.local
# Start memcached
if [ -x /usr/local/bin/memcached ]; then
    echo -n ' memcached'
    /usr/local/bin/memcached -m 1024M -d -u _memcached -P /var/run/memcached.pid
fi
OpenBSD Notes
