memcached

 scaling your website
 with memcached


 by: steve yen
about me

• Steve Yen

 • NorthScale
 • Escalate Software
 • Kiva Software
what you’ll learn

•   what, where, why, when



• how
 • especially, best practices
“mem cache dee”

• latest version 1.4.1

• http://code.google.com/p/memcached
open source
distributed cache
livejournal
helps your websites run
          fast
popular
simple
KISS
easy
small bite-sized steps


• not a huge, forklift replacement
  rearchitecture / reengineering project
fast
“i only block for
  memcached”
scalable
many client libraries
• might be TOO many
• the hit list...
 • Java ==> spymemcached
 • C ==> libmemcached
 • Python, Ruby, etc ==> libmemcached wrappers
frameworks

• rails
• django
• spring / hibernate
• cakephp, symfony, etc
applications

• drupal
• wordpress
• mediawiki
• etc
it works

it promises to solve performance problems


                               it delivers!
problem?
your website is too
      slow
RDBMS melting down
urgent! emergency
one server


web app + RDBMS
1 + 1 servers

     web app


     RDBMS
N + 1 servers

web app, web app, web app, web app


             RDBMS
RDBMS
EXPLAIN PLAN?
buy a bigger box
buy better disks
master write DB +
multiple read DB?
vertical partitioning?
sharding?
uh oh, big reengineering

• risky!

• touch every line of code, every query!!
and, it’s 2AM
you need a band-aid
a simple band-aid now
use a cache
keep things in memory!
don’t hit disk
distributed cache


• to avoid wasting memory
don’t write one of
  these yourself
memcached
simple API


• hash-table-ish
your code before



v = db.query( SOME SLOW QUERY )
your code after

v = memcachedClient.get(key)
if (!v) {
    v = db.query( SOME SLOW QUERY )
    memcachedClient.set(key, v)
}
cache read-heavy stuff
invalidate when writing


• db.execute(“UPDATE foo WHERE ...”)
• memcachedClient.delete(...)
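The read path and the invalidate-on-write path above can be sketched together. This is a minimal sketch, not a real client: FakeMemcached and FakeDb are hypothetical in-process stand-ins, and the "price:" key prefix and the get_price / set_price helpers are made up for illustration.

```python
class FakeMemcached:
    """Dict-backed stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None signals a cache miss

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class FakeDb:
    """Stand-in for the RDBMS; counts how often the slow query runs."""

    def __init__(self):
        self.prices = {"foo": 10}
        self.slow_queries = 0

    def query_price(self, name):
        self.slow_queries += 1
        return self.prices[name]

    def update_price(self, name, price):
        self.prices[name] = price


def get_price(cache, db, name):
    key = "price:" + name
    v = cache.get(key)
    if v is None:                  # miss: fall back to the database
        v = db.query_price(name)
        cache.set(key, v)          # populate for the next reader
    return v


def set_price(cache, db, name, price):
    db.update_price(name, price)   # write to the source of truth first
    cache.delete("price:" + name)  # then invalidate the cached copy


cache, db = FakeMemcached(), FakeDb()
first = get_price(cache, db, "foo")    # miss, hits the database
second = get_price(cache, db, "foo")   # hit, served from the cache
set_price(cache, db, "foo", 12)
third = get_price(cache, db, "foo")    # miss again after invalidation
```

The sketch tests for None rather than falsiness, so a cached 0 or empty list still counts as a hit.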
and, repeat

• each day...
 • look for the next slowest operations
 • add code to cache a few more things
your life gets better
thank you memcached!
no magic
you are in control
now for the decisions
memcached adoption

• first, start using memcached
 • poorly
   • but you can breathe again
memcached adoption


• next, start using memcached correctly
memcached adoption

• later
 • queueing
 • persistence
 • replication
 • ...
an early question
where to run servers?
answer 1

• right on your web servers

• a great place to start, if you have extra
  memory
servers

web app web app web app web app
memcached   memcached   memcached   memcached




                  RDBMS
add up your memory
        usage!


• having memcached server swap == bad!
answer 2

• run memcached right on your database
  server?


• WRONG!
answer 3
• run memcached on separate dedicated
  memcached servers


• congratulations!
 • you either have enough money
 • or enough traffic that it matters
running a server

• daemonize

• don’t be root!

• no security
server lists

• mc-server1:11211
• mc-server2:11211
• mc-server3:11211
consistent hashing




 source: http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/
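A rough sketch of the idea behind consistent hashing, assuming a simple ring with 100 points per server and MD5 used only to place points on the ring. Real clients differ in details; the HashRing class, the fourth server name, and the user:N keys are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, servers, points_per_server=100):
        self._ring = []  # sorted list of (point, server)
        for server in servers:
            for i in range(points_per_server):
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()
        self._points = [p for p, _ in self._ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]


servers = ["mc-server1:11211", "mc-server2:11211", "mc-server3:11211"]
keys = [f"user:{i}" for i in range(1000)]

before = HashRing(servers)
after = HashRing(servers + ["mc-server4:11211"])  # hypothetical new node

# Keys whose server changed once the fourth node joined the ring.
moved = [k for k in keys if before.server_for(k) != after.server_for(k)]

# For comparison: plain "hash mod N" placement when N goes from 3 to 4.
naive_moved = [k for k in keys if HashRing._hash(k) % 3 != HashRing._hash(k) % 4]
```

With the ring, the only keys that move are the ones the new server takes over. With modulo placement most keys change servers, which shows up as a wave of cache misses.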
client-side intelligence


• no “server master” bottleneck
libmemcached

• fast C memcached client
 • supports consistent hashing
 • many wrappers to your favorite languages
updating server lists

• push out new configs and restart?

• moxi
 • memcached + integrated proxy
keys

• no whitespace
• 250 char limit
• use short prefixes
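One way to enforce these key rules in one place is a small key-building helper. This is a sketch under assumptions: make_key is a hypothetical helper, the ":" separator is a convention rather than a requirement, and the length is checked in bytes.

```python
MAX_KEY_LENGTH = 250  # limit quoted on the slide


def make_key(prefix, *parts):
    """Join a short prefix and id parts into a key, rejecting unusable keys."""
    key = ":".join([prefix, *(str(p) for p in parts)])
    if any(ch.isspace() for ch in key):
        raise ValueError(f"key contains whitespace: {key!r}")
    if len(key.encode("utf-8")) > MAX_KEY_LENGTH:
        raise ValueError(f"key longer than {MAX_KEY_LENGTH} bytes")
    return key


user_key = make_key("u", 42)            # "u:42"
cart_key = make_key("cart", 42, "v2")   # "cart:42:v2"
```

Keys built this way stay human-readable, which keeps per-key debugging and stats meaningful.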
keys & MD5

• don’t

• stats become useless
values
• any binary object

• 1MB limit
 • change #define & recompile if you want more
 • and you’re probably doing something wrong if you want more
values
• query resultset
• serialized object
• page fragment
• pages
• etc
nginx + memcached
>1 language?

• JSON
• protocol buffers
• XML
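If clients in more than one language read the same values, a language-neutral encoding keeps the cache from being tied to one runtime's native serializer. A minimal sketch using JSON; the product record is made up.

```python
import json

product = {"id": 7, "name": "widget", "price_cents": 1299}

# Encode to bytes before calling set(); a client in any language can decode this.
payload = json.dumps(product, separators=(",", ":")).encode("utf-8")

# What a reader (possibly written in another language) does after get().
decoded = json.loads(payload.decode("utf-8"))
```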
memcached is lossy


• memcached WILL lose data
that’s a good thing




         remember, it’s a CACHE
why is memcached
      lossy?
memcached node dies
when node restarts...

• you just get a bunch of cache misses
                    (and a short RDBMS spike)
eviction


more disappearing data!
LRU


• can config memcached to not evict
 • but, you’re probably doing something
    wrong if you do this
remember, it forgets


• it’s just a CACHE
expiration

• aka, timeouts

• memcached.set(key, value, timeout)
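To make the timeout behaviour concrete, here is a toy in-process model of set-with-timeout. ExpiringCache is hypothetical and not a real client, and the clock is injected so the example runs without sleeping.

```python
import time


class ExpiringCache:
    """Toy model of set(key, value, timeout); 0 means no expiration."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, deadline or None)

    def set(self, key, value, timeout=0):
        deadline = self._clock() + timeout if timeout else None
        self._data[key] = (value, deadline)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and self._clock() >= deadline:
            del self._data[key]   # expired: behave like a miss
            return None
        return value


now = [0.0]                       # fake clock, advanced by hand below
cache = ExpiringCache(clock=lambda: now[0])
cache.set("session:abc", {"user": 42}, timeout=30 * 60)

now[0] = 29 * 60
still_there = cache.get("session:abc")   # within 30 minutes

now[0] = 31 * 60
gone = cache.get("session:abc")          # past the timeout: a miss
```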
use expirations or not?
1st school of thought

• expirations hide bugs
• you should be doing proper invalidations
 • (aka, deletes)
 • coherency!
school 2

• it’s 3AM and I can’t think anymore

• business guy:
 • “sessions should auto-logout after 30
    minutes due to bank security policy”
put sessions
      in memcached?

• just a config change
 • eg, Ruby on Rails
good


• can load-balance requests to any web host
• don’t touch the RDBMS on every web
  request
bad


• could lose a user’s session
solution

• save sessions to memcached
• the first time, also save to RDBMS
 • ideally, asynchronously
• on cache miss, restore from RDBMS
solution

• save sessions to memcached
• the first time, also save to RDBMS
 • ideally, asynchronously
• on cache miss, restore from RDBMS
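The session scheme above might look roughly like this. It is a sketch under assumptions: plain dicts stand in for memcached and the sessions table, SessionStore is a hypothetical name, and the database write happens inline where a real app would ideally queue it.

```python
class SessionStore:
    def __init__(self, cache, db):
        self.cache = cache   # dict standing in for memcached
        self.db = db         # dict standing in for the sessions table

    def save(self, session_id, data):
        self.cache[session_id] = data
        if session_id not in self.db:
            # First save: also persist. Ideally this write is queued and
            # performed asynchronously.
            self.db[session_id] = data

    def load(self, session_id):
        data = self.cache.get(session_id)
        if data is None:
            data = self.db.get(session_id)       # cache miss: restore from RDBMS
            if data is not None:
                self.cache[session_id] = data    # repopulate memcached
        return data


cache, db = {}, {}
store = SessionStore(cache, db)
store.save("s1", {"user": 42})
store.save("s1", {"user": 42, "cart": [7]})   # later saves only touch the cache

cache.clear()                                  # simulate a memcached node dying
restored = store.load("s1")
```

The restored session is the copy from the first save, so the user stays logged in but may lose recent changes until the database copy is refreshed.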
in the background...
• have a job querying the RDBMS
 • cron job?
• the job queries for “old” looking session
  records in the sessions table
 • refresh old session records from memcached
add vs replace vs set
append vs prepend
CAS


• compare - and - swap
incr and decr


• no negative numbers
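The difference between add, replace, and set, and the "no negative numbers" rule for decr, can be shown with a toy model. ToyStore is hypothetical and not a real client; it ignores details such as expiration, CAS, and counter overflow.

```python
class ToyStore:
    """Toy model of the storage and counter verbs."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):        # always stores
        self._data[key] = value
        return True

    def add(self, key, value):        # stores only if the key is absent
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def replace(self, key, value):    # stores only if the key is present
        if key not in self._data:
            return False
        self._data[key] = value
        return True

    def incr(self, key, delta=1):
        if key not in self._data:
            return None               # counters are not created implicitly
        self._data[key] = int(self._data[key]) + delta
        return self._data[key]

    def decr(self, key, delta=1):
        if key not in self._data:
            return None
        self._data[key] = max(0, int(self._data[key]) - delta)  # never negative
        return self._data[key]


store = ToyStore()
results = [
    store.replace("hits", 0),   # False: nothing to replace yet
    store.add("hits", 0),       # True: key was absent
    store.add("hits", 99),      # False: key already exists
    store.incr("hits", 5),      # 5
    store.decr("hits", 8),      # 0, clamped instead of -3
]
```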
queueing


• “hey, with those primitives, I could build a queue!”
don’t
• memcached is lossy
• protocol is incorrect for a queue
• instead
 • gearman
 • beanstalkd
 • etc
cache stampedes

• gearman job-unique-id
• encode a timestamp in your values
 • one app node randomly decides to
    refresh slightly early
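One possible reading of the timestamp idea: store a refresh deadline next to the value, and let each reader roll the dice as the deadline approaches so that a single node tends to refresh first. The constants, the wrap / should_refresh helpers, and the linear probability ramp are all assumptions made for this sketch.

```python
import random

SOFT_TTL = 60      # seconds until a value is due for refresh (assumed)
EARLY_WINDOW = 10  # seconds before the deadline in which early refresh may fire (assumed)


def wrap(value, now):
    """Store the refresh deadline alongside the value."""
    return {"value": value, "refresh_at": now + SOFT_TTL}


def should_refresh(entry, now, rng=random.random):
    """Decide whether this caller should recompute the value."""
    remaining = entry["refresh_at"] - now
    if remaining <= 0:
        return True                      # past the deadline: refresh
    if remaining >= EARLY_WINDOW:
        return False                     # still fresh
    # Inside the window: refresh with a probability that grows as the
    # deadline approaches, so one node tends to go first.
    return rng() < 1.0 - remaining / EARLY_WINDOW


entry = wrap("expensive result", now=1000.0)
fresh = should_refresh(entry, now=1010.0)                     # well before the deadline
lucky = should_refresh(entry, now=1055.0, rng=lambda: 0.1)    # low roll inside the window
unlucky = should_refresh(entry, now=1055.0, rng=lambda: 0.9)  # high roll inside the window
expired = should_refresh(entry, now=1061.0)                   # past the deadline
```

The node that decides to refresh recomputes the value and sets it again with a new deadline while the others keep serving the old copy.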
coherency
denormalization


• or copies of data
example: changing a
   product price
memcached UDF’s

• another great tool in your toolbox

• on a database trigger, delete stuff from
  memcached
memcached UDF’s


• works even if you do UPDATES with fancy
  WHERE clauses
multigets

• they are your friend

• memcached is fast, but...
 • imagine 1ms for a get request
   • 200 serial gets ==> 200ms
a resultset loop

foreach product in resultset
  c = memcached.get(product.category_id)
  do something with c
2 loops
for product in resultset
  multiget_request.append(product.category_id)
multiget_response = memcachedClient.multiget(multiget_request)
for c in multiget_response
  do something with c
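The saving from the two-loop version can be seen by counting round trips. CountingCache is a hypothetical dict-backed stand-in, and get_multi is an assumed method name for whatever multiget call your client exposes.

```python
class CountingCache:
    """Dict-backed stand-in for a memcached client that counts round trips."""

    def __init__(self, data):
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self._data.get(key)

    def get_multi(self, keys):
        self.round_trips += 1          # one request for the whole batch
        return {k: self._data[k] for k in keys if k in self._data}


products = [{"name": f"p{i}", "category_id": f"cat:{i % 5}"} for i in range(200)]
categories = {f"cat:{i}": f"Category {i}" for i in range(5)}

# One loop: a get per product.
serial = CountingCache(categories)
for product in products:
    c = serial.get(product["category_id"])

# Two loops: collect the keys, fetch once, then use the results.
batched = CountingCache(categories)
wanted = sorted({p["category_id"] for p in products})   # dedupe before asking
found = batched.get_multi(wanted)
names = [found.get(p["category_id"]) for p in products]
```

With several memcached servers, a real client typically needs one request per server rather than one in total, which is still far fewer than one per key.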
memcached slabber

• allocates memory into slabs

• it might “learn” the wrong slab sizes

• watch eviction stats
losing a node


• means your RDBMS gets hit
replication
• simple replication in libmemcached

• >= 2x memory cost
• only simple verbs
 • set, get, delete
• doesn’t handle flapping nodes
persistence
things that speak
        memcached

• tokyo tyrant
• memcachedb
• moxi
another day

• monitoring & statistics
• near caching
• moxi
thanks!!!

• love any feedback
 • your memcached war stories
    • your memcached wishlist

• steve.yen@northscale.com
thanks!

photo credits

 •     http://flickr.com/photos/davebluedevil/15877348/

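 •     http://www.flickr.com/photos/theamarand/2874288064/

 •     http://www.flickr.com/photos/splityarn/3469596708/

 •     http://www.flickr.com/photos/heisnofool/3241930754/

 •     http://www.flickr.com/photos/onourminds/2885704630/

 •     http://www.flickr.com/photos/lunaspin/990825818/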
Memcached Code Camp 2009
memcached best practices presentation at Silicon Valley Code Camp 2009,

Published in: Technology
Transcript of "Memcached Code Camp 2009"

  1. 1. memcached scaling your website with memcached by: steve yen
  2. 2. about me • Steve Yen • NorthScale • Escalate Software • Kiva Software
  3. 3. what you’ll learn • what, where, why, when • how • especially, best practices
  4. 4. “mem cache dee” • latest version 1.4.1 • http://code.google.com/p/memcached
  5. 5. open source
  6. 6. distributed cache
  7. 7. livejournal
  8. 8. helps your websites run fast
  9. 9. popular
  10. 10. simple
  11. 11. KISS
  12. 12. easy
  13. 13. small bite-sized steps • not a huge, forklift replacement rearchitecture / reengineering project
  14. 14. fast
  15. 15. “i only block for memcached”
  16. 16. scalable
  17. 17. many client libraries • might be TOO many • the hit list... • Java ==> spymemcached • C ==> libmemcached • Python, Ruby, etc ==> • libmemcached wrappers
  18. 18. frameworks • rails • django • spring / hibernate • cakephp, symfony, etc
  19. 19. applications • drupal • wordpress • mediawiki • etc
  20. 20. it works it promises to solve performance problems it delivers!
  21. 21. problem?
  22. 22. your website is too slow
  23. 23. RDBMS melting down
  24. 24. urgent! emergency
  25. 25. one server web app + RDBMS
  26. 26. 1 + 1 servers web app RDBMS
  27. 27. N + 1 servers web app, web app, web app, web app RDBMS
  28. 28. RDBMS
  29. 29. EXPLAIN PLAN?
  30. 30. buy a bigger box
  31. 31. buy better disks
  32. 32. master write DB + multiple read DB?
  33. 33. vertical partitioning?
  34. 34. sharding?
  35. 35. uh oh, big reengineering • risky! • touch every line of code, every query!!
  36. 36. and, it’s 2AM
  37. 37. you need a band-aid
  38. 38. a simple band-aid now
  39. 39. use a cache
  40. 40. keep things in memory!
  41. 41. don’t hit disk
  42. 42. distributed cache • to avoid wasting memory
  43. 43. don’t write one of these yourself
  44. 44. memcached
  45. 45. simple API • hash-table-ish
  46. 46. your code before v = db.query( SOME SLOW QUERY )
  47. 47. your code after v = memcachedClient.get(key) if (!v) { v = db.query( SOME SLOW QUERY ) memcachedClient.set(key, v) }
  48. 48. cache read-heavy stuff
  49. 49. invalidate when writing • db.execute(“UPDATE foo WHERE ...”) • memcachedClient.delete(...)
  50. 50. and, repeat • each day... • look for the next slowest operations • add code to cache a few more things
  51. 51. your life gets better
  52. 52. thank you memcached!
  53. 53. no magic
  54. 54. you are in control
  55. 55. now for the decisions
  56. 56. memcached adoption • first, start using memcached • poorly • but you can breathe again
  57. 57. memcached adoption • next, start using memcached correctly
  58. 58. memcached adoption • later • queueing • persistence • replication • ...
  59. 59. an early question
  60. 60. where to run servers?
  61. 61. answer 1 • right on your web servers • a great place to start, if you have extra memory
  62. 62. servers web app web app web app web app memcached memcached memcached memcached RDBMS
  63. 63. add up your memory usage! • having memcached server swap == bad!
  64. 64. answer 2 • run memcached right on your database server? • WRONG!
  65. 65. answer 3 • run memcached on separate dedicated memcached servers • congratulations! • you either have enough money • or enough traffic that it matters
  66. 66. running a server • daemonize • don’t be root! • no security
  67. 67. server lists • mc-server1:11211 • mc-server2:11211 • mc-server3:11211
  68. 68. consistent hashing source: http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/
  69. 69. client-side intelligence • no “server master” bottleneck
  70. 70. libmemcached • fast C memcached client • supports consistent hashing • many wrappers to your favorite languages
  71. 71. updating server lists • push out new configs and restart? • moxi • memcached + integrated proxy
  72. 72. keys • no whitespace • 250 char limit • use short prefixes
  73. 73. keys & MD5 • don’t • stats become useless
  74. 74. values • any binary object • 1MB limit • change #define & recompile if you want more • and you’re probably doing something wrong if you want more
  75. 75. values • query resultset • serialized object • page fragment • pages • etc
  76. 76. nginx + memcached
  77. 77. >1 language? • JSON • protocol buffers • XML
  78. 78. memcached is lossy • memcached WILL lose data
  79. 79. that’s a good thing remember, it’s a CACHE
  80. 80. why is memcached lossy?
  81. 81. memcached node dies
  82. 82. when node restarts... • you just get a bunch of cache misses (and a short RDBMS spike)
  83. 83. eviction more disappearing data!
  84. 84. LRU • can config memcached to not evict • but, you’re probably doing something wrong if you do this
  85. 85. remember, it forgets • it’s just a CACHE
  86. 86. expiration • aka, timeouts • memcached.set(key, value, timeout)
  87. 87. use expirations or not?
  88. 88. 1st school of thought • expirations hide bugs • you should be doing proper invalidations • (aka, deletes) • coherency!
  89. 89. school 2 • it’s 3AM and I can’t think anymore • business guy: • “sessions should auto-logout after 30 minutes due to bank security policy”
  90. 90. put sessions in memcached? • just a config change • eg, Ruby on Rails
  91. 91. good • can load-balance requests to any web host • don’t touch the RDBMS on every web request
  92. 92. bad • could lose a user’s session
  93. 93. solution • save sessions to memcached • the first time, also save to RDBMS • ideally, asynchronously • on cache miss, restore from RDBMS
  94. 94. solution • save sessions to memcached • the first time, also save to RDBMS • ideally, asynchronously • on cache miss, restore from RDBMS
  95. 95. in the background... • have a job querying the RDBMS • cron job? • the job queries for “old” looking session records in the sessions table • refresh old session records from memcached
  96. 96. add vs replace vs set
  97. 97. append vs prepend
  98. 98. CAS • compare - and - swap
  99. 99. incr and decr • no negative numbers
  100. 100. queueing • “hey, with those primitives, I could build a queue!”
  101. 101. don’t • memcached is lossy • protocol is incorrect for a queue • instead • gearman • beanstalkd • etc
  102. 102. cache stampedes • gearman job-unique-id • encode a timestamp in your values • one app node randomly decides to refresh slightly early
  103. 103. coherency
  104. 104. denormalization • or copies of data
  105. 105. example: changing a product price
  106. 106. memcached UDF’s • another great tool in your toolbox • on a database trigger, delete stuff from memcached
  107. 107. memcached UDF’s • works even if you do UPDATES with fancy WHERE clauses
  108. 108. multigets • they are your friend • memcached is fast, but... • imagine 1ms for a get request • 200 serial gets ==> 200ms
  109. 109. a resultset loop foreach product in resultset c = memcached.get(product.category_id) do something with c
  110. 110. 2 loops for product in resultset multiget_request.append(product.category_id) multiget_response = memcachedClient.multiget( multiget_request) for c in multiget_response do something with c
  111. 111. memcached slabber • allocates memory into slabs • it might “learn” the wrong slab sizes • watch eviction stats
  112. 112. losing a node • means your RDBMS gets hit
  113. 113. replication • simple replication in libmemcached • >= 2x memory cost • only simple verbs • set, get, delete • doesn’t handle flapping nodes
  114. 114. persistence
  115. 115. things that speak memcached • tokyo tyrant • memcachedb • moxi
  116. 116. another day • monitoring & statistics • near caching • moxi
  117. 117. thanks!!! • love any feedback • your memcached war stories • your memcached wishlist • steve.yen@northscale.com
  118. 118. thanks! photo credits • http://flickr.com/photos/davebluedevil/15877348/ • http://www.flickr.com/photos/theamarand/2874288064/ • http://www.flickr.com/photos/splityarn/3469596708/ • http://www.flickr.com/photos/heisnofool/3241930754/ • http://www.flickr.com/photos/onourminds/2885704630/ • http://www.flickr.com/photos/lunaspin/990825818/