All The Little Pieces

Quick, what do memcache, MogileFS, and Gearman have in common? They are scalable, distributed technologies, and they can also interface with PHP, your ubiquitous web development language. Digg uses all 3 (and a few more) in its quest for social news domination, and this presentation shares what we’ve learned about them and how they are best utilized with PHP.

Transcript of "All The Little Pieces"

  1. all the little pieces: distributed systems with PHP • Andrei Zmievski @ Digg • Dutch PHP Conference @ Amsterdam • Friday, June 12, 2009
  2. Who is this guy? • Open Source Fellow @ Digg • PHP Core Developer since 1999 • Architect of the Unicode/i18n support • Release Manager for PHP 6 • Twitter: @a • Beer lover (and brewer)
  3. Why distributed? • Because Moore’s Law will not save you • Despite what DHH says
  4. Share nothing • Your Mom was wrong • No shared data on application servers • Distribute it to shared systems
  5. distribute… • memory (memcached) • storage (mogilefs) • work (gearman)
  6. Building blocks • GLAMMP - have you heard of it? • Gearman + LAMP + Memcached • Throw in Mogile too
  7. memcached
  8. background • created by Danga Interactive • high-performance, distributed memory object caching system • sustains Digg, Facebook, LiveJournal, Yahoo!, and many others • if you aren’t using it, you are crazy
  9. background • Very fast over the network and very easy to set up • Designed to be transient • You still need a database
  10. architecture [diagram: clients and memcached servers]
  11. architecture [diagram: memcached]
  12. [diagram] slab #4: 4104-byte chunks • slab #3: 1368-byte chunks • slab #2: 456-byte chunks • slab #1: 152-byte chunks
  13. memory architecture • memory allocated on startup, released on shutdown • variable sized slabs (30+ by default) • each object is stored in the slab most fitting its size • fragmentation can be problematic
  14. memory architecture • items are deleted: • on set • on get, if it’s expired • if slab is full, then use LRU
  15. applications • object cache • output cache • action flood control / rate limiting • simple queue • and much more
  16. PHP clients • a few private ones (Facebook, Yahoo!, etc) • pecl/memcache • pecl/memcached
  17. pecl/memcached • based on libmemcached • released in January 2009 • surface API similarity to pecl/memcache • parity with other languages
  18. API • get • set • add • replace • delete • append • prepend • cas • *_by_key • getMulti • setMulti • getDelayed / fetch* • callbacks
  19. consistent hashing [diagram: points IP1, IP2-1, IP2-2, IP3]
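The diagram on slide 19 places several points per server (IP2-1, IP2-2) on a hash ring. A minimal sketch of that idea follows, assuming crc32 as the hash and 100 points per server; the class name, method names and server addresses are hypothetical and this is not libmemcached's actual algorithm.

```php
<?php
// Minimal consistent-hash ring; names and parameters are illustrative only.
final class HashRing
{
    /** @var array<int,string> ring position => server */
    private array $points = [];

    public function __construct(array $servers, int $replicas = 100)
    {
        foreach ($servers as $server) {
            // Each server owns several points so keys spread more evenly.
            for ($i = 0; $i < $replicas; $i++) {
                $this->points[crc32("$server-$i")] = $server;
            }
        }
        ksort($this->points);
    }

    public function serverFor(string $key): string
    {
        $hash = crc32($key);
        // Walk clockwise to the first point at or after the key's hash.
        foreach ($this->points as $position => $server) {
            if ($position >= $hash) {
                return $server;
            }
        }
        // Past the last point: wrap around to the first one.
        return $this->points[array_key_first($this->points)];
    }
}

$ring = new HashRing(['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211']);
echo $ring->serverFor('user:42'), "\n";
```

Removing a server only remaps the keys that lived on it; every other key keeps its server. In pecl/memcached itself the equivalent behaviour is selected with `setOption(Memcached::OPT_DISTRIBUTION, Memcached::DISTRIBUTION_CONSISTENT)` rather than hand-rolled code.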
  20. compare-and-swap (cas) • “check and set” • no update if object changed • relies on CAS token
  21. compare-and-swap (cas)
        $m = new Memcached();
        $m->addServer('localhost', 11211);

        do {
            $ips = $m->get('ip_block', null, $cas);
            if ($m->getResultCode() == Memcached::RES_NOTFOUND) {
                $ips = array($_SERVER['REMOTE_ADDR']);
                $m->add('ip_block', $ips);
            } else {
                $ips[] = $_SERVER['REMOTE_ADDR'];
                $m->cas($cas, 'ip_block', $ips);
            }
        } while ($m->getResultCode() != Memcached::RES_SUCCESS);
  22. delayed “lazy” fetching • issue request with getDelayed() • do other work • fetch results with fetch() or fetchAll()
  23. binary protocol • performance • every request is parsed • can happen thousands of times a second • extensibility • support more data in the protocol
  24. callbacks • read-through cache callback • if key is not found, invoke callback, save value to memcache and return it
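The read-through flow on slide 24 (miss, invoke callback, save, return) can be sketched without a server. In the sketch below a plain array stands in for memcached, and `readThrough()` and the loader are hypothetical names used only for illustration.

```php
<?php
// Read-through lookup against an array standing in for memcached.
function readThrough(array &$cache, string $key, callable $loader)
{
    if (array_key_exists($key, $cache)) {
        return $cache[$key];          // hit: the loader is never called
    }
    $value = $loader($key);           // miss: compute or load from the DB
    $cache[$key] = $value;            // save so the next read is a hit
    return $value;
}

$cache = [];
$calls = 0;
$loader = function (string $key) use (&$calls) {
    $calls++;
    return strtoupper($key);          // stands in for an expensive query
};

$first  = readThrough($cache, 'user:42', $loader);
$second = readThrough($cache, 'user:42', $loader);
```

With pecl/memcached the same effect comes from passing a callback as the second argument of `get()`; the callback receives the Memcached object, the key and a by-reference value, and returns true to have the value stored.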
  25. callbacks • result callback • invoked by getDelayed() for every found item • should not call fetch() in this case
  26. buffered writes • queue up write requests • send when a threshold is exceeded or a ‘get’ command is issued
  27. key prefixing • optional prefix prepended to all the keys automatically • allows for namespacing, versioning, etc.
  28. key locality • allows mapping a set of keys to a specific server
  29. multiple serializers • PHP • igbinary • JSON (soon)
  30. future • UDP support • replication • server management (ejection, status callback)
  31. tips & tricks • 32-bit systems with > 4GB memory:
        memcached -m4096 -p11211
        memcached -m4096 -p11212
        memcached -m4096 -p11213
  32. tips & tricks • write-through or write-back cache • Warm up the cache on code push • Version the keys (if necessary)
  33. tips & tricks • Don’t think row-level DB-style caching; think complex objects • Don’t run memcached on your DB server: your DBAs might send you threatening notes • Use multi-get: run things in parallel
  34. delete by namespace
        $ns_key = $memcache->get("foo_namespace_key");
        // if not set, initialize it
        if ($ns_key === false) {
            $ns_key = rand(1, 10000);
            $memcache->set("foo_namespace_key", $ns_key);
        }
        // cleverly use the ns_key
        $my_key = "foo_" . $ns_key . "_12345";
        $my_val = $memcache->get($my_key);
        // to clear the namespace:
        $memcache->increment("foo_namespace_key");
  35. storing lists of data • Store items under indexed keys: comment.12, comment.23, etc • Then store the list of item IDs in another key: comments • To retrieve, fetch comments and then multi-get the comment IDs
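The list-storage pattern on slide 35 can be sketched as follows. A plain array stands in for memcached, and the `getMulti()` helper below is a hypothetical stand-in that mimics a multi-get by returning only the keys that exist; the comment data is invented for the example.

```php
<?php
// Array stand-in for memcached; all names and data here are illustrative.
function getMulti(array $cache, array $keys): array
{
    return array_intersect_key($cache, array_flip($keys));
}

$cache = [];

// Store each item under its own indexed key...
$cache['comment.12'] = ['author' => 'alice', 'text' => 'First!'];
$cache['comment.23'] = ['author' => 'bob',   'text' => 'Nice talk'];
// ...and the list of item IDs under a separate key.
$cache['comments'] = [12, 23];

// Retrieval: one get for the ID list, then one multi-get for the items.
$ids  = $cache['comments'];
$keys = array_map(fn (int $id) => "comment.$id", $ids);
$comments = getMulti($cache, $keys);
```

Because items can be evicted independently, a real reader should expect the multi-get to return fewer items than the ID list names and reload the missing ones.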
  36. preventing stampeding • embedded probabilistic timeout • gearman unique task trick
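Slide 36 names the "embedded probabilistic timeout" without spelling it out. One possible reading is sketched below: the entry carries its own soft expiry, and as that time approaches each reader has a growing chance of refreshing early, so the cost of regeneration is not paid by every client at once. The function name, the 60-second window and the linear probability ramp are all assumptions.

```php
<?php
// One possible reading of an embedded probabilistic timeout (assumption).
function shouldRefresh(int $expiresAt, int $now, int $window, float $roll): bool
{
    $remaining = $expiresAt - $now;
    if ($remaining <= 0) {
        return true;                      // past the embedded timeout
    }
    if ($remaining >= $window) {
        return false;                     // still comfortably fresh
    }
    // Inside the window the chance of refreshing climbs from 0 to 1.
    return $roll < 1 - $remaining / $window;
}

// The cached entry carries its soft expiry next to the value.
$entry = ['value' => 'front page HTML', 'expires_at' => time() + 300];
$roll  = mt_rand() / (mt_getrandmax() + 1);   // uniform in [0, 1)

if (shouldRefresh($entry['expires_at'], time(), 60, $roll)) {
    // Regenerate the value and store it again with a new soft expiry.
}
```

The real memcached expiry would be set somewhat later than the embedded one, so that stale data is still available while one client regenerates it.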
  37. optimization • watch stats (eviction rate, fill, etc) • getStats() • telnet + “stats” commands • peep (heap inspector)
  38. slabs • Tune slab sizes to your needs: • -f chunk size growth factor (default 1.25) • -n minimum space allocated for key+value+flags (default 48)
  39. slabs • Default: 38 slabs
        slab class 1: chunk size 104 perslab 10082
        slab class 2: chunk size 136 perslab 7710
        slab class 3: chunk size 176 perslab 5957
        slab class 4: chunk size 224 perslab 4681
        ...
        slab class 38: chunk size 394840 perslab 2
        slab class 39: chunk size 493552 perslab 2
  40. slabs • Most objects: ~1-2KB, some larger • memcached -n 1000 -f 1.01
        slab class 1: chunk size 1048 perslab 1000
        slab class 2: chunk size 1064 perslab 985
        slab class 3: chunk size 1080 perslab 970
        slab class 4: chunk size 1096 perslab 956
        ...
        slab class 198: chunk size 9224 perslab 113
        slab class 199: chunk size 9320 perslab 112
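The effect of the growth factor on slides 38 to 40 can be checked with a little arithmetic. The sketch below assumes 8-byte chunk alignment and a 1 MB slab page, which happen to reproduce the first lines of both listings; `slabClasses()` is a hypothetical helper, and the first chunk size is taken from the slide output rather than derived from `-n`.

```php
<?php
// Sketch of chunk-size growth between slab classes (assumptions noted above).
function slabClasses(int $firstChunk, float $factor, int $count): array
{
    $pageSize = 1048576;                  // assumed 1 MB slab page
    $classes = [];
    $size = $firstChunk;
    for ($i = 1; $i <= $count; $i++) {
        if ($size % 8 !== 0) {
            $size += 8 - ($size % 8);     // round up to an 8-byte boundary
        }
        $classes[$i] = [
            'chunk'   => $size,
            'perslab' => intdiv($pageSize, $size),
        ];
        $size = (int) ($size * $factor);  // next class, truncated to an integer
    }
    return $classes;
}

foreach (slabClasses(104, 1.25, 4) as $n => $c) {
    echo "slab class $n: chunk size {$c['chunk']} perslab {$c['perslab']}\n";
}
```

A factor close to 1 gives many finely spaced classes, which wastes less memory per item when most objects fall in a narrow size range, as in the 1-2KB case on slide 40.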
  41. memcached @ digg
  42. ops • memcached on each app server (2GB) • the process is niced to a lower level • separate pool for sessions • 2 servers keep track of cluster health
  43. key prefixes • global key prefix for apc, memcached, etc • each pool has additional, versioned prefix: .sess.2 • the key version is incremented on each release • global prefix can invalidate all caches
  44. cache chain • multi-level caching: globals, APC, memcached, etc. • all cache access is through Cache_Chain class • various configurations: • APC ➡ memcached • $GLOBALS ➡ APC
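Digg's Cache_Chain class is not shown in the talk, so the following is only a hypothetical sketch of what a multi-level lookup might look like: levels are checked fastest first, and a hit in a slower level is copied into the faster ones. Plain arrays stand in for $GLOBALS, APC and memcached, and every name below is invented.

```php
<?php
// Hypothetical multi-level cache; not Digg's actual Cache_Chain.
final class CacheChainSketch
{
    /** @var array<int,array<string,mixed>> fastest level first */
    private array $levels;

    public function __construct(int $levelCount)
    {
        $this->levels = array_fill(0, $levelCount, []);
    }

    // Write to a single level (for example, only the slowest one).
    public function setAt(int $index, string $key, $value): void
    {
        $this->levels[$index][$key] = $value;
    }

    // Returns the value or null; $hitLevel reports where it was found.
    public function get(string $key, ?int &$hitLevel = null)
    {
        foreach ($this->levels as $index => $level) {
            if (array_key_exists($key, $level)) {
                // Backfill the faster levels that missed.
                for ($i = 0; $i < $index; $i++) {
                    $this->levels[$i][$key] = $level[$key];
                }
                $hitLevel = $index;
                return $level[$key];
            }
        }
        $hitLevel = null;
        return null;
    }
}

$chain = new CacheChainSketch(2);         // e.g. APC, then memcached
$chain->setAt(1, 'story:1', 'cached story');
$value = $chain->get('story:1', $level);  // found at level 1, copied to level 0
```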
  45. other • large objects (> 1MB) • split on the client side • save the partial keys in a master one
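The client-side split on slide 45 might be sketched like this. A plain array stands in for memcached, and the function names, the key layout ("key.part.N") and the chunk size are assumptions made for illustration.

```php
<?php
// Client-side splitting of large values; names and key layout are assumed.
function storeLarge(array &$cache, string $key, $value, int $chunkSize = 1000000): void
{
    $parts = str_split(serialize($value), $chunkSize);
    $partKeys = [];
    foreach ($parts as $i => $part) {
        $partKeys[] = "$key.part.$i";
        $cache["$key.part.$i"] = $part;
    }
    $cache[$key] = $partKeys;             // master key lists the partial keys
}

function fetchLarge(array $cache, string $key)
{
    if (!isset($cache[$key])) {
        return false;
    }
    $payload = '';
    foreach ($cache[$key] as $partKey) {
        if (!isset($cache[$partKey])) {
            return false;                 // a part is gone: treat as a miss
        }
        $payload .= $cache[$partKey];
    }
    return unserialize($payload);
}

$cache = [];
$big = str_repeat('x', 2500);
storeLarge($cache, 'big', $big, 1000);    // small chunk size for the demo
$roundTrip = fetchLarge($cache, 'big');
```

Treating a missing part as a full miss matters because memcached may evict any chunk on its own schedule.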
  46. stats
  47. alternatives • in-memory: Tokyo Tyrant, Scalaris • persistent: Hypertable, Cassandra, MemcacheDB • document-oriented: CouchDB
  48. mogile
  49. background • created by Danga Interactive • application-level distributed filesystem • used at Digg, LiveJournal, etc • a form of “cloud caching” • scales very well
  50. background • automatic file replication with custom policies • no single point of failure • flat namespace • local filesystem agnostic • not meant for speed
  51. architecture [diagram: app, trackers, DB, storage nodes]
  52. applications • images • document storage • backing store for certain caches
  53. PHP client • File_Mogile in PEAR • MediaWiki one (not maintained)
  54. Example
        $hosts = array('172.10.1.1', '172.10.1.2');
        $m = new File_Mogile($hosts, 'profiles');
        $m->storeFile('user1234', 'image', '/tmp/image1234.jpg');
        ...
        $paths = $m->getPaths('user1234');
  55. mogile @ digg
  56. mogile @ digg • Wrapper around File_Mogile to cache entries in memcache • fairly standard set-up • trackers run on storage nodes
  57. mogile @ digg • not huge (about 3.5 TB of data) • files are replicated 3x • the user profile images are cached on Netscaler (1.5 GB cache) • mogile cluster load is light
  58. gearman
  59. background • created by Danga Interactive • anagram of “manager” • a system for distributing work • a form of RPC mechanism
  60. background • parallel, asynchronous, scales well • fire and forget, decentralized • avoid tying up Apache processes
  61. background • dispatch function calls to machines that are better suited to do work • do work in parallel • load balance lots of function calls • invoke functions in other languages
  62. architecture [diagram: clients, gearmand, workers]
  63. applications • thumbnail generation • asynchronous logging • cache warm-up • DB jobs, data migration • sending email
  64. servers • Gearman-Server (Perl) • gearmand (C)
  65. clients • Net_Gearman • simplified, pretty stable • pecl/gearman • more powerful, complex, somewhat unstable (under development)
  66. Concepts • Job • Worker • Task • Client
  67. Net_Gearman • Net_Gearman_Job • Net_Gearman_Worker • Net_Gearman_Task • Net_Gearman_Set • Net_Gearman_Client
  68. Echo Job (Echo.php)
        class Net_Gearman_Job_Echo extends Net_Gearman_Job_Common
        {
            public function run($arg)
            {
                var_export($arg);
                echo "\n";
            }
        }
  69. Reverse Job (Reverse.php)
        class Net_Gearman_Job_Reverse extends Net_Gearman_Job_Common
        {
            public function run($arg)
            {
                $result = array();
                $n = count($arg);
                $i = 0;
                // compare against null so falsy values such as 0 do not end the loop
                while (($value = array_pop($arg)) !== null) {
                    $result[] = $value;
                    $i++;
                    $this->status($i, $n);
                }
                return $result;
            }
        }
  70. Worker
        define('NET_GEARMAN_JOB_PATH', './');

        require 'Net/Gearman/Worker.php';

        try {
            $worker = new Net_Gearman_Worker(array('localhost:4730'));
            $worker->addAbility('Reverse');
            $worker->addAbility('Echo');
            $worker->beginWork();
        } catch (Net_Gearman_Exception $e) {
            echo $e->getMessage() . "\n";
            exit;
        }
  71. Client
        require_once 'Net/Gearman/Client.php';

        function complete($job, $handle, $result) {
            echo "$job complete, result: " . var_export($result, true) . "\n";
        }

        function status($job, $handle, $n, $d) {
            echo "$n/$d\n";
        }
  72. Client (continued)
        $client = new Net_Gearman_Client(array('lager:4730'));

        $task = new Net_Gearman_Task('Reverse', range(1,5));
        $task->attachCallback("complete", Net_Gearman_Task::TASK_COMPLETE);
        $task->attachCallback("status", Net_Gearman_Task::TASK_STATUS);
  73. Client (continued)
        $set = new Net_Gearman_Set();
        $set->addTask($task);

        $client->runSet($set);

        $client->Echo('Mmm... beer');
  74. pecl/gearman • More complex API • Jobs aren’t separated into files
  75. Worker
        $gmworker = new gearman_worker();
        $gmworker->add_server();
        $gmworker->add_function("reverse", "reverse_fn");

        while (1) {
            $ret = $gmworker->work();
            if ($ret != GEARMAN_SUCCESS) break;
        }

        function reverse_fn($job)
        {
            $workload = $job->workload();
            echo "Received job: " . $job->handle() . "\n";
            echo "Workload: $workload\n";
            $result = strrev($workload);
            echo "Result: $result\n";
            return $result;
        }
  76. Client
        $gmclient = new gearman_client();

        $gmclient->add_server('lager');

        echo "Sending job\n";

        list($ret, $result) = $gmclient->do("reverse", "Hello!");
        if ($ret == GEARMAN_SUCCESS)
            echo "Success: $result\n";
  77. gearman @ digg
  78. gearman @ digg • 400,000 jobs a day • Jobs: crawling, DB job, FB sync, memcache manipulation, Twitter post, IDDB migration, etc. • Each application server has its own Gearman daemon + workers
  79. tips and tricks • you can daemonize the workers easily with daemon or supervisord • run workers in different groups, don’t block on job A waiting on job B • Make workers exit after N jobs to free up memory (supervisord will restart them)
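The "exit after N jobs" tip on slide 79 amounts to a bounded worker loop. The sketch below keeps the Gearman call out of the picture: `$workOnce` is a hypothetical callable that handles one job and returns false when the worker should stop, and `runWorker()` is an invented name. With pecl/gearman it could wrap a single `work()` call.

```php
<?php
// Bounded worker loop; the callable stands in for "process one job".
function runWorker(callable $workOnce, int $maxJobs): int
{
    $done = 0;
    while ($done < $maxJobs) {
        if ($workOnce() === false) {
            break;                        // error or shutdown request
        }
        $done++;
    }
    return $done;                         // caller exits; the supervisor restarts it
}

$handled = 0;
$processed = runWorker(function () use (&$handled) {
    $handled++;                           // pretend one job was processed
    return true;
}, 5);
```

Exiting after a fixed number of jobs lets the process manager start a fresh PHP process, which releases any memory leaked by long-running workers.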
  80. Thrift
  81. background • NOT developed by Danga (Facebook) • cross-language services • RPC-based
  82. background • interface description language • bindings: C++, C#, Cocoa, Erlang, Haskell, Java, OCaml, Perl, PHP, Python, Ruby, Smalltalk • data types: base, structs, constants, services, exceptions
  83. IDL
  84. Demo
  85. Thank You • http://gravitonic.com/talks