
Gearman


How Gearman works and how we use it at dealnews.com.

Published in: Technology, Business

    1. The Magical World of Gearman
       Brian Moon
       dealnews.com
       http://brian.moonspot.net/
       @brianlmoon
    2. Basic Features
       Use Cases
       How It Works
    3. “The way I like to think of Gearman is as a massively distributed, massively fault tolerant fork mechanism.” - Joe Stump
    4. The Basics
       • Clients need jobs done
       • Workers can do jobs
       • Gearmand coordinates the work
    5. Gearmand
       http://www.flickr.com/photos/andrefromont/4896802557
    6. Gearmand
       Daemon that manages the work.
       Does not do any work.
       Accepts a job id and a binary payload from clients.
       Workers keep connections open at all times.
       http://www.flickr.com/photos/andrefromont/4896802557
    7. Client
       http://www.flickr.com/photos/pitadel/4951801589
    8. Client
       Clients connect to Gearmand and ask for work to be done.
       The client can fire and forget or wait on a response.
       Multiple jobs can be done asynchronously by workers for one client.
       http://www.flickr.com/photos/pitadel/4951801589
    9. Workers
       http://www.flickr.com/photos/nathaninsandiego/5972599772
    10. Workers
        Daemonized code.
        A single worker can do just one job or can do many jobs.
        Does not have to be written using the same language as the client.
        http://www.flickr.com/photos/nathaninsandiego/5972599772
    11. Key Features
        • Background jobs
        • De-duplication of jobs
        • Multiple jobs per client
        • High, normal and low priority
        • Work will be resubmitted if not completed
    12. Background Jobs
        • Clients can fire and forget work to be done
        • Well suited for data marshalling
        • Minimal ability to track the status
    13. De-duplication
        • Clients provide a unique job id
        • If more than one client provides the same job id, work is done once
        • Not a cache: once the job is done, the id is gone. The work will be done again.
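
The de-duplication behaviour can be pictured with a toy in-memory model. This is only a sketch of the idea, not the Gearman protocol or any client library: `ToyJobServer`, its methods, and the job id `rebuild:front-page` are all made up for illustration.

```python
class ToyJobServer:
    """In-memory stand-in for a job server (hypothetical, not Gearman's API)."""

    def __init__(self):
        self._pending = {}  # unique id -> payload, kept only while the job is outstanding
        self.runs = []      # record of the work that was actually performed

    def submit(self, unique_id, payload):
        """Queue a job. Returns True if it is new, False if it was coalesced."""
        if unique_id in self._pending:
            return False  # same id already outstanding: the work is done once
        self._pending[unique_id] = payload
        return True

    def run_all(self):
        """Play the worker: do each pending job once, then forget its id."""
        while self._pending:
            unique_id, payload = self._pending.popitem()
            self.runs.append((unique_id, payload))


server = ToyJobServer()
first = server.submit("rebuild:front-page", b"v1")   # accepted
second = server.submit("rebuild:front-page", b"v1")  # coalesced with the first
server.run_all()
third = server.submit("rebuild:front-page", b"v1")   # id is gone, so accepted again
server.run_all()
```

The third submit is accepted because the id is dropped once the job finishes, which is the "not a cache" point: the work runs a second time.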
    14. Priority
        • High, Normal and Low priority options.
        • New items are inserted at the end of the queue based on priority
        • Priority is per job type, not global
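
One way to picture "per job type, not global" is a separate set of high/normal/low lanes for each function name. The sketch below is an assumption about the shape of the idea, not gearmand's actual data structure; `FunctionQueue` and the function names `send_email` and `resize_image` are hypothetical.

```python
from collections import deque

PRIORITIES = ("high", "normal", "low")


class FunctionQueue:
    """Queue for ONE job type: three FIFO lanes, drained high to low."""

    def __init__(self):
        self._lanes = {p: deque() for p in PRIORITIES}

    def push(self, payload, priority="normal"):
        # New items go to the end of the lane for their priority.
        self._lanes[priority].append(payload)

    def pop(self):
        for p in PRIORITIES:
            if self._lanes[p]:
                return self._lanes[p].popleft()
        return None


# Each job type has its own queue, so priority never crosses job types.
queues = {"send_email": FunctionQueue(), "resize_image": FunctionQueue()}
queues["send_email"].push("newsletter", "low")
queues["send_email"].push("alert-1", "high")
queues["send_email"].push("alert-2", "high")
queues["resize_image"].push("thumb", "low")

order = [queues["send_email"].pop() for _ in range(3)]
other = queues["resize_image"].pop()
```

Here the two high-priority alerts jump ahead of the newsletter inside `send_email`, while the low-priority `resize_image` job is unaffected by them.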
    15. Worker Selection
        • Uses the “game show method”
        • Workers that do multiple jobs will more likely get jobs “higher” in their list
        • Can appear to be clearing out one queue over another, but not really a design choice
    16. Operational Visibility
        • Gearmand can report status about jobs and workers
        • It is only a view of current status, not historical
        • Use outside tools to graph what work was done when
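
As a rough illustration of feeding that status to an outside graphing tool: gearmand's administrative text protocol has a `status` command that, as far as I know, returns one tab-separated line per function (name, total jobs, running jobs, available workers) followed by a line containing a single `.`. The sketch below assumes that format and the default port 4730; `parse_status`, `fetch_status`, and the sample function names are hypothetical helpers, not part of any Gearman library.

```python
import socket
from typing import Dict, NamedTuple


class FunctionStatus(NamedTuple):
    total: int    # jobs queued or running
    running: int  # jobs currently being worked on
    workers: int  # workers able to do this function


def parse_status(text: str) -> Dict[str, FunctionStatus]:
    """Parse the assumed 'name<TAB>total<TAB>running<TAB>workers' lines."""
    result = {}
    for line in text.splitlines():
        if line == ".":  # terminator line
            break
        if not line.strip():
            continue
        name, total, running, workers = line.split("\t")
        result[name] = FunctionStatus(int(total), int(running), int(workers))
    return result


def fetch_status(host="127.0.0.1", port=4730, timeout=5.0) -> str:
    """Ask a running gearmand for its current status (needs a live server)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"status\n")
        data = b""
        while not (data == b".\n" or data.endswith(b"\n.\n")):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("utf-8", errors="replace")


# Parsing a canned response, so the example runs without a server.
sample = "send_email\t12\t3\t4\nlog_write\t0\t0\t2\n.\n"
snapshot = parse_status(sample)
waiting = snapshot["send_email"].total - snapshot["send_email"].running
```

Because the daemon only reports the current moment, a collector would have to call something like `fetch_status` on a schedule and store each snapshot itself to get history.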
    17. Marshalling Data
    18. [Diagram: web servers, Memcached, main databases]
    19. [Diagram: web servers, Memcached, main databases] This is so 2005!
    20. [Diagram: web servers, main databases, optimized database, "Cron or In Process"]
    21. [Diagram: web servers, main databases, optimized database, "Cron or In Process"] This is so 2009!
    22. [Diagram: web servers, main databases, optimized database, backend events, Gearmand, Gearman workers]
    23. Why Gearman
        • Rid us of database spikes
        • Changes “feel” realtime
        • In the case of an issue, changes can queue up and happen when things are stable
        • Changes can happen asynchronously
    24. SMTP Replacement
        • Large daily newsletter at 3PM
        • Email alerts go out on demand to thousands of readers as deals are published
        • Bottleneck was from double queuing in the mail queue
        • SMTP Server was a single point of failure
    25. [Diagram: web servers, cron jobs, backend events, Gearmand, Gearman workers, SMTP servers]
    26. Logging
    27. Logging Options
        • Disk - reliable unless load is high. Can’t be queried easily in real time.
        • MySQL - Can make complex queries against it. Under high load, data can be lost.
        • Other - (Spread, Scribe, etc.) New daemons to manage, learn, scale, etc.
    28. Logging via Gearman
        • Frontend can fire and forget log data, returning immediately to the application
        • Log data is queued
        • Workers can process the logs in any number of ways
        • Log data can be stored any number of ways
    29. [Diagram: Writing Log Data. Web servers, Gearmand, Gearman workers, MySQL servers]
    30. [Diagram: Querying Log Data (Map Reduce “ish”). Web servers, backend app, Gearmand, Gearman workers, MySQL servers]
    31. Request Funneling
    32. Normalizing URIs
        http://dealnews.com/?ref=google_10-corporate&s_kwcid=%7Bifcontent%3AContentNetwork%7D%7Bifsearch%3A%7Bkeyword%7D%7D%7C%7Bcreative%7D&WT.term=newdeals&WT.campaign=1799&WT.source=google&WT.medium=cpc&WT.content=606053200&cshift_ck=1880996632cs606053200&WT.srch=1
        http://dealnews.com/?sort=category
        http://dealnews.com/?view=large
    33. Normalizing URIs
        http://dealnews.com/
    34. Normalizing URIs
        • Define what parameters a request needs
          • sort
          • view
          • region
          • date
          • start
        • Throw out the rest
        • Sort what you need
        • Build the real URL
    35. Normalizing URIs
        • http://dealnews.com/
        • http://dealnews.com/?sort=category
        • http://dealnews.com/?ref=foobar
        • http://dealnews.com/?region=nyc
        All become:
        http://dealnews.com/?sort=category&view=large&region=nyc
        (assuming the user is in New York)
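
The normalizing steps above can be sketched in a few lines. This is a guess at the shape of the logic, not dealnews code: the defaults (`sort=category`, `view=large`), the fixed parameter order, and the rule that a missing `region` comes from the user's location are assumptions read off the example URIs, and `normalize` is a hypothetical name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters a request needs, in the order they appear in the real URL.
ALLOWED = ("sort", "view", "region", "date", "start")
# Assumed defaults, inferred from the example URIs.
DEFAULTS = {"sort": "category", "view": "large"}


def normalize(uri: str, user_region: str) -> str:
    parts = urlsplit(uri)
    given = dict(parse_qsl(parts.query))
    # Start from defaults plus the user's region, then apply what was asked for.
    params = dict(DEFAULTS, region=user_region)
    # Throw out everything that is not on the allowed list (ref, WT.*, ...).
    params.update((k, v) for k, v in given.items() if k in ALLOWED)
    # Emit parameters in one canonical order so equal requests get equal URIs.
    ordered = [(k, params[k]) for k in ALLOWED if k in params]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path or "/", urlencode(ordered), "")
    )


inputs = [
    "http://dealnews.com/",
    "http://dealnews.com/?sort=category",
    "http://dealnews.com/?ref=foobar",
    "http://dealnews.com/?region=nyc",
]
normalized = {normalize(u, user_region="nyc") for u in inputs}
```

All four inputs collapse to a single string, which is what makes the result usable as a cache key.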
    36. Why normalize/funnel?
        • We can now cache the data for this request and know it is the same data even if the original URI is different. (cache reuse)
        • We can fetch the content only once for all requests coming in for the content via request funneling.
    37. Why normalize/funnel?
        • 72 unique URIs for the front page in a 3-minute spike. There were only 6 possible real versions. (normalizing)
        • Thousands of syndication requests hit the app servers between 10:43 and 10:45. There were only 86 unique URIs. (funneling)
    38. Request Funneling
        [Diagram: proxy server with Apache children, http://dealnews.com/?sort=category&view=large&region=nyc, Gearmand, Gearman worker, web server]
    39. What does a worker do?
        • Builds a new URI from the input data
        • Makes an HTTP request to an app server
        • If cacheable, stores the data in the cache (important!)
        • Returns the data (page) to the proxy (via Gearmand)
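
Those four worker steps might look roughly like the function below. It is a sketch under stated assumptions: `handle_job`, the injected `fetch` callable, the dict used as a cache, and the host `app.internal.example` are all hypothetical, and a real worker would be registered with a Gearman library and talk to a real cache.

```python
from typing import Callable, Dict, Tuple
from urllib.parse import urlencode

# fetch(uri) returns (body, cacheable); injected so the sketch runs offline.
Fetch = Callable[[str], Tuple[str, bool]]


def handle_job(
    params: Dict[str, str],
    fetch: Fetch,
    cache: Dict[str, str],
    app_base: str = "http://app.internal.example/",  # hypothetical app server
) -> str:
    # 1. Build a new URI from the input data.
    uri = app_base + "?" + urlencode(sorted(params.items()))
    # 2. Make the HTTP request to an app server.
    body, cacheable = fetch(uri)
    # 3. If cacheable, store the data so later requests can skip the worker.
    if cacheable:
        cache[uri] = body
    # 4. Return the page; the job server hands it back to the proxy.
    return body


calls = []


def fake_fetch(uri: str) -> Tuple[str, bool]:
    """Stand-in for an HTTP client that records what was requested."""
    calls.append(uri)
    return "<html>deals</html>", True


cache: Dict[str, str] = {}
page = handle_job({"sort": "category", "region": "nyc"}, fake_fetch, cache)
```

Storing the page in step 3 is what lets requests that arrive after the job finishes be answered from cache instead of queuing another job.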
    40. The Magical World of Gearman
        Brian Moon
        dealnews.com
        http://brian.moonspot.net/
        @brianlmoon
        More Information: http://gearman.org/
        Need to run PHP workers? https://github.com/brianlmoon/GearmanManager