Faster Drupal sites using Queue API

Face it: most Drupal intranets / extranets / back-offices feel sluggish, and that's because they do too much during the page cycle. Make them snappier by deferring work to a Queue worker.

Published in: Software

  1. Faster web sites with Queue API
     Frédéric G. MARAND (fgm), OSInet
     Yuriy GERASIMOV (ygerasimov)
  2. Frédéric G. Marand (fgm)
     ● OSInet: performance/architecture consulting for internal teams at larger accounts
     ● Core contributor 4.7 to 8.2.x, MongoDB + XMLRPC maintainer + others
     ● Already 7 D8 customer projects, 4 before 8.0.0
     ● Customer D8 in production since 07/2015
     ● Frequently adds queueing to larger Drupal projects: Beanstalkd, RabbitMQ, Apache Kafka...
  3. Yuriy Gerasimov (ygerasimov)
     ● FFW
     ● Drupal architect & developer
     ● Contrib 7 modules: services, draggableviews
     ● Founder at Backtrac.io
  4. Why use queues?
     To have websites which are:
     ● Faster for visitors
     ● Snappier for editors
     ● More scalable
     To process time-consuming jobs:
     ● Video encoding
     ● High-resolution gallery uploads and processing
  5. Actual use cases
     ● Prepare content for non-Drupal front-ends
     ● Anticipate content generation
     ● Deferred submits, e.g. comments handling
     ● Slow operations: node saves, previews, image processing
     ● External data sources: pull, push
     ● Multi-step operations: batch
  6. Cooking content for front-ends
     [Diagram: front end]
  7. Anticipated content generation
     ● Applies to: blocks, Ctools content types, controllers, etc.
     ● Contrib: http://github.com/FGM/lazy
     [Timeline, usual Drupal] Content created at t0; states over time: Fresh (t0), Stale (t1), Expired (t2). Served from cache, served from cache, then regenerate cache.
     [Timeline, anticipated content generation] Content created at t0; states over time: Fresh (t0), Stale (t1), Fresh (t2). Served from cache, served from cache + request update, store, served from cache.
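The second timeline can be read as "serve the stale copy and queue a refresh". The following is a minimal Python sketch of that reading, not the code of the `lazy` module and not Drupal's API: the class name, the `build` callable, and the `fresh_for` parameter are all hypothetical, and the very first request still builds inline because there is nothing to serve yet.

```python
import time
from collections import deque


class AnticipatedCache:
    """Serve cached content; when it is stale, queue a refresh instead of rebuilding inline."""

    def __init__(self, build, fresh_for, clock=time.monotonic):
        self.build = build            # hypothetical slow builder: key -> content
        self.fresh_for = fresh_for    # seconds during which an entry counts as fresh
        self.clock = clock
        self.store = {}               # key -> (content, built_at)
        self.refresh_queue = deque()  # keys awaiting regeneration by a worker

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            # Nothing to serve yet: the first request has to build inline.
            content = self.build(key)
            self.store[key] = (content, self.clock())
            return content
        content, built_at = entry
        is_stale = self.clock() - built_at > self.fresh_for
        if is_stale and key not in self.refresh_queue:
            self.refresh_queue.append(key)  # request an update, do not wait for it
        return content                      # always served from cache

    def run_worker(self):
        """Drain the refresh queue; this would run outside the page cycle."""
        while self.refresh_queue:
            key = self.refresh_queue.popleft()
            self.store[key] = (self.build(key), self.clock())
```

The visitor who hits a stale entry still gets an immediate response; the cost of regeneration moves to whatever process calls `run_worker()`.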
  8. Comments handling
  9. “Pull” data sources (aggregator)
  10. “Push” data sources
  11. Image processing
  12. Job servers
      ● How to get results
      ● Rerun failed jobs
      ● Separate queue for failed jobs
      ● Monitoring queues, workers
      ● Supervisor
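Two of the concerns listed on this slide, rerunning failed jobs and keeping a separate queue for failures, could look roughly like the sketch below. It is a generic Python illustration under stated assumptions, not the behaviour of any particular job server: `run_with_retries`, `max_attempts`, and the tuple layout are hypothetical, and jobs are assumed to be hashable.

```python
from collections import deque


def run_with_retries(jobs, worker, max_attempts=3):
    """Process jobs; retry failures, then park them in a separate failed queue."""
    pending = deque((job, 1) for job in jobs)
    results = {}
    failed = deque()
    while pending:
        job, attempt = pending.popleft()
        try:
            results[job] = worker(job)
        except Exception as exc:  # a real runner would catch narrower exceptions
            if attempt < max_attempts:
                pending.append((job, attempt + 1))  # rerun later, behind other jobs
            else:
                failed.append((job, repr(exc)))     # kept for inspection or manual replay
    return results, failed
```

Returning both structures answers "how to get results" in the simplest possible way; a real deployment would persist them and expose counts to monitoring.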
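  13. Some implementations
      Table columns: Queue, D6, D7, D8. Status values are listed in the order they appear on the slide; blank cells are not marked, so rows with fewer than three values cannot be aligned to columns with certainty.
      ● Memory: core, core
      ● Database: OK, core, core
      ● AdvancedQueue: OK, Not yet
      ● Amazon SQS (aws_sqs): OK, Not yet
      ● Beanstalkd: OK, OK
      ● evQueue: Private
      ● Apache Kafka: OK, Started
      ● Gearman: OK, OK, Not yet
      ● MongoDB: OK, Started
      ● PHPResque: OK, Not yet
      ● RabbitMQ: OK, OK
      ● Redis (redis[_queue]): OK, OK, Alpha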
  14. Queue API: concepts
      ● Queue: a minimally-featured FIFO
      ● Worker: the code actually doing the work
      ● Item: a piece of workload submitted to the queue
      ● Runner: the process triggering/monitoring workers
      ● Batch subsystem: a high-level API on top of Queue API
      ● D8: Manager, Plugins
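To make the four core terms concrete, here is a deliberately tiny Python model of them. It is a conceptual sketch only, not Drupal code: the `thumbnails` queue, `thumbnail_worker`, `handle_upload`, and `run_queues` are all hypothetical names.

```python
from collections import deque

queues = {"thumbnails": deque()}  # Queue: a named FIFO
done = []                         # records what the worker produced


def thumbnail_worker(data):
    """Worker: the code actually doing the work."""
    done.append(f"thumb:{data['file']}")


workers = {"thumbnails": thumbnail_worker}


def handle_upload(file_name):
    """Page cycle: only enqueue an item, then return immediately."""
    queues["thumbnails"].append({"file": file_name})  # Item: free-form payload
    return "upload accepted"


def run_queues():
    """Runner: pulls items in FIFO order and triggers the matching worker."""
    for name, items in queues.items():
        while items:
            workers[name](items.popleft())
```

The point of the split is that `handle_upload()` returns before any thumbnail exists; the slow part only happens when something calls `run_queues()`.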
  15. D6/D7 Queue API
      ● D7: core
      ● D6: drupal_queue module
      ● API usable without cron
      Declaring queues: hook_cron_queue_info[_alter]()
      ● “Skip on cron”: enables decoupling from cron runs
      ● Time: max lifetime allocated to process items during a cron run; useless with skip on cron = TRUE
      ● Worker callback: an implementation of callback_queue_worker(mixed queue_item): void
      Default Runner:
      ● In the cron subsystem
      ● Pokemon exception handling (catches every exception)
  16. D8 Queue API
      ● API usable without cron
      Declaring queue workers:
      ● Service: plugin.manager.queue_worker
      ● Instantiates QueueWorker plugins
      Definition:
      ● Cron, not enabled by default
        ○ Time: max lifetime allocated to process items during a cron run
      ● Core examples: AggregatorRefresh, LocaleTranslation
      ● hook_queue_info_alter()
      Default Runner:
      ● In the cron subsystem: Drupal\Core\Cron::processQueues()
      ● SuspendQueueException: $q->releaseItem()
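The default runner described on this slide combines a time budget with a "suspend" signal that releases the current item. The Python sketch below models that control flow under stated assumptions; it is not a port of `processQueues()`. `SuspendQueue`, `ListQueue`, and `process_queue` are hypothetical stand-ins, and leaving a failed item claimed is one possible policy for ordinary exceptions.

```python
import time


class SuspendQueue(Exception):
    """Stand-in for a 'suspend this queue' signal raised by a worker."""


class ListQueue:
    """Tiny in-memory queue with claim / delete / release, for illustration only."""

    def __init__(self, items):
        self.waiting = list(items)
        self.claimed = []

    def claim_item(self):
        if not self.waiting:
            return None
        item = self.waiting.pop(0)
        self.claimed.append(item)
        return item

    def delete_item(self, item):
        self.claimed.remove(item)

    def release_item(self, item):
        self.claimed.remove(item)
        self.waiting.insert(0, item)


def process_queue(queue, worker, time_budget, clock=time.monotonic):
    """Process items until the budget is spent, the queue is empty, or the worker suspends."""
    deadline = clock() + time_budget
    processed = 0
    # The budget is only checked between items: a slow item is never interrupted.
    while clock() < deadline:
        item = queue.claim_item()
        if item is None:
            break
        try:
            worker(item)
        except SuspendQueue:
            queue.release_item(item)  # hand the item back and stop this queue
            break
        except Exception:
            continue                  # item stays claimed; a real runner would log this
        queue.delete_item(item)       # work done
        processed += 1
    return processed
```

Because the deadline is tested only at the top of the loop, a single long-running item can overrun the budget, which is the non-preemptive behaviour such a runner implies.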
  17. Queue API methods: Queue
      QueueInterface
      ● Q::createItem(mixed $data): void
      ● Q::claimItem($lease_time = 3600): mixed $item
        ○ FALSE | stdClass + [item_id => int, data => mixed, created => timestamp]
        ○ $lease_time → Assumptions for runner, currently not used
      ● Q::deleteItem($item): void → work done
      ● Q::releaseItem($item): bool
      ● Q::numberOfItems(): int → best guess, unreliable
      ● Q::createQueue() / Q::deleteQueue()
      ReliableQueueInterface: ordering, single execution
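The claim / delete / release life cycle listed above is easier to follow with a small model. The Python sketch below mirrors the method names in snake_case and shows one way a backend could honour a lease time. It is an assumption-laden illustration, not Drupal's implementation: `LeaseQueue`, the `expire` field, and the reclaim-after-expiry rule are hypothetical.

```python
import itertools
import time
from dataclasses import dataclass
from typing import Any


@dataclass
class Item:
    item_id: int
    data: Any
    created: float
    expire: float = 0.0  # 0 means "not claimed"


class LeaseQueue:
    """In-memory model of create / claim / delete / release with a lease."""

    def __init__(self, clock=time.time):
        self.clock = clock
        self.items = {}
        self.ids = itertools.count(1)

    def create_item(self, data):
        item_id = next(self.ids)
        self.items[item_id] = Item(item_id, data, self.clock())
        return item_id

    def claim_item(self, lease_time=3600):
        now = self.clock()
        for item in self.items.values():  # dicts keep insertion order: FIFO
            if item.expire <= now:        # unclaimed, or the lease has run out
                item.expire = now + lease_time
                return item
        return False                      # mirrors the FALSE return on the slide

    def delete_item(self, item):
        self.items.pop(item.item_id, None)  # work done

    def release_item(self, item):
        if item.item_id not in self.items:
            return False
        self.items[item.item_id].expire = 0.0  # claimable again right away
        return True

    def number_of_items(self):
        return len(self.items)
```

In this model an item whose worker crashed becomes claimable again once its lease expires, which is why a lease is useful even when no runner enforces it as a timeout.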
  18. Queue API methods: others
      Queue service → QueueFactory::get($name, $reliable)
      QueueManager: a vanilla plugin manager
      ● In charge of hook_queue_info_alter()
      ● createInstance($plugin_id, $configuration)
      QueueWorkerInterface:
      ● processItem(mixed data): void, @throws SuspendQueueException
  19. Queue Runners
      Core / Contrib
      ● Core Cron / Elysia Cron / Queue_Runner
      ● Drush: queue-list / queue-run
      ● Similar limitations:
        ○ Default on in D6 / D7, default off in D8
        ○ Limited timeout support: non preemptive
        ○ Single threaded, single process across queues
      Custom runners
      ● Provided by queue modules or per-project one-offs
      ● Preemption, parallel execution...
  20. Queue API limitations
      Limited FIFO paradigm
      ● D8: non-Reliable QueueInterface: datagram
      No monitoring
      No queue disciplines
      ● Priority management
      ● Tagging
      ● Delay, burying ...
      Implementations may provide more
      ● Item structure is free-form: add richer interfaces
      No Peek(), no LIFO, no deduplication: hacks
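As an example of the kind of hack the last line alludes to, deduplication can be layered on top of a plain FIFO by exploiting the free-form item structure. The Python sketch below is one hypothetical approach, not something the deck or Drupal prescribes: `DedupQueue` and the `key` field are illustrative names.

```python
from collections import deque


class DedupQueue:
    """FIFO that refuses a new item while one with the same key is still waiting."""

    def __init__(self):
        self.items = deque()
        self.pending_keys = set()

    def create_item(self, key, data):
        """Enqueue unless an item with the same key is already waiting."""
        if key in self.pending_keys:
            return False
        self.pending_keys.add(key)
        self.items.append({"key": key, "data": data})
        return True

    def claim_item(self):
        if not self.items:
            return None
        item = self.items.popleft()
        self.pending_keys.discard(item["key"])  # the same key may be queued again from now on
        return item
```

The side set only works when every producer goes through the same wrapper; with a shared backend the key check would have to live in the backend itself.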
  21. Performance edge
      Runners:
      ● Avoid active polling à la core DB
      ● Use a blocking layer + select()
      ● Parallel handling of multiple queues → multiple runners, scheduling
      Workers: read after write
      ● Write in the queue → cache invalidated
      ● Read again → cache primed
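The difference between active polling and a blocking runner can be shown with Python's standard `queue.Queue`, whose `get()` sleeps until an item arrives. This is a sketch of the idea only: the in-process queue stands in for a blocking queue server, and `STOP`, `blocking_runner`, and `run_demo` are hypothetical names.

```python
import queue
import threading

STOP = object()  # sentinel telling the runner to exit


def blocking_runner(jobs, worker, results):
    """Sleep inside get() until an item arrives: no polling loop, no sleep() calls."""
    while True:
        item = jobs.get()  # blocks until a producer puts something
        if item is STOP:
            break
        results.append(worker(item))


def run_demo(payloads):
    """Start one runner thread, feed it payloads, and return what the worker produced."""
    jobs = queue.Queue()
    results = []
    runner = threading.Thread(
        target=blocking_runner, args=(jobs, lambda value: value * 2, results)
    )
    runner.start()
    for payload in payloads:
        jobs.put(payload)
    jobs.put(STOP)
    runner.join()
    return results
```

Starting several such threads or processes, one per queue, is the simplest form of the parallel handling mentioned in the runner bullets.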
  22. Sprints: all week
      Sprint with the Community until Sunday. We have tasks for every skillset. Mentors are available for new contributors. Follow @drupalmentoring.
      Photo: https://www.flickr.com/photos/amazeelabs/9965814443/in/faves-38914559@N03/
