Delayed operations with queues
Yuriy Gerasimov
Frédéric G. Marand
Session track: PHP
Who are we?
Yuriy Gerasimov
● FFW
● Drupal architect & developer
● Contrib 7 modules: services, draggableviews
● Founder at Backtrac.io
ygerasimov
Frédéric G. Marand
● OSInet: performance/architecture consulting for internal teams at larger accounts
● Core contributor 4.7 to 8.0.x, MongoDB + XMLRPC maintainer + others
● Already 4 D8 customer projects before 8.0.0
● Customer D8 in production since 07/2015
● Frequently adds queueing to larger Drupal projects
fgm
Why use queues?
To have websites which are:
● Faster for visitors
● Snappier for editors
● More scalable
To process time-consuming jobs:
● Video encoding
● High-resolution gallery uploads and processing
Concrete use cases
● Prepare content for non-Drupal front-ends
● Anticipate content generation
● Deferred submits, e.g. comments handling
● Slow operations: node saves, previews, image processing
● External data sources: pull, push
● Multi-step operations: batch
Cooking for front-ends
[Diagram: front end]
Anticipated content generation
● Applies to blocks, Ctools content types, controllers, etc.
● Contrib: http://github.com/FGM/lazy
Timeline: usual Drupal
● t0: content created, cache fresh → served from cache
● t1: cache stale → served from cache
● t2: cache expired → regenerate cache
Timeline: anticipated content generation
● t0: content created, cache fresh → served from cache
● t1: cache stale → served from cache + request update
● t2: update stored, cache fresh again → served from cache
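The second timeline can be modelled in a few lines. This is a minimal, language-neutral sketch in Python, not the code of the lazy module: the class name `AnticipatedCache`, its parameters and the in-process `refresh_queue` are all assumptions made for illustration. Fresh entries are served directly, stale entries are still served but a refresh is queued, and only missing or expired entries are rebuilt during the request.

```python
import time
from collections import deque


class AnticipatedCache:
    """Illustrative model: serve stale content and queue its regeneration."""

    def __init__(self, build, fresh_for, expire_after, clock=time.monotonic):
        self.build = build                # callable(key) -> value, the slow part
        self.fresh_for = fresh_for        # seconds before an entry turns stale
        self.expire_after = expire_after  # seconds before an entry is unusable
        self.clock = clock
        self.entries = {}                 # key -> (value, stored_at)
        self.refresh_queue = deque()      # keys waiting for a background rebuild

    def _store(self, key):
        value = self.build(key)
        self.entries[key] = (value, self.clock())
        return value

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is None or now - entry[1] >= self.expire_after:
            # Missing or expired: the visitor pays for the rebuild.
            return self._store(key)
        value, stored_at = entry
        if now - stored_at >= self.fresh_for and key not in self.refresh_queue:
            # Stale: answer from cache now, request an update for later.
            self.refresh_queue.append(key)
        return value

    def run_worker(self):
        """Rebuild every queued key; returns the number of rebuilt entries."""
        rebuilt = 0
        while self.refresh_queue:
            self._store(self.refresh_queue.popleft())
            rebuilt += 1
        return rebuilt
```

In a real site `run_worker()` would live in a separate queue runner, so the visitor never waits for `build()` unless the entry has fully expired.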
Comment handling
“Pull” data sources (aggregator)
“Push” data sources
Image processing
Job servers
● How to get results
● Rerun failed jobs
● Separate queue for failed jobs
● Monitoring queues, workers
● Supervisor
Some implementations
Queue                  D6    D7    D8
Memory                 -     core  core
Database               OK    core  core
AdvancedQueue          -     OK    Not yet
Amazon SQS (aws_sqs)   -     OK    Not yet
Beanstalkd             -     OK    8.1/8.2
evQueue                -     -     Started
IronMQ (iron.io)       -     OK    Not yet
Gearman                OK    OK    Not yet
MongoDB                -     OK    Started
PHPResque              -     -     -
RabbitMQ               -     OK    Not yet
Redis (redis_queue)    OK    OK    Not yet
Queues API: concepts
Queue: a minimally-featured FIFO
Worker: the code actually doing the work
Item: a piece of workload submitted to the queue
Runner: the process triggering/monitoring workers
Batch subsystem: a high-level API on top of Queue API
D8: Manager, Plugins
D6/D7 Queue API
D7: core
D6: drupal_queue module
Declaring queues:
hook_cron_queue_info[_alter]()
● “Skip on cron”: enable decoupling from cron runs
● Time: max lifetime allocated to process items during a cron run, useless with skip on cron = TRUE
● Worker callback: an implementation of callback_queue_worker(mixed $queue_item): void
API usable without cron
Default Runner:
● In the cron subsystem
● Pokemon exception handling
D8 Queue API
API usable without cron
Declaring queue workers:
Service: plugin.manager.queue_worker
Instantiates QueueWorker plugins
Definition:
● Cron, not enabled by default
○ Time: max lifetime allocated to process items during a cron run
● Core examples: AggregatorRefresh, LocaleTranslation
● hook_queue_info_alter()
Default Runner:
In the cron subsystem: Drupal\Core\Cron::processQueues()
SuspendQueueException: $q->releaseItem()
Queue API methods: Queue
QueueInterface
● Q::createItem(mixed $data): FALSE on failure (D7: TRUE on success, D8: item ID)
● Q::claimItem($lease_time = 3600): mixed $item
○ FALSE | stdClass + [item_id => int, data => mixed, created => timestamp]
○ $lease_time → Assumptions for runner, currently not used
● Q::deleteItem($item): void -> work done
● Q::releaseItem($item): bool
● Q::numberOfItems(): int → best guess, unreliable
● Q::createQueue() / Q::deleteQueue()
ReliableQueueInterface: ordering, single execution
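The claim / delete / release cycle listed above can be modelled as follows. This is a hedged sketch in Python, not Drupal code: `LeasedQueue` and its snake_case method names are hypothetical stand-ins that only mirror the method list on this slide, with an in-memory dict in place of a real backend.

```python
import itertools
import time
from types import SimpleNamespace


class LeasedQueue:
    """Illustrative in-memory model of a claim/delete/release queue."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._ids = itertools.count(1)
        self._records = {}  # item_id -> [item, lease_expiry]

    def create_item(self, data):
        item_id = next(self._ids)
        item = SimpleNamespace(item_id=item_id, data=data, created=self._clock())
        self._records[item_id] = [item, 0]  # expiry 0: not claimed
        return item_id

    def claim_item(self, lease_time=3600):
        now = self._clock()
        # Dicts keep insertion order, so the oldest claimable item wins (FIFO).
        for record in self._records.values():
            if record[1] <= now:
                record[1] = now + lease_time
                return record[0]
        return False  # nothing claimable right now

    def delete_item(self, item):
        # Work done: the item leaves the queue for good.
        self._records.pop(item.item_id, None)

    def release_item(self, item):
        # Work not done: make the item claimable again.
        record = self._records.get(item.item_id)
        if record is None:
            return False
        record[1] = 0
        return True

    def number_of_items(self):
        return len(self._records)
```

A worker that crashes without calling `delete_item()` or `release_item()` leaves its item locked until the lease expires, after which another claim can pick it up again.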
Queue API methods: others
Queue service → QueueFactory::get($name, $reliable)
QueueWorkerManager: a vanilla plugin manager
● In charge of hook_queue_info_alter()
● createInstance($plugin_id, $configuration)
QueueWorkerInterface:
● processItem(mixed $data): void @throws SuspendQueueException
Queue Runners
Core / Contrib
● Core Cron / Elysia Cron / Queue_Runner
● Drush: queue-list / queue-run
● Similar limitations:
○ Default on in D6 / D7, default off in D8
○ Limited timeout support: non-preemptive
○ Single-threaded, single process across queues
Custom runners
● Provided by queue modules or per-project one-offs
● Preemption, parallel execution...
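To make "limited timeout support: non-preemptive" concrete, here is a minimal runner sketch in Python. It is not the core cron runner: `run_queue`, `SuspendQueue` and the returned `failed` list are hypothetical names chosen for the example. The time budget is only checked between items, so a slow item always runs to completion and can overrun the budget.

```python
import time
from collections import deque


class SuspendQueue(Exception):
    """Hypothetical signal: the worker asks the runner to stop this queue."""


def run_queue(items, worker, time_budget, clock=time.monotonic):
    """Process a deque of items; returns (processed_count, failed_items)."""
    deadline = clock() + time_budget
    processed = 0
    failed = []
    # The deadline is tested between items only: no preemption.
    while items and clock() < deadline:
        item = items.popleft()
        try:
            worker(item)
        except SuspendQueue:
            # Put the item back at the head and leave the queue alone.
            items.appendleft(item)
            break
        except Exception:
            # "Pokemon" handling: catch them all, keep the runner alive.
            failed.append(item)
            continue
        processed += 1
    return processed, failed
```

With a 60 second budget and items taking 40 seconds each, this runner processes two items and finishes at 80 seconds; a preemptive custom runner would need a separate process or a signal to enforce the limit.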
Queue API limitations
Limited FIFO paradigm
● D8: non-Reliable QueueInterface: datagram
No monitoring
No queue disciplines
● Priority management
● Tagging
● Delay, burying ...
Implementations may provide more
● Item structure is free-form: add richer interfaces
No Peek(), no LIFO, no deduplication: hacks
Performance edge
Runners:
● Avoid active polling à la core DB
● Use a blocking layer + select()
● Parallel handling of multiple queues → multiple runners, scheduling
Workers: read after write
● Write in the queue → cache invalidated
● Read again → cache primed
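The difference between active polling and a blocking layer can be sketched as below. This is an illustration only: Python's standard `queue.Queue` stands in for a backend that offers a blocking read, and `blocking_runner` and `stop_marker` are hypothetical names. The runner sleeps inside `get()` until an item arrives, instead of querying the backend in a loop.

```python
import queue
import threading


def blocking_runner(jobs, worker, stop_marker=None):
    """Consume items until stop_marker is received; returns worker results."""
    results = []
    while True:
        # Blocks until an item is available: no busy loop, no repeated queries.
        item = jobs.get()
        if item is stop_marker:
            break
        results.append(worker(item))
    return results


if __name__ == "__main__":
    jobs = queue.Queue()
    collected = []
    runner = threading.Thread(
        target=lambda: collected.extend(blocking_runner(jobs, str.upper))
    )
    runner.start()          # the runner is now idle, waiting in get()
    for name in ["resize", "encode"]:
        jobs.put(name)      # each put wakes the runner up
    jobs.put(None)          # stop marker
    runner.join()
    print(collected)
```

Handling several queues in parallel then amounts to starting one such runner per queue, which matches the "multiple runners" bullet above.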
Sprint: Friday
https://www.flickr.com/photos/amazeelabs/9965814443/in/faves-38914559@N03/
Sprint with the Community on Friday.
We have tasks for every skillset.
Mentors are available for new contributors.
An optional Friday morning workshop for first-time sprinters will help you get set up.
Follow @drupalmentoring.
Delayed operations with queues for website performance