Face it: most Drupal intranets / extranets / back-offices feel sluggish, and that's because they do too much during the page cycle. Make them snappier by deferring work to a Queue worker.
4. Why use queues ?
To have websites which are :
● Faster for visitors
● Snappier for editors
● More scaleable
To process time-consuming jobs :
● Video encoding
● High-resolution gallery uploads and processing
5. Actual use cases
● Prepare content for non-Drupal front-ends
● Anticipate content generation
● Deferred submits, e.g. comments handling
● Slow operations: node saves, previews, image processing
● External data sources: pull, push
● Multi-step operations: batch
7. Anticipated content generation
Blocks
Ctools content types
Controllers
etc.
Contrib :
http://github.com/FGM/lazy
Content created Served from cache
Fresh Stale Expiredt0 t1 t2
Served from cache Regenerate cache
time
Usual Drupal
Content created Served from cache
Fresh Stale Fresht0 t1 t2
Served from cache
+ request update Store
Served from cache
time
Anticipated content generation
12. Job servers
● How to get results
● Rerun failed jobs
● Separate queue for failed jobs
● Monitoring queues, workers
● Supervisor
13. Some implementations
Queue D6 D7 D8
Memory core core
Database OK core core
AdvancedQueue OK Not yet
Amazon SQS (aws_sqs) OK Not yet
Beanstalkd OK OK
evQueue Private
Queue D6 D7 D8
Apache Kafka OK Started
Gearman OK OK Not yet
MongoDB OK Started
PHPResque OK Not yet
RabbitMQ OK OK
Redis (redis[_queue]) OK OK Alpha
14. Queue API: concepts
Queue: a minimally-featured FIFO
Worker: the code actually doing the work
Item: a piece of workload submitted to the queue
Runner: the process triggering/monitoring workers
Batch subsystem: a high-level API on top of Queue API
D8: Manager, Plugins
15. D6/D7 Queue API
D7: core
D6: drupal_queue module
Declaring queues:
hook_cron_queue_info[_alter]()
● “Skip on cron”: enable decoupling from cron runs
● Time: max lifetime allocated to process items
during a cron run, useless with skip on cron =
TRUE
● Worker callback: an implementation of
callback_queue_worker (mixed
queue_item): void
API useable without cron
Default Runner:
● In the cron subsystem
● Pokemon exception handling
16. D8 Queue API
API useable without cron Declaring queue workers:
Service: plugin.manager.queue_worker
Instantiates QueueWorker plugins
Definition:
● Cron, not enabled by default
○ Time: max lifetime allocated to
process items during a cron run
● Core examples : AggregatorRefresh,
LocaleTranslation
● hook_queue_info_alter()
Default Runner:
In the cron subsystem:
DrupalCoreCron::processQueues()
SuspendQueueException: $q-
>releaseItem()
17. Queue API methods: Queue
QueueInterface
● Q::createItem(mixed $data: void
● Q::claimItem($lease_time = 3600: mixed $item
○ FALSE | stdClass + [item_id => int, data => mixed, created => timestamp]
○ $lease_time → Assumptions for runner, currently not used
● Q::deleteItem($item): void -> work done
● Q::releaseItem($item): bool
● Q::numberOfItems(): int → best guess, unreliable
● Q::createQueue() / Q::deleteQueue()
ReliableQueueInterface: ordering, single execution
18. Queue API methods: others
Queue service → QueueFactory::get($name, $reliable)
QueueManager: a vanilla plugin manager
● In charge of hook_queue_info_alter()
● createInstance($plugin_id, $configuration)
QueueWorkerInterface:
● processItem (mixed data) : void @throws SuspendQueueException
19. Queue Runners
Core / Contrib
● Core Cron / Elysia Cron / Queue_Runner
● Drush: queue-list / queue-run
● Similar limitations:
○ Default on in D6 / D7, default off in D8
○ Limited timeout support: non preemptive
○ Single threaded, single process across queues
Custom runners
● Provided by queue modules or per-project one-offs
● Preemption, parallel execution...
20. Queue API limitations
Limited FIFO paradigm
● D8: non-Reliable
QueueInterface: datagram
No monitoring
No queue disciplines
● Priority management
● Tagging
● Delay, burying ...
Implementations may provide more
● Item structure is free-form: add
richer interfaces
No Peek(), no LIFO,
no deduplication: hacks
21. Performance edge
Runners:
● Avoid active polling à la core DB
● Use a blocking layer + select()
● Parallel handling of multiple queues → multiple runners, scheduling
Workers: read after write
● Write in the queue → cache invalidated
● Read again→ cache primed