Django Celery - A distributed task queue

A distributed task queue, as presented at tasks #8 by softbinator

  1. django-celery: a distributed task queue
     alex@eftimie.ro
     2013/04/11
  2. problem
     fast sites > slow sites
     caching, load balancing, CDN, nosql, ajax, ...
  3. special problems
     - send an email to 10,000 users
     - add a watermark to an uploaded video
     - generate a big PDF
  4. old-school solution: crontab
     the good:
       - simple
       - easy to set up
     the bad:
       - too simple
       - hard to scale
  5. alternatives
     scheduled (crontab-ish):
       - APScheduler
     parallel work (celery-ish):
       - gearman
       - Huey
       - django-ztask
     not a fan :-)
  6. introducing Celery
  7. introducing Celery
     - asynchronous job queue/task queue
     - based on distributed message passing
     - focused on real-time operation
     - works for scheduled tasks too!
     - open source and integrated with Django
  8. how does it work?
     [diagram: website (view), celery, MB (message broker), worker 1 ... worker n, results]
  9. task
         @celery.task
         def add(x, y):
             return x + y
         ...
         add.delay(2, 2)  # somewhere in a view
     - atomic
     - ideally idempotent
     - same environment as the website
     - real-time or scheduled
  10. task states
      PENDING, STARTED, SUCCESS, FAILURE, RETRY, REVOKED
  11. a few tips about tasks
      - granularity
      - data locality
      - state
        “asserting the world is the responsibility of the task”
  12. subtasks
      tasks spawned from within another task
      calling:
          add.subtask((1, 1)).delay()  # subtask() takes the arguments as one tuple
          add.s(1, 1).delay()          # this is a shortcut
      the primitives: chains, groups, chords, maps, chunks
  13. subtasks - partials
          add.s(1, 1).delay()    # 1 + 1
      partial:
          partial = add.s(1)     # incomplete definition
          partial.delay(2)       # 1 + 2, same as: add.s(1, 2).delay()
          partial.delay(0)       # 1 + 0
  14. subtasks: chains
      subtasks running one after another
          chain(add.s(4, 4), mul.s(8), mul.s(10))
      or
          add.s(4, 4) | mul.s(8) | mul.s(10)
      result: ((4 + 4) * 8) * 10
  15. subtasks: groups
      independent tasks running in parallel
          g = group(add.s(2, 2), add.s(4, 4))
          res = g()
          res.get()
          [4, 8]
  16. subtasks: chords
      same as groups, but apply a callback to the results
          @task
          def ts(numbers):
              return sum(numbers)

          chord(add.s(i, i) for i in range(3))(ts.s())
          6  # sum([0, 2, 4])
      general form: chord(headers)(callback)
  17. subtasks: chunks
      split a long list of arguments into parts
          res = add.chunks(zip(range(100), range(100)), 10)()
          >>> res.get()
          [[0, 2, 4, 6, 8, 10, 12, 14, 16, 18],
           [20, 22, 24, 26, 28, 30, 32, 34, 36, 38],
           [40, 42, 44, 46, 48, 50, 52, 54, 56, 58],
           [60, 62, 64, 66, 68, 70, 72, 74, 76, 78],
           [80, 82, 84, 86, 88, 90, 92, 94, 96, 98],
           [100, 102, 104, 106, 108, 110, 112, 114, 116, 118],
           [120, 122, 124, 126, 128, 130, 132, 134, 136, 138],
           [140, 142, 144, 146, 148, 150, 152, 154, 156, 158],
           [160, 162, 164, 166, 168, 170, 172, 174, 176, 178],
           [180, 182, 184, 186, 188, 190, 192, 194, 196, 198]]
      (10 tasks of 10 adds each)
  18. integrate with django project
      - tasks module inside django app
      - configuration setup
          - message broker
          - result backend
      - start a worker
            $ python manage.py celery worker
        (start another, tune concurrency, monitor, ...)
  19. hidden ad
      supervisord.org
  20. case study: korect
      quiz exam management and automatic paper processing using OMR
      existing solution: desktop app, ~2 tests/minute
      using ReportLab, PyPDF, OpenCV
  21. django + celery version
      ~20 tests/minute
      same machine, 4 worker processes
      parallelized parts:
        - print file generation
        - paper scanning
        - correcting and grading
        - question usage report
  22. example - generate print file
      - add a Download object to db
      - delay the task (not the real task):
            chord([test_pdf for t in tests])(merge_pdf)
      - update page with dw status
      ...
      - at the end of merge_pdf: update status, flash user
  23. last slide
      - high availability, high performance solution
      - easy to set up
      - fun to use
      celeryproject.org
  24. ?
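
Slides 13 to 16 introduce partials, chains, groups and chords only through one-line calls. What follows is a rough plain-Python model of what those calls compute, meant to make the data flow visible. It does not import Celery and runs everything synchronously in one process; `Sig`, `chain`, `group` and `chord` are hypothetical stand-ins written for this sketch, not Celery's classes. One detail it tries to mirror: when a partial signature is finally called, the new arguments go in front of the stored ones, so a chain passes each result as the first argument of the next step.

```python
# Plain-Python model of the primitives from slides 13 to 16.
# All names below are illustrative assumptions, not Celery's API.

def add(x, y):
    return x + y


def mul(x, y):
    return x * y


class Sig:
    """A function plus some of its arguments, in the spirit of add.s(1)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __call__(self, *more):
        # Arguments supplied later are placed before the stored ones.
        return self.func(*more, *self.args)


def chain(*sigs):
    """Run the signatures in order, feeding each result to the next."""
    result = sigs[0]()
    for sig in sigs[1:]:
        result = sig(result)
    return result


def group(*sigs):
    """Run independent signatures and collect the results in order."""
    return [sig() for sig in sigs]


def chord(header, callback):
    """Run a group, then pass the list of results to the callback."""
    return callback(group(*header))


partial = Sig(add, 1)            # like add.s(1)
partial_result = partial(2)      # calls add(2, 1)
chain_result = chain(Sig(add, 4, 4), Sig(mul, 8), Sig(mul, 10))
group_result = group(Sig(add, 2, 2), Sig(add, 4, 4))
chord_result = chord([Sig(add, i, i) for i in range(3)], Sig(sum))
```

In real Celery the same shapes run on workers and return result objects that are resolved with `.get()`; the model only shows which values end up where.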

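The chunks output on slide 17 can be reproduced without a worker, which may help when checking how a chunk size maps to the number of tasks. This is a minimal sketch in plain Python; the helper `chunks` is a hypothetical function written for this example and only imitates the grouping, not the distribution of work.

```python
def add(x, y):
    return x + y


def chunks(func, arg_tuples, size):
    """Apply func to every argument tuple, grouped into lists of `size`.

    Each inner list corresponds to the work one task would do.
    """
    arg_tuples = list(arg_tuples)
    return [
        [func(*args) for args in arg_tuples[start:start + size]]
        for start in range(0, len(arg_tuples), size)
    ]


# 100 argument pairs, 10 per chunk: 10 "tasks" of 10 adds each.
res = chunks(add, zip(range(100), range(100)), 10)
```

The trade-off is the granularity tip from slide 11: larger chunks mean fewer messages through the broker, smaller chunks spread more evenly across workers.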
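
Slides 9 and 11 ask for tasks that are ideally idempotent and that "assert the world" themselves. One way to read that advice is sketched below: the task receives only an id, looks up the current state when it runs, and does nothing if the work is already done or no longer applies. The `users` dict, the `outbox` list and `send_welcome_email` are hypothetical names standing in for a database table, a mail server and a Celery task.

```python
# Hypothetical in-memory stand-in for a table of users.
users = {
    1: {"email": "a@example.com", "welcomed": False},
    2: {"email": "b@example.com", "welcomed": True},
}
outbox = []  # records "sent" emails instead of contacting a mail server


def send_welcome_email(user_id):
    """Send the welcome email at most once per user.

    Returns True if an email was sent, False if there was nothing to do.
    """
    user = users.get(user_id)
    # Assert the world: the user may have been deleted or already
    # handled between queueing the task and running it.
    if user is None or user["welcomed"]:
        return False
    outbox.append(user["email"])
    user["welcomed"] = True
    return True


first = send_welcome_email(1)    # does the work
second = send_welcome_email(1)   # retry or duplicate delivery: no-op
already = send_welcome_email(2)  # state changed before the task ran
missing = send_welcome_email(3)  # user no longer exists
```

Passing the id rather than the user object also keeps the message small and avoids acting on stale data, which fits the data locality tip.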