Introduction to Python Celery

Published in: Technology

  1. Python Celery
     The Distributed Task Queue
     Mahendra M, @mahendra
     http://creativecommons.org/licenses/by-sa/3.0/
  2. About me
     ● Solutions Architect at Infosys, Product Incubation Group
     ● Worked on FOSS for 10 years
     ● BLUG, FOSS.in (ex) member
     ● Linux, NetBSD embedded developer
     ● Mostly Python programmer
     ● Mostly sticks to server side stuff
     ● Loves CouchDB and Twisted
  3. Job Queues – the need
     ● More complex systems on the web
     ● Asynchronous processing is required
        ● Flickr – Image resizing
        ● User profiles – synchronization
        ● Asynchronous database updates
        ● Click counters
        ● Likes, favourites, recommend
  4. How it works
     [Diagram: User, Web service, Job Queue and two Workers, with steps 1, 2 and 3 marked. Caption: Placing a request]
  5. How it works
     [Diagram: same components, with steps 2, 4, 5 and 6 marked. Caption: The job is executed]
  6. How it works
     [Diagram: same components, with steps 2, 4, 5, 6, 7 and 8 marked. Caption: Client fetches the result]
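
The flow in the three "How it works" slides (a request is placed, a worker executes the job, the client fetches the result) can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not Celery itself: `job_queue`, `results`, `worker`, `place_request` and `resize` are hypothetical names, and an in-process `queue.Queue` stands in for a real message broker.

```python
import queue
import threading
import uuid

job_queue = queue.Queue()        # stands in for the broker's job queue
results = {}                     # stands in for a result store
results_lock = threading.Lock()


def worker():
    """Pull jobs off the queue, run them, and store each result."""
    while True:
        job = job_queue.get()
        if job is None:          # sentinel: shut this worker down
            job_queue.task_done()
            break
        job_id, func, args = job
        value = func(*args)
        with results_lock:
            results[job_id] = value
        job_queue.task_done()


def place_request(func, *args):
    """Web-service side: enqueue a job and return its id immediately."""
    job_id = str(uuid.uuid4())
    job_queue.put((job_id, func, args))
    return job_id


def resize(width, height, factor):
    # Placeholder job: a real worker would resize an image here.
    return (width // factor, height // factor)


workers = [threading.Thread(target=worker) for _ in range(2)]
for t in workers:
    t.start()

job_id = place_request(resize, 1024, 768, 4)   # placing a request
job_queue.join()                               # the job is executed
with results_lock:
    outcome = results[job_id]                  # client fetches the result
print(outcome)

# Stop the workers with one sentinel each.
for _ in workers:
    job_queue.put(None)
for t in workers:
    t.join()
```

In a real deployment the web service returns the job id to the user right away and the result is fetched later, rather than blocking on `job_queue.join()` as this sketch does.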
  7. AMQP
     ● Advanced Message Queuing Protocol
        ● Self explanatory :-)
     ● Open, language agnostic, implementation agnostic
     ● Transmits messages from producers to consumers
     ● Immensely popular
     ● Open source implementations available.
  8. AMQP
  9. AMQP
     Queues are bound to exchanges to determine message delivery
     ● Direct - From Exchange to Queue
     ● Topic - Queue is selected based on a topic
     ● Fanout - All queues are selected
     ● Headers - based on message headers
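
How the exchange types select queues can be illustrated with a small pure-Python model. This is a rough sketch, not a broker or client library: `route` and `topic_matches` are hypothetical helpers, bindings are modelled as `(queue_name, binding_key)` pairs, and the headers exchange is left out. The topic rules assumed here are the usual AMQP ones, where `*` matches exactly one dot-separated word and `#` matches zero or more.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match on dot-separated words."""
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' consumes zero words, or one word while staying active.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (p[0] == "*" or p[0] == k[0]) and match(p[1:], k[1:])

    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """Return the names of the queues that would receive the message."""
    if exchange_type == "fanout":
        return [q for q, _ in bindings]                 # every bound queue
    if exchange_type == "direct":
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError("unsupported exchange type: %s" % exchange_type)


bindings = [
    ("images", "media.image.*"),
    ("all_media", "media.#"),
    ("email", "mail.send"),
]

print(route("direct", bindings, "mail.send"))
print(route("topic", bindings, "media.image.resize"))
print(route("fanout", bindings, "ignored"))
```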
  10. AMQP as a job queue
      ● AMQP structure is similar to our job queue design
      ● Jobs are sent as messages
      ● Job results are sent back as messages
      ● Celery Framework simplifies this for us
  11. Python Celery
      ● Python based distributed task queue built on top of message queues
      ● Very robust, good error handling, guaranteed ...
      ● Distributed (across machines)
      ● Concurrent within a box
      ● Supports job scheduling (eta, cron, date, …)
      ● Synchronous and asynchronous operations
      ● Retries and task grouping
  12. Python Celery ...
      ● Web hooks
      ● Job Routing
         ● based on AMQP message routing
      ● Remote control of workers
      ● Rate limit, delete, revoke tasks
      ● Monitoring
      ● Tracebacks of errors, Email notifications
      ● Django Integration
  13. Celery Architecture
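  14. Defining a task
      # Import path from Celery 2.x; later releases define tasks
      # with @app.task on a Celery() application instance.
      from celery.decorators import task

      @task
      def add(x, y):
          return x + y

      >>> result = add.delay(4, 4)
      >>> result.wait() # wait for result
      8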
  15. Running a task
      ● Synchronous run
         ● task.apply( … )
      ● Asynchronous
         ● task.apply_async( … )
      ● Tasksets – schedule task with different arguments
         ● Think of it like map reduce
      ● Scheduled execution
      ● Auto retries and max_retry support
      ● Ensure worker availability
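
The difference between a synchronous run, an asynchronous run, a taskset and automatic retries can be illustrated without a broker. The sketch below is an analogy built on `concurrent.futures` from the standard library, not the Celery API: `run_with_retries` and `flaky` are hypothetical helpers, and a thread pool stands in for the workers.

```python
from concurrent.futures import ThreadPoolExecutor


def add(x, y):
    return x + y


def run_with_retries(func, args, max_retries=3):
    """Call func(*args), retrying up to max_retries times after a failure."""
    attempt = 0
    while True:
        try:
            return func(*args)
        except Exception:
            if attempt >= max_retries:
                raise
            attempt += 1


calls = {"count": 0}


def flaky(x):
    """Fail on the first two calls, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return x * 2


with ThreadPoolExecutor(max_workers=4) as pool:
    # Synchronous run: the caller executes the task and blocks.
    sync_result = add(4, 4)

    # Asynchronous run: submit returns a handle at once; result() waits.
    future = pool.submit(add, 4, 4)
    async_result = future.result()

    # Taskset analogy: the same task over several argument sets (map),
    # followed by an aggregation step (reduce).
    partials = list(pool.map(lambda args: add(*args), [(1, 2), (3, 4), (5, 6)]))
    total = sum(partials)

    # Retries: the task is re-run until it succeeds or the limit is hit.
    retried = pool.submit(run_with_retries, flaky, (21,), 3).result()

print(sync_result, async_result, partials, total, retried)
```

With Celery the submitted work would travel through the message queue to a worker process, possibly on another machine, which is what makes worker availability a concern.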
  16. Django Features
      ● djcelery in INSTALLED_APPS
      ● Uses Django features
         ● ORM for storing task details
         ● settings.py for configuration
      ● Celery commands are part of django commands
         ● Run celery workers using manage.py
      ● Task registration and auto discovery - tasks.py
      ● Schedule jobs directly from view handlers
      ● View handlers for task status monitoring via Ajax
  17. Demo
  18. Advantages of AMQP
      ● Scaling is an "admin" job
         ● Workers can be added and removed any time
         ● Scale on need basis
         ● Deploy on cloud setups
      ● Jobs can be routed to workers based on admin setups
      ● Jobs can be prioritized based on AMQP protocol
         ● Not supported in rabbitmq (not sure of 2.x)
      ● Can be deployed on a single node also.
  19. Where should I use it?
      ● Background computations
      ● Anything outside the request-response cycle
      ● Run System commands or applications
         ● Imagemagick (convert) for resize
      ● Integration with external systems (APIs)
      ● Use webhooks for integrating independent systems
      ● Result aggregations (db updates, like, ratings ..)
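
As an example of running a system command in the background, a worker job that shells out to ImageMagick might look roughly like this. It is a sketch under assumptions: `build_resize_command` and `resize_image` are hypothetical names, the default size is arbitrary, and the `convert` binary is assumed to be installed on the worker machine.

```python
import shutil
import subprocess


def build_resize_command(src, dst, size):
    """Build the ImageMagick command line that resizes src into dst."""
    return ["convert", src, "-resize", size, dst]


def resize_image(src, dst, size="800x600"):
    """Run the resize; meant to be called from a background worker."""
    if shutil.which("convert") is None:
        raise RuntimeError("ImageMagick 'convert' not found on PATH")
    # check=True raises CalledProcessError if convert exits non-zero,
    # which lets the task queue record the failure or retry the job.
    subprocess.run(build_resize_command(src, dst, size), check=True)


print(build_resize_command("in.jpg", "thumb.jpg", "200x200"))
```

Passing the arguments as a list avoids invoking a shell, so file names coming from users are not interpreted as shell syntax.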
  20. Where to avoid it?
      ● Ensure that you absolutely need a task queue
      ● Sometimes it might be easier to avoid it
         ● Simple database updates / inserts (log like)
         ● Sending emails/sms (it is already a message queue ...)
  21. Links
      ● http://celeryproject.org
      ● http://celery.org/docs/getting-started/
      ● http://amqp.org/
      ● http://en.wikipedia.org/AMQP
      ● http://rabbitmq.org/
      ● http://slideshare.net/search/slideshow?q=celery