Introduction to Python Celery

2,546 views
2,266 views

Published on

Introduction to Python Celery

Published in: Technology
0 Comments
4 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
2,546
On SlideShare
0
From Embeds
0
Number of Embeds
38
Actions
Shares
0
Downloads
42
Comments
0
Likes
4
Embeds 0
No embeds

No notes for slide

Introduction to Python Celery

  1. 1. Python Celery The Distributed Task Queue Mahendra M @mahendra http://creativecommons.org/licenses/by-sa/3.0/   
  2. 2. About me ● Solutions Architect at Infosys, Product Incubation  Group ● Worked on FOSS for 10 years ● BLUG, FOSS.in (ex) member ● Linux, NetBSD embedded developer ● Mostly Python programmer ● Mostly sticks to server side stuff ● Loves CouchDB and Twisted   
  3. 3. Job Queues – the need ● More complex systems on the web ● Asynchronous processing is required ● Flickr – Image resizing ● User profiles – synchronization ● Asynchronous database updates ● Click counters ● Likes, favourites, recommend   
  4. 4. How it works Worker 1 2 User 3 Web service Job Queue Worker Placing a request   
  5. 5. How it works Worker 4 2 User Web service Job Queue 5 6 Worker The job is executed   
  6. 6. How it works Worker 4 2 User 7 Web service Job Queue 5 8 6 Worker Client fetches the result   
  7. 7. AMQP ● Advanced Message Queuing Protocol ● Self explanatory :­) ● Open, language agnostic, implementation agnostic ● Transmits messages from producers to consumers ● Immensely popular ● Open source implementations available.   
  8. 8. AMQP   
  9. 9. AMQP Queues are bound to exchanges to determine message  delivery ● Direct ­ From Exchange to Queue ● Topic ­ Queue is selected based on a topic ● Fanout – All queues are selected ● Headers – based on message headers   
  10. 10. AMQP as a job queue ● AMQP structure is similar to our job queue design ● Jobs are sent as messages ● Job results are sent back as messages ● Celery Framework simplifies this for us   
  11. 11. Python Celery ● Python based distributed task queue built on top of  message queues ● Very robust, good error handling, guaranteed ... ● Distributed (across machines) ● Concurrent within a box ● Supports job scheduling (eta, cron, date, …) ● Synchronous and asynchronous operations ● Retries and task grouping   
  12. 12. Python Celery ... ● Web hooks ● Job Routing  ● based on AMQP message routing ● Remote control of workers ● Rate limit, delete, revoke tasks ● Monitoring ● Tracebacks of errors, Email notifications ● Django Integration   
  13. 13. Celery Architecture   
  14. 14. Defining a task from celery.decorators import task @task def add(x, y): return x + y >>> result = add.delay(4, 4) >>> result.wait() # wait for result 8 >>>   
  15. 15. Running a task ● Synchronous run ● task.apply( … ) ● Asynchrnous ● task.apply_async( … ) ● Tasksets – schedule task with different arguments ● Think of it like map reduce ● Scheduled execution ● Auto retries and max_retry support  ● Ensure worker availability  
  16. 16. Django Features ● djcelery in INSTALLED_APPS ● Uses Django features ● ORM for storing task details ● settings.py for configuration ● Celery commands are part of django commands ● Run celery workers using manage.py ● Task registeration and auto discovery ­ tasks.py ● Schedule jobs directly from view handlers ● View handlers for task status monitoring via Ajax   
  17. 17. Demo   
  18. 18. Advantages of AMQP ● Scaling is an ”admin” job ● Workers can be added and removed any time ● Scale on need basis ● Deploy on cloud setups ● Jobs can routed to workers based on admin setups ● Jobs can be prioritized based on AMQP protocol ● Not supported in rabbitmq (not sure of 2.x) ● Can be deployed on a single node also.   
  19. 19. Where should I use ● Background computations ● Anything outside the request­response cycle ● Run System commands or applications ● Imagemagick (convert) for resize ● Integration with external systems (APIs) ● Use webhooks for Integrating independent systems ● Result aggregations (db updates, like, ratings ..)   
  20. 20. Where to avoid ? ● Ensure that you absolutely need a task queue ● Sometimes it might be easier to avoid it ● Simple database updates / inserts (log like) ● Sending emails/sms (it is already a message  queue ...)   
  21. 21. Links ● http://celeryproject.org ● http://celery.org/docs/getting­started/ ● http://amqp.org/ ● http://en.wikipedia.org/AMQP ● http://rabbitmq.org/ ● http://slideshare.net/search/slideshow?q=celery   

×