Celery: The Distributed Task Queue

An introduction to Celery from the April 6, 2010 meetup of ZPUGDC.

Presentation Transcript

  • Celery: An introduction to the distributed task queue. Rich Leland, ZPUGDC, April 6, 2010. @richleland, richard_leland@discovery.com, http://creative.discovery.com
  • What is Celery? A task queue based on distributed message passing.
  • What is Celery? An asynchronous, concurrent, distributed, super-awesome task queue.
  • A brief history
      • First commit in April 2009 as "crunchy"
      • Originally built for use with Django
      • Django is still a requirement
      • Don't be scurred! No Django app required!
      • Django is there only for its ORM, caching, and signaling
      • The plan is for Celery to use SQLAlchemy and louie instead
  • Why should I use Celery?
  • User perspective
      • Minimize the request/response cycle
      • Smoother user experience
      • The difference between a pleasant and an unpleasant experience
  • Developer perspective
      • Offload time- and CPU-intensive processes
      • Scalability: add workers as needed
      • Flexibility: many points of customization
      • About to turn 1 (April 24)
      • Actively developed
      • Great documentation
      • Lots of tutorials
  • LATENCY == DELAY == NOT GOOD!
  • Business perspective
      • Latency == $$$
      • Every 100ms of latency cost Amazon 1% in sales
      • Google found that an extra 0.5 seconds in search page generation time dropped traffic by 20%
      • A broker could lose $4 million in revenue per millisecond if its electronic trading platform is 5ms behind the competition
      • Source: http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it
  • Example uses
      • Image processing
      • Calculate points and award badges
      • Upload files to a CDN
      • Re-generate static files
      • Periodically generate graphs for enormous data sets
      • Send blog comments through a spam filter
      • Transcoding of audio and video
  • What do I need?
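  • [Architecture diagram: Users exchange requests and responses with the Application; the Application sends tasks to the Message Queue; Worker 1, Worker 2, Worker 3 ... Worker N; Result Store]
  • [Architecture diagram with concrete components. Result store: database, memcached, MongoDB, Redis, Tokyo Tyrant, or AMQP. Message queue: RabbitMQ, Stomp, Redis, or database. Workers: celeryd, celeryd, celeryd ... celeryd]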
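The moving parts in the diagrams above can be imitated in a single process to make the flow concrete. This is only a sketch using the Python standard library, not Celery itself: `task_queue` stands in for the message queue, `result_store` for the result store, and the `submit`, `worker`, and `run_demo` names are made up for illustration.

```python
import queue
import threading
import uuid

task_queue = queue.Queue()   # stands in for the message queue (e.g. RabbitMQ)
result_store = {}            # stands in for the result store
result_lock = threading.Lock()


def add(x, y):
    return x + y


def submit(func, *args):
    """Application side: enqueue a task and return its id immediately."""
    task_id = str(uuid.uuid4())
    task_queue.put((task_id, func, args))
    return task_id


def worker():
    """Worker side: process tasks until a None sentinel arrives."""
    while True:
        message = task_queue.get()
        if message is None:
            task_queue.task_done()
            break
        task_id, func, args = message
        value = func(*args)
        with result_lock:
            result_store[task_id] = value
        task_queue.task_done()


def run_demo(num_workers=3):
    """Start workers, submit five tasks, and collect results by task id."""
    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    task_ids = [submit(add, n, n) for n in range(5)]
    for _ in workers:
        task_queue.put(None)   # one shutdown sentinel per worker
    task_queue.join()          # wait until every message has been handled
    for w in workers:
        w.join()
    return [result_store[task_id] for task_id in task_ids]
```

The application only ever touches the queue and the result store, which is why workers can be added or removed without changing application code.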
  • USE RABBITMQ!
  • Installation
  • Installation
      1. Install the message queue from source or with a package manager
      2. pip install celery
      3. pip install -r http://github.com/ask/celery/blob/v1.0.2/contrib/requirements/default.txt?raw=true
      4. Configure the application
      5. Launch services (app server, rabbitmq, celeryd, etc.)
  • Usage
  • Configure
      • celeryconfig.py for pure Python
      • settings.py within a Django project
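For the pure-Python case, the configuration is an ordinary module; Celery's default module name is `celeryconfig`. The following is a rough sketch for a local RabbitMQ broker. The setting names follow the Celery 1.0-era documentation and should be checked against the version you install; every value (user, password, vhost, and the `tasks` module name) is a placeholder assumption.

```python
# celeryconfig.py: a sketch; all values below are placeholders.

# Where the broker (RabbitMQ) is listening.
BROKER_HOST = "localhost"
BROKER_PORT = 5672
BROKER_USER = "myuser"
BROKER_PASSWORD = "mypassword"
BROKER_VHOST = "myvhost"

# Where task results are stored; "amqp" sends them back over the broker.
CELERY_RESULT_BACKEND = "amqp"

# Modules the worker imports at startup so it can find task definitions.
CELERY_IMPORTS = ("tasks",)
```

In a Django project the same settings would go in settings.py instead.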
  • Define a task

      from celery.decorators import task

      @task
      def add(x, y):
          return x + y

  • Execute the task

      >>> from tasks import add
      >>> add.delay(4, 4)
      <AsyncResult: 889143a6-39a2-4e52-837b-d80d33efb22d>

  • Analyze the results

      >>> result = add.delay(4, 4)
      >>> result.ready()  # has the task finished processing?
      False
      >>> result.result  # task is not ready, so no return value yet.
      None
      >>> result.get()  # wait until the task is done and get the return value.
      8
      >>> result.result  # access result
      8
      >>> result.successful()
      True
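The same "submit now, ask later" lifecycle can be tried without a broker. This is a standard-library analogy, not Celery's API: `Future.done()` plays roughly the role of `result.ready()` and `Future.result()` roughly that of `result.get()`. The `gate` event and the `demo` function are made up here to hold the task back so the "not ready yet" state is observable.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def add(x, y, gate):
    gate.wait()   # hold the task until the caller releases it
    return x + y


def demo():
    gate = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(add, 4, 4, gate)   # roughly add.delay(4, 4)
        ready_before = future.done()            # roughly result.ready()
        gate.set()                              # let the task finish
        value = future.result(timeout=5)        # roughly result.get()
        ready_after = future.done()
    return ready_before, value, ready_after
```

Calling `demo()` returns `(False, 8, True)`: the handle exists before the work is done, and blocking for the value is an explicit choice by the caller.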
  • The Task class

      from datetime import date

      from celery.task import Task

      # Person is assumed to be a Django model defined elsewhere in the project.

      class CanDrinkTask(Task):
          """
          A task that determines if a person is 21 years of age or older.
          """

          def run(self, person_id, **kwargs):
              logger = self.get_logger(**kwargs)
              logger.info("Running determine_can_drink task for person %s" % person_id)

              person = Person.objects.get(pk=person_id)
              now = date.today()
              diff = now - person.date_of_birth
              # I know, I know, this doesn't account for leap years
              age = diff.days // 365
              person.can_drink = age >= 21
              person.save()
              return True
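The slide's own comment admits that dividing days by 365 ignores leap years. One way around that, sketched here with a hypothetical helper name `age_in_years`, is to compare calendar dates directly instead of counting days.

```python
from datetime import date


def age_in_years(date_of_birth, today):
    """Whole years elapsed between date_of_birth and today."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

For someone born on April 7, 1989, checked on April 6, 2010, this gives 20, while the days-divided-by-365 approach gives 21 because five leap days have accumulated by then.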
  • Task retries

      class CanDrinkTask(Task):
          """
          A task that determines if a person is 21 years of age or older.
          """
          default_retry_delay = 5 * 60  # retry in 5 minutes
          max_retries = 5

          def run(self, person_id, **kwargs):
              logger = self.get_logger(**kwargs)
              logger.info("Running determine_can_drink task for person %s" % person_id)
              ...
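What the two attributes mean can be shown with a plain-Python sketch. This is not how Celery implements retries: Celery re-queues the task for later rather than sleeping inside the worker, and sleeping is used here only to keep the example small. The names `run_with_retries` and `MaxRetriesExceeded` are made up for illustration.

```python
import time


class MaxRetriesExceeded(Exception):
    """Raised when a call still fails after the allowed number of retries."""


def run_with_retries(func, max_retries=5, retry_delay=5 * 60, sleep=time.sleep):
    """Call func; on failure wait retry_delay seconds and try again,
    giving up after max_retries retries (max_retries + 1 calls in total)."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception as exc:
            if attempt >= max_retries:
                raise MaxRetriesExceeded(
                    "gave up after %d retries" % max_retries
                ) from exc
            attempt += 1
            sleep(retry_delay)
```

Passing `sleep` as a parameter makes the waiting behavior easy to replace in tests, so nothing has to wait five real minutes.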
  • The PeriodicTask class

      from datetime import timedelta

      from celery.task import PeriodicTask

      # Person is assumed to be a Django model defined elsewhere in the project.

      class FullNameTask(PeriodicTask):
          """
          A periodic task that concatenates fields to form a person's full name.
          """
          run_every = timedelta(seconds=60)

          def run(self, **kwargs):
              logger = self.get_logger(**kwargs)
              logger.info("Running full name task.")

              for person in Person.objects.all():
                  person.full_name = " ".join([person.prefix, person.first_name,
                                               person.middle_name, person.last_name,
                                               person.suffix]).strip()
                  person.save()
              return True
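One detail of the join above: an empty field in the middle (say, no middle name) leaves a double space, and a `None` field makes `join` raise a `TypeError`. A small helper, hypothetically named `build_full_name` here, could skip blank parts before joining.

```python
def build_full_name(*parts):
    """Join name parts with single spaces, skipping None and blank parts."""
    return " ".join(part.strip() for part in parts if part and part.strip())
```

Inside the task this might be called as `build_full_name(person.prefix, person.first_name, person.middle_name, person.last_name, person.suffix)`, assuming those fields hold strings or `None`.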
  • Holy chock-full of features, Batman! Messaging, remote control, distribution, monitoring, concurrency, serialization, scheduling, tracebacks, performance, retries, return values, task sets, result stores, web views, webhooks, error reporting, rate limiting, supervising, routing, init scripts
  • Resources
  • Community
      • Friendly core dev: Ask Solem Hoel
      • IRC: #celery
      • Mailing lists: celery-users
      • Twitter: @ask
  • Docs and articles
      • Celery: http://celeryproject.org, http://ask.github.com/celery/, http://ask.github.com/celery/tutorials/external.html
      • Message queues: http://amqp.org, http://bit.ly/amqp_intro, http://rabbitmq.com/faq.html
  • Thank you! Rich Leland, Discovery Creative. @richleland, richard_leland@discovery.com, http://creative.discovery.com