An Introduction to Celery
Given to PyWeb-IL 8 on 29th Sept 2009

Presentation Transcript

  • Celery: A Distributed Task Queue. Idan Gazit, PyWeb-IL 8 / 29th September 2009
  • What is Celery?
  • Celery is a... Distributed Asynchronous Task Queue For Django
  • Celery is a... Distributed Asynchronous Task Queue For Django (since 0.8: not only Django)
  • What can I use it for? http://www.flickr.com/photos/jabzg/2145312172/
  • Potential Uses » Anything that needs to run asynchronously, i.e. outside of the request-response cycle. » Background computation of ‘expensive queries’ (ex. denormalized counts) » Interactions with external APIs (ex. Twitter) » Periodic tasks (instead of cron & scripts) » Long-running actions with results displayed via AJAX.
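    For example, the "long-running action with results via AJAX" case often looks something like this sketch. GenerateReportTask, the view name and the JSON shape are hypothetical; .delay() comes from the slides, and the async result's task_id attribute is assumed from the celery API of that era:

        # views.py (sketch) -- kick the work out of the request-response cycle.
        from django.http import HttpResponse
        from django.utils import simplejson
        from myapp.tasks import GenerateReportTask   # hypothetical task

        def start_report(request):
            result = GenerateReportTask.delay(request.user.id)
            # Hand the task id back immediately; the page polls for the result.
            return HttpResponse(simplejson.dumps({"task_id": result.task_id}),
                                mimetype="application/json")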
  • How does it work? http://www.flickr.com/photos/tomypelluz/14638999/
  • Celery Architecture: user → AMQP broker → celery workers → task result store
  • Celery Architecture: the user submits tasks, task sets, periodic tasks and retryable tasks
  • Celery Architecture: the AMQP broker pushes tasks to the celery worker(s)
  • Celery Architecture: celery workers execute tasks in parallel (multiprocessing)
  • Celery Architecture: the task result (tombstone) is written to the task result store: RDBMS, memcached, Tokyo Tyrant, MongoDB, or AMQP (new in 0.8)
  • Celery Architecture: the user reads the task result from the result store
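    Which store the tombstones land in is a configuration choice. A minimal sketch, assuming the 0.8-era setting name CELERY_BACKEND (later releases renamed it CELERY_RESULT_BACKEND; check the docs for your version):

        # settings.py (sketch) -- choose where task results (tombstones) are kept.
        # The setting name and backend aliases below are assumptions based on
        # 0.8-era docs; newer versions use CELERY_RESULT_BACKEND.
        CELERY_BACKEND = "database"   # store results via the Django ORM (RDBMS)
        # CELERY_BACKEND = "cache"    # e.g. memcached
        # CELERY_BACKEND = "amqp"     # result sent back over AMQP (new in 0.8)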
  • Celery Architecture: Celery uses... Carrot to talk to... an AMQP broker
  • Celery Architecture: Celery (pip install celery) → Carrot (dependency of celery) → AMQP broker
  • Celery Architecture: Celery (pip install celery) → Carrot (dependency of celery) → RabbitMQ (http://www.rabbitmq.com)
  • AMQP is... Complex.
  • AMQP is Complex » VHost » Exchanges (Direct, Fanout, Topic) » Queues (Durable, Temporary, Auto-Delete) » Routing Keys » Bindings
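    To make those terms concrete, here is roughly what talking to the broker through Carrot directly looks like. A sketch modelled on Carrot's README-style usage; the exchange, queue and routing-key names are made up, and Celery normally hides all of this from you:

        # Publish and consume one message with Carrot directly (sketch).
        from carrot.connection import BrokerConnection
        from carrot.messaging import Publisher, Consumer

        conn = BrokerConnection(hostname="localhost", port=5672,
                                userid="myuser", password="mypassword",
                                virtual_host="myvhost")

        # A message is published to an exchange with a routing key...
        publisher = Publisher(connection=conn, exchange="demo",
                              routing_key="demo.fetch")
        publisher.send({"screen_name": "idangazit"})
        publisher.close()

        # ...and a queue bound to that exchange/routing key is where it ends up.
        consumer = Consumer(connection=conn, queue="demo_tasks",
                            exchange="demo", routing_key="demo.fetch")

        def handle(message_data, message):
            print message_data["screen_name"]
            message.ack()

        consumer.register_callback(handle)
        consumer.wait()  # blocks, dispatching messages to the callback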
  • bit.ly/amqp_intro
  • I Can Haz Celery?
  • Adding Celery
        1. get & install an AMQP broker (pay attention to vhosts, permissions)
        2. add Celery to INSTALLED_APPS
        3. add a few settings:
               AMQP_SERVER = "localhost"
               AMQP_PORT = 5672
               AMQP_USER = "myuser"
               AMQP_PASSWORD = "mypassword"
               AMQP_VHOST = "myvhost"
        4. manage.py syncdb
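    Step 2 is one line in settings.py; a minimal sketch of what the slide is describing:

        # settings.py (sketch) -- register the celery app so its models,
        # management commands and task autodiscovery are picked up.
        INSTALLED_APPS = (
            # ... your other apps ...
            "celery",
        )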
  • Celery Workers » Run at least 1 celery worker server » manage.py celeryd (--detach for production) » Can be on different machines » Celery guarantees that tasks are only executed once
  • Tasks
  • Tasks » Define tasks in your app » app_name/tasks.py » register & autodiscovery (like admin.py)
  • Task

        from celery.task import Task
        from celery.registry import tasks

        class FetchUserInfoTask(Task):
            def run(self, screen_name, **kwargs):
                logger = self.get_logger(**kwargs)
                try:
                    user = twitter.users.show(id=screen_name)
                    logger.debug("Successfully fetched {0}".format(screen_name))
                except TwitterError:
                    logger.error("Unable to fetch {0}: {1}".format(
                        screen_name, TwitterError))
                    raise
                return user

        tasks.register(FetchUserInfoTask)
  • Run It!

        >>> from myapp.tasks import FetchUserInfoTask
        >>> result = FetchUserInfoTask.delay('idangazit')
  • Task Result » result.ready() True if the task has finished » result.result the return value of the task, or the exception instance if the task failed » result.get() blocks until the task is complete, then returns the result or exception » result.successful() returns True/False indicating whether the task succeeded
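    A short sketch of how those fit together, reusing FetchUserInfoTask from the earlier slides (the printed values are illustrative):

        >>> result = FetchUserInfoTask.delay('idangazit')
        >>> result.ready()        # False while a worker is still running it
        False
        >>> user = result.get()   # blocks until the worker finishes
        >>> result.successful()
        True
        >>> result.result         # now available without blocking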
  • Why even check results?
  • Chained Tasks

        from celery.task import Task
        from celery.registry import tasks

        class FetchUserInfoTask(Task):
            def run(self, screen_name, **kwargs):
                logger = self.get_logger(**kwargs)
                try:
                    user = twitter.users.show(id=screen_name)
                    logger.debug("Successfully fetched {0}".format(screen_name))
                except TwitterError:
                    logger.error("Unable to fetch {0}: {1}".format(
                        screen_name, TwitterError))
                    raise
                else:
                    ProcessUserTask.delay(user)
                return user

        tasks.register(FetchUserInfoTask)
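    ProcessUserTask itself never appears in the deck; a hypothetical minimal version, just to close the chain:

        # myapp/tasks.py (sketch) -- the downstream task; the name and body are
        # hypothetical, only the pattern follows the slide.
        from celery.task import Task
        from celery.registry import tasks

        class ProcessUserTask(Task):
            def run(self, user, **kwargs):
                logger = self.get_logger(**kwargs)
                logger.debug("Processing user {0}".format(user))
                # ... store the user, update denormalized counts, etc. ...
                return user

        tasks.register(ProcessUserTask)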
  • Task Retries
  • Task Retries

        from celery.task import Task
        from celery.registry import tasks

        class FetchUserInfoTask(Task):
            default_retry_delay = 5 * 60  # retry in 5 minutes
            max_retries = 5

            def run(self, screen_name, **kwargs):
                logger = self.get_logger(**kwargs)
                try:
                    user = twitter.users.show(id=screen_name)
                    logger.debug("Successfully fetched {0}".format(screen_name))
                except TwitterError, exc:
                    self.retry(args=[screen_name], kwargs=kwargs, exc=exc)
                else:
                    ProcessUserTask.delay(user)
                return user

        tasks.register(FetchUserInfoTask)
  • Periodic Tasks
  • Periodic Tasks

        from celery.task import PeriodicTask
        from celery.registry import tasks
        from datetime import timedelta

        class FetchMentionsTask(PeriodicTask):
            run_every = timedelta(seconds=60)

            def run(self, **kwargs):
                logger = self.get_logger(**kwargs)
                mentions = twitter.statuses.mentions()
                for m in mentions:
                    ProcessMentionTask.delay(m)
                return len(mentions)

        tasks.register(FetchMentionsTask)
  • Task Sets
  • Task Sets

        >>> from myapp.tasks import FetchUserInfoTask
        >>> from celery.task import TaskSet
        >>> ts = TaskSet(FetchUserInfoTask, args=[
        ...     (['ahikman'], {}),
        ...     (['idangazit'], {}),
        ...     (['daonb'], {}),
        ...     (['dibau_naum_h'], {}),
        ... ])
        >>> ts_result = ts.run()
        >>> list_of_return_values = ts_result.join()
  • MOAR SELRY!
  • Celery.Views
  • Celery.Views » Celery ships with some Django views for launching / getting the status of tasks. » JSON views, perfect for use in your AJAX (err, AJAJ) calls. » celery.views.apply(request, task_name, *args) » celery.views.is_task_done(request, task_id) » celery.views.task_status(request, task_id)
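    Wiring those views into a URLconf might look like this (a sketch using Django 1.1-era URL syntax; the URL patterns themselves are made up, only the view names come from the slide):

        # urls.py (sketch)
        from django.conf.urls.defaults import patterns, url

        urlpatterns = patterns('',
            url(r'^tasks/apply/(?P<task_name>.+)/$', 'celery.views.apply'),
            url(r'^tasks/(?P<task_id>[\w-]+)/status/$', 'celery.views.task_status'),
            url(r'^tasks/(?P<task_id>[\w-]+)/done/$', 'celery.views.is_task_done'),
        )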
  • Routable Tasks
  • Routable Tasks » "I want tasks of type X to only execute on this specific server" » Some extra settings in settings.py:

        CELERY_AMQP_EXCHANGE = "tasks"
        CELERY_AMQP_PUBLISHER_ROUTING_KEY = "task.regular"
        CELERY_AMQP_EXCHANGE_TYPE = "topic"
        CELERY_AMQP_CONSUMER_QUEUE = "foo_tasks"
        CELERY_AMQP_CONSUMER_ROUTING_KEY = "foo.#"

    » set the task's routing key:

        class MyRoutableTask(Task):
            routing_key = 'foo.bars'
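    Dispatching needs nothing extra (a sketch): the task publishes with routing key 'foo.bars', the topic exchange matches it against the 'foo.#' binding, so only workers consuming the foo_tasks queue on that server execute it.

        >>> from myapp.tasks import MyRoutableTask
        >>> result = MyRoutableTask.delay()   # routed via 'foo.bars' -> 'foo.#'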
  • like django, it's just python.
  • Support #celery on freenode http://groups.google.com/group/celery-users/ AskSol (the author) is friendly & helpful
  • Fin. @idangazit idan@pixane.com