Scaling up task processing with Celery
Stockholm Python User Group – May 7th 2014
Hej, my name is Nicolas Grasset, and I work here at Lifesum.
What is Celery?
“Celery is an asynchronous task queue based on distributed message passing”
Celery runs on Python (2.5, 2.6, 2.7, 3.2, 3.3), PyPy (1.8, 1.9) and Jython (2.5, 2.7).
It is maintained by Ask Solem, supported by Pivotal, and released under the BSD License.
@user_required
def verify_in_app_purchase(request):
    """
    API to verify in-app purchase against iTunes
    """
    receipt = request.POST.get('receipt', None)

    if not receipt:
        raise Http404

    # Warning, this takes 2sec on average
    request.user.save_receipt_if_valid(receipt)

    return Response({'thank': 'you'})
Why task queues?
Task queues: move the slow call out of the request/response cycle.
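A minimal sketch of how that view could hand the slow receipt check to a worker. The task name, module layout and use of get_user_model are assumptions for illustration, not code from the talk:

# tasks.py (hypothetical module)
from celery.task import task
from django.contrib.auth import get_user_model

@task(ignore_result=True)
def verify_receipt(user_id, receipt):
    # Runs in a Celery worker, so the ~2 second iTunes round-trip
    # no longer blocks the web request.
    user = get_user_model().objects.get(pk=user_id)
    user.save_receipt_if_valid(receipt)

# views.py
@user_required
def verify_in_app_purchase(request):
    receipt = request.POST.get('receipt', None)
    if not receipt:
        raise Http404
    # Enqueue and return immediately instead of waiting for iTunes.
    verify_receipt.delay(request.user.pk, receipt)
    return Response({'thank': 'you'})

The trade-off: the client now gets its acknowledgement before the receipt has actually been verified, so failures have to be handled out of band.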
Distributed?
[Diagram: several producers publish to one or more brokers, which feed a pool of workers]
…for High Availability and speed
[Diagram: Flask + Celery, Django + Celery and Celery + Celery producers publishing to Redis and RabbitMQ brokers, consumed by a pool of Celery workers]
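If you do run more than one broker for availability, Celery can be given several broker URLs and will fail over between them (see the broker failover strategy in the docs for your version). A sketch with purely illustrative hostnames:

from celery import Celery

# Semicolon-separated broker URLs enable automatic failover between brokers.
app = Celery('hello',
             broker='amqp://guest@rabbit1//;amqp://guest@rabbit2//')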
Celery is simple
from celery import Celery

app = Celery('hello', broker='amqp://guest@localhost//')

@app.task(ignore_result=True)
def hello():
    return 'hello world'

hello.delay()
10 things I learnt on the way
1. Start simple
Single broker. Single queue
aptitude install rabbitmq-server
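With RabbitMQ running on its defaults, a single worker pointed at the app module is all you need to start (the module name tasks here is an assumption):

python -m celery worker --app=tasks --loglevel=info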
2. Watch for the results
from celery import Celery

app = Celery('hello', broker='amqp://guest@localhost//')

@app.task(ignore_result=True)
def hello():
    return 'hello world'

# Without ignore_result=True every return value is written to the
# result backend -- and if nothing ever reads it, it just piles up.
@app.task
def hello_memory_usage():
    return 'hello world'
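If you do need results, at least let them expire. A sketch using Celery 3.x setting names (values are illustrative):

# settings.py
CELERY_IGNORE_RESULT = True          # drop results by default
CELERY_TASK_RESULT_EXPIRES = 3600    # stored results expire after an hour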
3. Monitor it!
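The slides don't name a tool, but Flower and the built-in inspection commands are the usual choices; a sketch, assuming the same broker URL and a tasks module:

pip install flower
celery flower --broker=amqp://guest@localhost//    # web dashboard on port 5555

# Ad-hoc checks against running workers:
python -m celery --app=tasks inspect active
python -m celery --app=tasks events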
4. Consume tasks faster than you produce them.
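On RabbitMQ you can verify this by watching queue depth over time; if the backlog only ever grows, you need more (or faster) consumers. 'celery' is the default queue name:

rabbitmqctl list_queues name messages messages_ready messages_unacknowledged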
5. Tweaking for concurrency
# -E emits task events (for monitoring); --time-limit kills runaway tasks.
# eventlet pool: high concurrency for I/O-bound tasks
python -m celery worker -E --time-limit=300 -c 64 -P eventlet
# prefork pool: one process per slot, better for CPU-bound work
python -m celery worker -E --time-limit=300 -c 8 -P prefork
# autoscale between 3 and 10 processes, recycle each after 10000 tasks
python -m celery worker -E --time-limit=300 --autoscale=10,3 --maxtasksperchild=10000

# With the eventlet pool, blocking libraries must be monkey-patched:
import eventlet

eventlet.monkey_patch()
eventlet.monkey_patch(MySQLdb=True)
6. Scale workers and producers in pairs
[Diagram: system architecture, March 2014 — *.lifesum.com behind an Elastic Load Balancer; ~2-16 auto-scaling instances each running nginx, the Python application and Celery workers; MySQL on RDS with fail-over plus a read replica; Redis cluster for cache; RabbitMQ; Sphinx Search; ElasticSearch; Amazon CloudFront and S3 storage; Sendgrid.com for email delivery; getPusher.com for socket communication; Parse.com for mobile push notifications; clients are iPhone apps, Android apps and web users]
7. Keep your tasks clean
from celery.task import task

# The task stays a thin wrapper; the real logic lives in a plain
# function that can be called (and tested) without Celery.
@task(ignore_result=True)
def will_take_some_time(list_of_users, sometimes=False):
    my_favorite_utils(list_of_users, sometimes)
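Because the task body is just a call into my_favorite_utils, the same code path can be exercised with or without Celery; a small usage sketch:

# Asynchronously, via the broker:
will_take_some_time.delay([1, 2, 3], sometimes=True)

# Synchronously, in the current process (handy in tests or a shell):
will_take_some_time([1, 2, 3], sometimes=True)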
8. Manage DB transactions
from django.db import transaction
# djcelery-transactions delays .delay() calls until the surrounding
# database transaction commits, so workers never see uncommitted rows.
from djcelery_transactions import task
from models import Message

@task(ignore_result=True)
def mark_messages_as_read(user_id, day):
    with transaction.commit_on_success():
        Message.objects.filter(user_id=user_id, day=day).update(read=True)


@task(ignore_result=True)
def create_messages(list_of_users, day, message):
    with transaction.commit_on_success():
        for user_id in list_of_users:
            Message.objects.create(user_id=user_id, day=day, text=message)
            mark_messages_as_read.delay(user_id, day)
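The failure mode this avoids: with the plain celery.task decorator, the .delay() issued inside the transaction can reach a worker before the INSERT commits, and the worker's UPDATE then matches nothing. An illustrative ordering:

# 1. web process: BEGIN; INSERT message; mark_messages_as_read.delay(...)
# 2. worker:      picks up the task, runs UPDATE ... WHERE user_id=... AND day=...
#                 -> matches 0 rows, the INSERT is not committed yet
# 3. web process: COMMIT (too late, the task has already run)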
9. Replace your cron jobs
from celery.schedules import crontab

CELERYBEAT_SCHEDULE = {
    # Executes every Monday morning at 7:30 A.M.
    'add-every-monday-morning': {
        'task': 'tasks.add',
        'schedule': crontab(hour=7, minute=30, day_of_week=1),
        'args': (16, 16),
    },
}

# -B embeds the beat scheduler in the worker process
python -m celery worker -B
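The embedded -B scheduler is convenient in development; in production the usual advice is to run exactly one dedicated beat process, so the schedule is never executed by two schedulers at once:

python -m celery beat      # exactly one instance
python -m celery worker    # as many workers as you need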
10. Locks
LOCK_EXPIRE = 60 * 5  # Lock expires in 5 minutes

@task(ignore_result=True)
def delete_messages(user_id):

    lock_id = 'lock-mark_messages_as_read-{0}'.format(user_id)
    # cache.add only sets the key if it is absent, so on memcached it acts
    # as an atomic acquire; the timeout keeps a crashed worker from
    # holding the lock forever.
    acquire_lock = lambda: cache.add(lock_id, 'true', LOCK_EXPIRE)
    release_lock = lambda: cache.delete(lock_id)

    if acquire_lock():
        try:
            Message.objects.filter(user_id=user_id).delete()
        finally:
            release_lock()
        return True

    logger.warning("Lock was not acquired!")
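Callers are unchanged; a second delivery of the same user's task that runs while the first is still executing finds the lock held, logs the warning and does nothing:

delete_messages.delay(user_id=42)
delete_messages.delay(user_id=42)  # skipped if it runs while the first still holds the lock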
Thanks! Reach me by email at nicolas@lifesum.com or on Twitter @fellowshipofone
