Scaling Django with gevent
 

Usage Rights

CC Attribution License

    Scaling Django with gevent: Presentation Transcript

    • Scaling Django with Gevent
      Mahendra M, @mahendra, https://github.com/mahendra
    • @mahendra
      ● Python developer for 6 years
      ● FOSS enthusiast/volunteer for 14 years
        ● Bangalore LUG and Infosys LUG
        ● FOSS.in and LinuxBangalore/200x
      ● Gevent user for 1 year
      ● Twisted user for 5 years (before migrating)
        ● Added twisted support to libraries like mustaine
    • Concurrency models
      ● Multi-Process
      ● Threads
      ● Event driven
      ● Coroutines
    • Process/Thread
      [Diagram: a request reaches dispatch(), which hands it to one of worker_1() … worker_n(); a worker runs read(fp), db_rd(), db_wr() and sock_wr() in sequence.]
    • Process/Thread
      ● There are blocking sections in the code
      ● Python GIL is an issue in thread-based concurrency
    • Event driven
      [Diagram: events event_1, event_2 … event_n are posted (ev()) to block_on_events(), which calls the matching handler hdler_1(), hdler_2() … hdler_n().]
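    • Event driven web server
      [Diagram: event_loop() dispatches each event to its handlers, and every step except the last calls reg() to register for the next event.]
      ● request: open(fp), reg()
      ● opened: parse(), read_sql(), reg()
      ● sql_read: wri_sql(), reg()
      ● sql_writ: sock_wr(), reg()
      ● responded: close()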
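
The register-then-dispatch pattern on the event-driven slides can be sketched with Python's standard `selectors` module. This is only an illustration, not Twisted's or libevent's implementation; the names `run_event_loop` and `on_readable` and the request bytes are made up for the example.

```python
import selectors
import socket


def run_event_loop():
    """Register a handler for one event, block on events, then dispatch."""
    sel = selectors.DefaultSelector()      # uses epoll/kqueue/select as available
    client, server = socket.socketpair()   # stand-in for a real client connection
    server.setblocking(False)
    received = []

    def on_readable(sock):
        # Handler: called only after the loop reports the socket as readable.
        received.append(sock.recv(1024))
        sel.unregister(sock)               # nothing more to wait for

    # reg(): tie the handler to the "socket is readable" event.
    sel.register(server, selectors.EVENT_READ, on_readable)

    client.sendall(b"GET / HTTP/1.0\r\n\r\n")   # this posts the event

    # event_loop(): block until something is ready, then call its handler.
    while sel.get_map():
        for key, _mask in sel.select(timeout=1):
            key.data(key.fileobj)

    sel.close()
    client.close()
    server.close()
    return received


print(run_event_loop())
```

The handler never blocks: it runs only once data is known to be waiting, which is what lets a single thread serve many connections.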
    • Two years back
      ● Using python twisted for half of our products
      ● Using django for the other half
      ● Quite a nightmare
    • Python twisted
      ● An event driven library (very scalable)
      ● Using epoll or kqueue
      [Diagram: Client → Nginx (SSL & LB) → Server 1, Server 2 … Server N; the servers run Proc 1 (:8080), Proc 2 (:8080) … Proc N (:8080).]
    • Gevent
      A coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
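    • Coroutines
      ● Python coroutines are very similar to generators.

        def abc( seq ):
            lst = list( seq )
            for i in lst:
                value = yield i
                if value is not None:
                    lst.append( value )

        r = abc( [1,2,3] )
        next( r )      # advance to the first yield; a fresh generator cannot take send()
        r.send( 4 )    # resume with value = 4, which gets appended to lst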
    • Gevent features
      ● Fast event-loop based on libevent (epoll, kqueue etc.)
      ● Lightweight execution units based on greenlets (coroutines)
      ● Monkey patching support
      ● Simple API
      ● Fast WSGI server
    • Greenlets
      ● Primitive notion of micro-threads with no implicit scheduling
      ● Just co-routines or independent pseudo-threads
      ● Other systems like gevent build micro-threads on top of greenlets.
      ● Execution happens by switching execution among greenlet stacks
      ● Greenlet switching is not implicit (switch())
    • Greenlet execution
      [Diagram: control switches back and forth between the main greenlet and a child greenlet. Labels: pause(), abc(), func_1(), pause(), some(), reg(), func_2().]
    • Greenlet code

        from greenlet import greenlet

        def test1():
            gr2.switch()    # pause test1 and run gr2

        def test2():
            gr1.switch()    # resume gr1 where it paused

        gr1 = greenlet(test1)
        gr2 = greenlet(test2)
        gr1.switch()        # start gr1 from the main greenlet
    • How does gevent work
      ● Creates an implicit event loop inside a dedicated greenlet
      ● When a function in gevent wants to block, it switches to the greenlet of the event loop. This will schedule another child greenlet to run
      ● The event loop automatically picks up the fastest polling mechanism available in the system
      ● One event loop runs inside a single OS thread (process)
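
The idea of a "blocking" call handing control to a loop, which then resumes another task, can be mimicked with plain generators. This is only a toy: gevent uses greenlets and wakes a task when its I/O is ready, whereas this sketch simply round-robins. The names `run_all` and `worker` are hypothetical.

```python
from collections import deque


def run_all(tasks):
    """Round-robin loop: resume each task until it yields (i.e. 'blocks')."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)           # run the task up to its next yield
        except StopIteration:
            continue             # task finished; do not reschedule it
        ready.append(task)       # task yielded; let the others have a turn


def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield                    # stands in for a blocking call that switches to the loop


log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
print(log)   # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

Everything runs in one OS thread, and a task loses control only at the point where it chooses to yield.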
    • Gevent code

        import gevent
        from gevent import socket

        urls = ['www.google.com', 'www.example.com', 'www.python.org']
        # Resolve all hostnames concurrently, one greenlet per lookup.
        jobs = [gevent.spawn(socket.gethostbyname, url) for url in urls]
        gevent.joinall(jobs, timeout=2)
        [job.value for job in jobs]
        # Result shown on the slide:
        # ['74.125.79.106', '208.77.188.166', '82.94.164.162']
    • Gevent apis
      ● Greenlet management (spawn, timeout, schedule)
      ● Greenlet local data
      ● Networking (socket, ssl, dns, select)
      ● Synchronization
        ● Event: notify multiple listeners
        ● Queue: synchronized producer/consumer queues
        ● Locking: Semaphores
      ● Greenlet pools
      ● TCP/IP and WSGI servers
    • Gevent advantages
      ● Almost synchronous code. No callbacks and deferreds
      ● Lightweight greenlets
      ● Good concurrency
      ● No issues with the python GIL
      ● No need for in-process locking, since a greenlet cannot be pre-empted
    • Gevent issues
      ● A greenlet will run till it blocks or switches
        ● Be wary of large/infinite loops
      ● Monkey patching is required for unsupported blocking libraries. Might not work well with some libraries
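
The "runs till it blocks or switches" hazard applies to any cooperative scheduler. The sketch below uses the standard library's `asyncio` instead of gevent so that it runs without extra packages; in gevent the usual way to yield inside a long loop is `gevent.sleep(0)`. The names `count` and `demo` are made up for the example.

```python
import asyncio


async def count(name, n, log, cooperative):
    for i in range(n):
        log.append(f"{name}:{i}")
        if cooperative:
            await asyncio.sleep(0)   # explicit yield point inside the loop


async def demo(cooperative):
    log = []
    await asyncio.gather(
        count("a", 2, log, cooperative),
        count("b", 2, log, cooperative),
    )
    return log


# Without a yield point, task "a" finishes before "b" gets to run at all.
print(asyncio.run(demo(False)))   # ['a:0', 'a:1', 'b:0', 'b:1']
# With a yield point in the loop, the two tasks take turns.
print(asyncio.run(demo(True)))    # ['a:0', 'b:0', 'a:1', 'b:1']
```

A long CPU-bound loop with no yield point starves every other task in the same way, whatever the library.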
    • Our django dream
      ● We love django
      ● I like twisted, but love django more
        ● Coding complexity
        ● Lack of developers for hire
        ● Deployment complexity
      ● Gevent saved the day
    • The Django Problem
      ● In an HTTP request cycle, we wanted the following operations
        ● Fetch some metadata for an item being sold
        ● Purchase the item for the user in the billing system
        ● Fetch ads to be shown along with the item
        ● Fetch recommendations based on this item
      ● In parallel … !!
        ● Twisted was the only option
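    • Twisted code

        from twisted.internet.defer import DeferredList
        from twisted.web.server import NOT_DONE_YET

        def handle_purchase( rqst ):
            defs = []
            defs.append( biller() )
            defs.append( ads() )
            defs.append( recos() )
            defs.append( meta() )
            # "def" is a reserved word, so the list is named dl here
            dl = DeferredList( defs, … )
            # pass the callback itself; do not call it
            dl.addCallback( send_response )
            return NOT_DONE_YET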
    • Twisted issues
      ● The issues were with everything else
        ● Header management
        ● Templates for response
        ● ORM support
        ● SOAP, REST, Hessian/Burlap support
          ● We liked to use suds, requests, mustaine etc.
        ● Session management and auth
        ● Caching support
      ● The above are django's strengths
        ● Django's vibrant eco-system (celery, south, tastypie)
    • gunicorn
      ● A python WSGI HTTP server
      ● Supports running code under worker, eventlet, gevent etc.
        ● Uses monkey patching
      ● Excellent django support
        ● gunicorn_django app.settings
      ● Enabled gevent support for our app by default without any code changes
      ● Spawns and manages worker processes and distributes load amongst them
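
gunicorn can read its settings from a Python config file passed with `-c`. The fragment below is a minimal sketch of selecting the gevent worker that way. The file name and all values are assumptions for illustration, not the settings used in the talk.

```python
# gunicorn.conf.py (hypothetical file name and values; tune for your deployment)
bind = "127.0.0.1:8080"      # address the front-end proxy (e.g. nginx) forwards to
workers = 4                  # number of worker processes gunicorn spawns and manages
worker_class = "gevent"      # run each worker on gevent instead of the default sync worker
worker_connections = 1000    # maximum simultaneous clients per gevent worker
```

Because the worker class is chosen in configuration, the application code does not need to mention gevent at all.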
    • Migrating our products

        import gevent

        def handle_purchase( request ):
            jobs = []
            jobs.append( gevent.spawn( biller, … ) )
            jobs.append( gevent.spawn( ads, … ) )
            jobs.append( gevent.spawn( meta, … ) )
            jobs.append( gevent.spawn( reco, … ) )
            # wait for all four greenlets to finish
            gevent.joinall( jobs )
    • Migrating our products
      ● Migrating our entire code base (2 products) took around 1 week to finish
      ● Was easier because we were already using twisted's inlineCallbacks() decorator
      ● Only small parts of our code had to be migrated
    • Deployment
      [Diagram: Client → Nginx (SSL & LB) → Gunicorn 1, Gunicorn 2 … Gunicorn N, running Proc 1, Proc 2 … Proc N.]
    • Life today
      ● Single framework for all 4 products
      ● Use django's awesome features and ecosystem
      ● Increased scalability. More so with celery.
      ● Use blocking python libraries without worrying too much
      ● No more usage of python-twisted
      ● Coding, testing and maintenance are much easier
      ● We are hiring!!
    • Links
      ● http://greenlet.readthedocs.org/en/latest/index.html
      ● http://www.gevent.org/
      ● http://in.pycon.org/2010/talks/48-twisted-programming