TellApart uses the gevent library to build highly concurrent and asynchronous Python applications. Gevent allows synchronous Python code to run asynchronously by using greenlets. It monkey-patches blocking libraries so that when blocking calls occur, other greenlets can run instead of the entire process blocking. This allows TellApart to build applications like their front-end server and database proxy that can handle millions of requests per day with low latency using a single process per core.
2. TellApart’s Infrastructure Overview
• Millions of daily active users
• Page-views across multiple sites
• Real-Time Bidding integration
  - Very high volume, low latency
  - Response time: 50th percentile: 17 ms, 95th percentile: 50 ms
• All requests require user data
• Entirely Amazon Web Services (AWS), in 2 parallel regions
3. What is gevent?
gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
• Essentially, allows normally synchronous code to run asynchronously
4. What is gevent?
lib·e·vent (ˈlib-i-ˈvent): efficient cross-platform library for executing callbacks when specific events occur or a timeout has been reached. Includes several networking libraries (e.g. DNS, HTTP)
green·let (ˈgrēn-lət): lightweight co-routines for in-process concurrent programming. Ported from Stackless Python as a library for the CPython interpreter
5. How does gevent work?
• One gevent “hub” per process
• Monkey-patch blocking libraries
  - socket, thread, select, etc.
• Use greenlets like threads
• Blocking calls switch to another (ready) greenlet
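These mechanics can be sketched in a few lines. This is an illustrative example, not code from the deck; it assumes gevent is installed, and the worker names and delays are made up:

```python
# Monkey-patch the standard library first, so socket, time.sleep, select,
# etc. become cooperative instead of blocking the whole process.
from gevent import monkey
monkey.patch_all()

import time
import gevent

results = []

def worker(name, delay):
    # time.sleep is patched: it parks this greenlet in the hub and lets
    # another ready greenlet run instead of blocking the process.
    time.sleep(delay)
    results.append(name)

start = time.time()
greenlets = [gevent.spawn(worker, "a", 0.2), gevent.spawn(worker, "b", 0.1)]
gevent.joinall(greenlets)
elapsed = time.time() - start

# The two sleeps overlap, so total wall time is roughly the longer sleep
# (~0.2 s, not 0.3 s), and the shorter sleep finishes first.
print(results)  # ['b', 'a']
```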
7. Example Server
• Server implementation is the same
• DB lookup blocks on network IO
• With gevent, greenlet gets swapped out so another request can be served
• When the DB request finishes, the greenlet will continue where it left off
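A toy version of such a server (a sketch, not TellApart’s implementation: `fetch_user` is a hypothetical stand-in for a real database client, and the sleep simulates the network IO it blocks on):

```python
from gevent import monkey
monkey.patch_all()

import time
import urllib.request
import gevent
from gevent.pywsgi import WSGIServer

def fetch_user(user_id):
    # Stand-in for a database lookup that blocks on network IO; while this
    # greenlet waits, the hub serves other requests in the same process.
    time.sleep(0.05)
    return {"id": user_id}

def app(environ, start_response):
    user = fetch_user(environ["PATH_INFO"].lstrip("/"))
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [user["id"].encode()]

server = WSGIServer(("127.0.0.1", 0), app, log=None)
server.start()

# Two concurrent clients: their DB waits overlap inside the one process.
url = "http://127.0.0.1:%d/" % server.server_port
jobs = [gevent.spawn(lambda uid: urllib.request.urlopen(url + uid).read(), u)
        for u in ("alice", "bob")]
gevent.joinall(jobs)
bodies = [j.value for j in jobs]
server.stop()
print(bodies)  # [b'alice', b'bob']
```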
8. Advantages
• Write code as though it were synchronous (mostly)
  - No ‘callback spaghetti’ like with a callback framework
  - Exact same code can run synchronously (e.g. unit tests)
• Greenlets are very lightweight
  - 100’s or 1000’s can run concurrently
  - No context switch
    o Same order of magnitude as a function call
  - No GIL-related performance issues
• Co-operative concurrency makes synchronization easy
  - Greenlets cannot be preempted
  - No need for in-process atomic locks
  - Often eliminates the need for synchronization
    o As long as there are no blocking calls in the critical section
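The “no locks needed” point can be shown with a shared counter (a made-up example; under preemptive threads this read-modify-write would need a lock):

```python
from gevent import monkey
monkey.patch_all()

import gevent

counter = {"value": 0}

def bump(times):
    for _ in range(times):
        # Read-modify-write with no lock: greenlets are never preempted,
        # so this critical section is atomic as long as it makes no
        # blocking calls.
        counter["value"] += 1

gevent.joinall([gevent.spawn(bump, 1000) for _ in range(10)])
print(counter["value"])  # 10000
```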
9. Advantages (continued)
• gevent is fast
  - Very thorough set of benchmarks by Nicholas Piël
    http://nichol.as/benchmark-of-python-web-servers
    “And then there is Gevent [...] […] if you want to dive into high performance websockets with lots of concurrent connections you really have to go with an asynchronous framework. Gevent seems like the perfect companion for that, at least that is what we are going to use.”
10. Problems
• Monkey-patching
  - Doesn’t play well with C extensions
    o Blocking code in C libraries will cause the process to block
  - Can confuse some libraries
    o e.g. thread-local storage
• Breaks analysis tools
  - cProfile produces garbage
  - Alternative tools available
    o gevent-profiler (Meebo)
    o gevent_request_profiler (TellApart)
• Co-operative scheduling
  - Rogue greenlets can tie up the entire process
    o e.g. CPU-bound background worker
  - Long-running tasks have to periodically yield
11. Problems
• Same server as before
• Processing in loop can take long
• Can hurt latency of other requests
• Add ‘gevent.sleep(0)’ to loop
• Allows other greenlets to run
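The `gevent.sleep(0)` fix, sketched (illustrative code; `cpu_bound` is a hypothetical stand-in for the long-running loop, `handler` for another request):

```python
from gevent import monkey
monkey.patch_all()

import gevent

order = []

def cpu_bound():
    # Without the sleep(0), this loop would run to completion before any
    # other greenlet got a turn: greenlets are never preempted.
    for i in range(3):
        order.append("cpu-%d" % i)
        gevent.sleep(0)  # yield to the hub so other ready greenlets run

def handler():
    order.append("request")

gevent.joinall([gevent.spawn(cpu_bound), gevent.spawn(handler)])

# 'request' is served while the loop is still running, not after it.
print(order)
```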
12. Uses
• We use gevent everywhere we use Python
• TellApart Front End (TAFE)
  - gevent WSGI server with a micro-framework
  - One process per core
  - Nginx reverse-proxy in front
• Database Proxy (moxie)
  - Thrift service
  - Connection pooling across clients
  - Minimal additional latency (~2 ms)
13. Case Study – Taba
• Taba is a distributed Event Aggregation Service
• Provides near real-time metrics from across a cluster
• At TellApart:
  - 10,000 individual Tabs
  - 100’s of event source clients
  - 20,000,000 events / minute
  - 25 seconds latency from real-time
14. Case Study – Taba
• Implement timeouts very easily
• Function doesn’t need to know it’s being timed
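One way to get this behavior is gevent’s `Timeout` (a sketch, not the deck’s actual code; `slow_lookup` is hypothetical). The timeout is raised inside the blocked greenlet when the deadline passes, so any cooperative call can be bounded from the outside:

```python
from gevent import monkey
monkey.patch_all()

import time
from gevent import Timeout

def slow_lookup():
    # This function knows nothing about timeouts; it just blocks.
    time.sleep(10)
    return "data"

try:
    # If the block doesn't finish within 0.1 s, Timeout is raised in it.
    with Timeout(0.1):
        result = slow_lookup()
except Timeout:
    result = "timed out"

print(result)  # timed out
```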
15. Case Study – Taba
• Perform simultaneous lookups to a sharded database
• No thread pools
• No need for locking
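A sketch of the fan-out pattern (illustrative; `lookup_shard` is a made-up stand-in for a network call to one shard):

```python
from gevent import monkey
monkey.patch_all()

import time
import gevent

def lookup_shard(shard_id, key):
    # Stand-in for a network round-trip to one database shard.
    time.sleep(0.05)
    return (shard_id, "value-for-%s" % key)

start = time.time()
jobs = [gevent.spawn(lookup_shard, shard, "user:42") for shard in range(4)]
gevent.joinall(jobs)
results = dict(job.value for job in jobs)
elapsed = time.time() - start

# All four lookups overlap: wall time is about one round-trip, not four,
# and no thread pool or locking was needed to collect the results.
print(len(results))  # 4
```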
16. Case Study – Taba
• Streaming from DB in batches
• No thread pool
• Trivial synchronization
• Process data while the next batch is retrieved
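The prefetch pattern can be sketched like this (illustrative; `fetch_batch` and the batch data are made up, and real processing would replace the `extend`):

```python
from gevent import monkey
monkey.patch_all()

import time
import gevent

BATCHES = [[1, 2], [3, 4], [5, 6]]

def fetch_batch(n):
    # Stand-in for retrieving one batch from the database over the network.
    time.sleep(0.05)
    return BATCHES[n] if n < len(BATCHES) else None

processed = []

# Kick off the first fetch, then always fetch batch n+1 while batch n is
# being processed: one extra greenlet, no thread pool, no locks.
pending = gevent.spawn(fetch_batch, 0)
n = 0
while True:
    batch = pending.get()          # wait for the batch in flight
    if batch is None:
        break
    n += 1
    pending = gevent.spawn(fetch_batch, n)  # prefetch the next batch
    processed.extend(batch)                 # process the current one

print(processed)  # [1, 2, 3, 4, 5, 6]
```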
17. Thank you!
Kevin Ballard
kevin(at)tellapart(dot)com