gevent at TellApart


Published in: Technology


  1. Kevin Ballard, kevin(at)tellapart(dot)com. Image ©2003-2012 `DivineError
  2. TellApart’s Infrastructure Overview
     • Millions of daily active users
     • Page views across multiple sites
     • Real-Time Bidding integration
       - Very high volume, low latency
       - Response time: 50th percentile: 17 ms, 95th percentile: 50 ms
     • All requests require user data
     • Entirely Amazon Web Services (AWS), in 2 parallel regions
  3. What is gevent?
     gevent is a coroutine-based Python networking library that uses greenlet to provide a high-level synchronous API on top of the libevent event loop.
     • Essentially, allows normally synchronous code to run asynchronously
  4. What is gevent?
     lib·e·vent (ˈlib-i-ˈvent): efficient cross-platform library for executing callbacks when specific events occur or a timeout has been reached. Includes several networking libraries (e.g. DNS, HTTP)
     green·let (ˈgrēn-lət): lightweight coroutines for in-process concurrent programming. Ported from Stackless Python as a library for the CPython interpreter
  5. How does gevent work?
     • One gevent “hub” per process
     • Monkey-patch blocking libraries
       - socket, thread, select, etc.
     • Use greenlets like threads
     • Blocking calls switch to another (ready) greenlet
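The switching behaviour on this slide can be sketched in a few lines. This is a minimal illustration, not code from the talk: the `worker` function and its names are hypothetical, and `gevent.sleep` stands in for a blocking network call.

```python
import gevent

# In a real program, `from gevent import monkey; monkey.patch_all()` at
# startup makes the standard socket, thread, select, etc. modules cooperative.
# This sketch skips patching and uses gevent.sleep as the "blocking" call.

order = []  # records the order in which steps actually ran


def worker(name, delay):
    """Hypothetical task: start, block for `delay` seconds, finish."""
    order.append(f"{name} start")
    # Blocking here yields to the hub, which switches to another ready greenlet.
    gevent.sleep(delay)
    order.append(f"{name} done")


# Spawn greenlets much as one would start threads, then wait for both.
jobs = [gevent.spawn(worker, "slow", 0.05), gevent.spawn(worker, "fast", 0.01)]
gevent.joinall(jobs)
print(order)
```

Both workers start before either finishes, because the first one gives up control as soon as it blocks.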
  6. Example Server: mod_wsgi vs. gevent [code listings not preserved in transcript]
  7. Example Server
     • Server implementation is the same
     • DB lookup blocks on network IO
     • With gevent, greenlet gets swapped out so another request can be served
     • When the DB request finishes, the greenlet will continue where it left off
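The slide's original listings are not available, so the following is only a sketch of what such a server might look like. `lookup_user`, `application`, and `make_server` are hypothetical names, and the DB lookup is simulated with `gevent.sleep`; in real code it would be an ordinary socket call made cooperative by monkey-patching.

```python
import gevent
from gevent.pywsgi import WSGIServer


def lookup_user(user_id):
    """Hypothetical DB lookup; gevent.sleep simulates waiting on network IO."""
    gevent.sleep(0.01)  # the greenlet is swapped out here
    return {"id": user_id, "name": "user-%s" % user_id}


def application(environ, start_response):
    """Plain WSGI application; nothing in it is specific to gevent's server."""
    user_id = environ.get("QUERY_STRING", "") or "anonymous"
    user = lookup_user(user_id)  # resumes here once the lookup finishes
    body = ("Hello, %s\n" % user["name"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def make_server(port=8000):
    """Build (but do not start) a gevent WSGI server for the application."""
    return WSGIServer(("127.0.0.1", port), application)

# To serve requests: make_server().serve_forever()
```

While one request waits in `lookup_user`, the server is free to run the greenlets handling other requests.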
  8. Advantages
     • Write code as though it were synchronous (mostly)
       - No ‘callback spaghetti’ like with a callback framework
       - Exact same code can run synchronously (e.g. unit tests)
     • Greenlets are very lightweight
       - Hundreds or thousands can run concurrently
       - No context switch
         o Same order of magnitude as a function call
       - No GIL-related performance issues
     • Co-operative concurrency makes synchronization easy
       - Greenlets cannot be preempted
       - No need for in-process atomic locks
       - Often eliminates the need for synchronization
         o As long as there are no blocking calls in the critical section
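The point about lock-free critical sections can be illustrated with a shared counter. This is a sketch under the slide's stated condition (no blocking call inside the critical section); the `counter` dictionary and `increment` function are made up for the example.

```python
import gevent

counter = {"value": 0}  # shared state, deliberately unprotected by any lock


def increment(times):
    for _ in range(times):
        # Read-modify-write with no lock. This is safe because greenlets are
        # never preempted and nothing between the read and the write blocks.
        current = counter["value"]
        counter["value"] = current + 1
        # Yield only after the critical section is complete.
        gevent.sleep(0)


jobs = [gevent.spawn(increment, 100) for _ in range(50)]
gevent.joinall(jobs)
print(counter["value"])  # 5000
```

Moving the `gevent.sleep(0)` between the read and the write would reintroduce lost updates, which is exactly the caveat in the last bullet.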
  9. Advantages (continued)
     • gevent is fast
       - Very thorough set of benchmarks by Nicholas Piël: http://-of-python-web-servers
     “And then there is Gevent [...] […] if you want to dive into high performance websockets with lots of concurrent connections you really have to go with an asynchronous framework. Gevent seems like the perfect companion for that, at least that is what we are going to use.”
  10. Problems
      • Monkey-patching
        - Doesn’t play well with C extensions
          o Blocking code in C libraries will cause the process to block
        - Can confuse some libraries
          o e.g. thread-local storage
      • Breaks analysis tools
        - cProfile produces garbage
        - Alternative tools available
          o gevent-profiler (Meebo)
          o gevent_request_profiler (TellApart)
      • Co-operative scheduling
        - Rogue greenlets can tie up the entire process
          o e.g. CPU-bound background worker
        - Long-running tasks have to periodically yield
  11. Problems
      • Same server as before
      • Processing in loop can take long
      • Can hurt latency of other requests
      • Add ‘gevent.sleep(0)’ to loop
      • Allows other greenlets to run
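A sketch of the `gevent.sleep(0)` fix, assuming a CPU-bound loop running alongside a short request handler. `crunch`, `quick_request`, and the yield interval of 1000 iterations are illustrative choices, not taken from the talk.

```python
import gevent

events = []  # records which task finished first


def crunch(items):
    """CPU-bound loop that periodically yields to other greenlets."""
    total = 0
    for i, item in enumerate(items):
        total += item * item
        if i % 1000 == 999:
            # Without this, the loop would hold the process until it finished.
            gevent.sleep(0)
    events.append("crunch done")
    return total


def quick_request():
    """Stand-in for a short request that should not wait for crunch()."""
    events.append("request served")


worker = gevent.spawn(crunch, range(10000))
request = gevent.spawn(quick_request)
gevent.joinall([worker, request])
print(events)
```

The short request completes at the first yield point instead of waiting for the whole loop, which is the latency improvement the slide describes.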
  12. Uses
      • We use gevent everywhere we use Python
      • TellApart Front End (TAFE)
        - gevent WSGI server with a micro-framework
        - One process per core
        - Nginx reverse-proxy in front
      • Database Proxy (moxie)
        - Thrift service
        - Connection pooling across clients
        - Minimal additional latency (~2 ms)
  13. Case Study - Taba
      • Taba is a distributed Event Aggregation Service
      • Provides near real-time metrics from across a cluster
      • At TellApart:
        - 10,000 individual Tabs
        - Hundreds of event source clients
        - 20,000,000 events / minute
        - 25 seconds latency from real-time
  14. Case Study - Taba
      • Implement timeouts very easily
      • Function doesn’t need to know it’s being timed
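The slide's code is not in the transcript; one way to get this behaviour is gevent's `Timeout` context manager, sketched below. `slow_lookup` and `lookup_with_deadline` are hypothetical names, and Taba's actual implementation may differ.

```python
import gevent


def slow_lookup(key):
    """Hypothetical blocking call; it contains no timeout logic of its own."""
    gevent.sleep(0.05)  # simulated network wait
    return "value for %s" % key


def lookup_with_deadline(key, seconds, default=None):
    """Run slow_lookup, giving up and returning `default` after `seconds`."""
    result = default
    # Passing False as the second argument makes the timeout silent: the
    # with-block is simply abandoned when the deadline passes.
    with gevent.Timeout(seconds, False):
        result = slow_lookup(key)
    return result


print(lookup_with_deadline("a", 0.01, default="fallback"))
print(lookup_with_deadline("a", 1.0))
```

The timed function is unchanged; the deadline is imposed entirely by the caller.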
  15. Case Study - Taba
      • Perform simultaneous lookups to a sharded database
      • No thread pools
      • No need for locking
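A possible shape for the simultaneous sharded lookup, with one greenlet per shard. Everything here is an assumption made for illustration: the in-memory `SHARDS`, the `shard_for` hashing scheme, and the simulated round-trip are not Taba's real storage layer.

```python
import gevent

NUM_SHARDS = 4
SHARDS = [dict() for _ in range(NUM_SHARDS)]  # hypothetical in-memory shards


def shard_for(key):
    """Deterministic toy sharding function (an assumption, not Taba's)."""
    return sum(key.encode("utf-8")) % NUM_SHARDS


for name in ("alice", "bob", "carol", "dave"):
    SHARDS[shard_for(name)][name] = name.upper()


def lookup_on_shard(shard_id, keys):
    gevent.sleep(0.01)  # simulated network round-trip to one shard
    shard = SHARDS[shard_id]
    return {key: shard.get(key) for key in keys}


def multi_get(keys):
    """Look up all keys, querying every involved shard concurrently."""
    by_shard = {}
    for key in keys:
        by_shard.setdefault(shard_for(key), []).append(key)
    jobs = [gevent.spawn(lookup_on_shard, shard_id, shard_keys)
            for shard_id, shard_keys in by_shard.items()]
    gevent.joinall(jobs)
    results = {}
    for job in jobs:
        # No lock needed: only this greenlet touches `results`.
        results.update(job.get())
    return results


print(multi_get(["alice", "bob", "carol", "dave", "erin"]))
```

The shard round-trips overlap, so the total wait is roughly that of the slowest shard rather than the sum of all of them.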
  16. Case Study - Taba
      • Streaming from DB in batches
      • No thread pool
      • Trivial synchronization
      • Process data while the next batch is retrieved
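Batch streaming of this kind could be built from a producer greenlet and a bounded queue, as sketched here. `fetch_batch`, `producer`, and `stream_sum` are hypothetical, and the database is simulated with `gevent.sleep` and integer ranges.

```python
import gevent
from gevent.queue import Queue


def fetch_batch(offset, size, total):
    """Hypothetical DB read returning up to `size` rows starting at `offset`."""
    gevent.sleep(0.01)  # simulated network wait
    return list(range(offset, min(offset + size, total)))


def producer(queue, size, total):
    for offset in range(0, total, size):
        # put() blocks while the queue is full, so at most one fetched batch
        # is buffered ahead of the consumer.
        queue.put(fetch_batch(offset, size, total))
    queue.put(None)  # sentinel: no more batches


def stream_sum(size=100, total=1000):
    """Consume batches as they arrive while the next fetch is in flight."""
    queue = Queue(maxsize=1)
    gevent.spawn(producer, queue, size, total)
    running = 0
    while True:
        batch = queue.get()  # blocks until a batch (or the sentinel) arrives
        if batch is None:
            break
        running += sum(batch)  # "processing" step for this sketch
    return running


print(stream_sum())  # 499500
```

The queue is the only synchronization point: the producer's network wait for the next batch overlaps with the consumer's work on the current one.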
  17. Thank you! Kevin Ballard, kevin(at)tellapart(dot)com