THE JOURNEY OF ASYNCIO ADOPTION
IN INSTAGRAM
Jimmy Lai
in PyCon TW 2018
OUTLINE
1 What's asyncio?
2 Asyncio Adoption in Instagram
3 Q&A
ABOUT ME - JIMMY LAI
• Software Engineer in Instagram Infrastructure
• I like Python
• Recent interests: Python efficiency
  • profiling
  • Cython
  • asyncio
INSTAGRAM BACKEND
• Python + Django
• Serving with uwsgi
• Data fetching from backends
• Number of processes > number of CPUs
[Diagram: on one server, uwsgi runs multiple Django processes that share the CPU and shared memory; the processes fetch data from memcached, cassandra, thrift services, ...]
https://instagram-engineering.com/
BLOCKING I/O PROBLEMS
• Slow API: APIs take longer to finish, which is a bad user experience.
• CPU idle: the CPU sits idle while a process waits on I/O, and context switches between processes come with overhead.
• Harakiri: long-running requests get their process terminated (uwsgi Harakiri), and restarting a process has high overhead.
WHAT'S ASYNCIO
• Asynchronous I/O
• Running I/O concurrently
• Blocking IO mode
• Async IO mode
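• Simultaneous Exhibition (photo: https://rarehistoricalphotos.com/samuel-reshevsky-age-8-france-1920/)
[Figure: timelines of the two modes. In blocking IO mode, CPU and I/O segments follow one another along the time axis; in async IO mode, the I/O segments of several calls overlap.]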
ASYNCIO AS SOLUTION
• Slow API: APIs run faster and users get a better experience.
• CPU idle: in-thread context switches instead of process context switches.
• Harakiri: just cancel the pending async call. No need to kill the process.
MYTHS ABOUT ASYNCIO
1. Myth: asyncio is multi-process or parallel computing. In fact it is single-threaded.
  • Only one function can execute at a time.
  • Only I/O can run concurrently.
2. Myth: asyncio is always faster in terms of CPU and latency.
  • The overhead of the event loop and context switches can be significant.
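The first myth can be checked with a small sketch. The names worker and main are made up for this illustration; the example only records which thread runs each coroutine and in what order they start.

```python
import asyncio
import threading


async def worker(name, log):
    # Record which OS thread runs this coroutine, before and after an await.
    log.append((name, "start", threading.get_ident()))
    await asyncio.sleep(0.01)  # the point where control returns to the event loop
    log.append((name, "end", threading.get_ident()))


async def main():
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


log = asyncio.run(main())
thread_ids = {tid for _, _, tid in log}
print(len(thread_ids))                   # number of distinct threads used
print([(n, e) for n, e, _ in log])       # both workers start before either ends
```

Both workers run on the same thread; they only interleave at the await, which is where the waiting overlaps.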
CPYTHON ASYNCIO
• asyncio module became available starting in CPython 3.4
• Instagram used version 2.7 for a long time and migrated to 3.5 in 2017
ASYNC SYNTAX
• async def, await, coroutine

In [1]: async def sleep_and_return(sec):
   ...:     await asyncio.sleep(sec)
   ...:     return sec
   ...:

In [2]: sleep_and_return(1)
Out[2]: <coroutine object sleep_and_return at 0x10556ae60>
• run async function in event loop

In [3]: asyncio.get_event_loop().run_until_complete(sleep_and_return(1))
Out[3]: 1
• gather async functions to run IO concurrently

In [4]: async def run():
   ...:     results = await asyncio.gather(
   ...:         sleep_and_return(1),
   ...:         sleep_and_return(1),
   ...:         sleep_and_return(2),
   ...:     )
   ...:     print(results)
   ...:

In [5]: %timeit -r 1 asyncio.get_event_loop().run_until_complete(run())
[1, 1, 2]
[1, 1, 2]
2 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
gather() is the key to the latency win: the three calls above finish in about 2 s (the longest sleep) instead of 4 s (the sum of the sleeps).
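The latency win can be reproduced outside IPython with a short script. This is a sketch that reuses sleep_and_return from the slides; run_sequential, run_concurrent, and timed are helper names invented here, and the delays are shortened so the script finishes quickly.

```python
import asyncio
import time


async def sleep_and_return(sec):
    await asyncio.sleep(sec)
    return sec


async def run_sequential(delays):
    # Each await finishes before the next call starts.
    return [await sleep_and_return(d) for d in delays]


async def run_concurrent(delays):
    # gather() schedules all coroutines, so their waits overlap.
    return await asyncio.gather(*(sleep_and_return(d) for d in delays))


def timed(coro):
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


delays = [0.1, 0.1, 0.2]
seq_result, seq_elapsed = timed(run_sequential(delays))
con_result, con_elapsed = timed(run_concurrent(delays))
print(seq_result, round(seq_elapsed, 2))  # roughly the sum of the delays
print(con_result, round(con_elapsed, 2))  # roughly the longest delay
```

Both versions return the same results in the same order; only the elapsed time differs.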
HOW ASYNCIO WORKS?
• nonblocking I/O mode: socket.setblocking(False)
• register I/O with EpollSelector and wait for the I/O to become ready with select()
Source code is simplified for explanation purposes.

class BaseSelectorEventLoop:
    async def sock_recv(self, sock, n):
        """Receive data from the socket."""
        fut = self.create_future()
        fd = sock.fileno()
        # Wrap the callback that reads from the socket once it is readable.
        # args stands for the arguments passed on to _sock_recv.
        handle = events.Handle(
            self._sock_recv, args, self, None
        )
        # Ask the selector to watch this file descriptor for read events.
        self._selector.register(
            fd, selectors.EVENT_READ, (handle, None)
        )
        return await fut

    def _sock_recv(self, fut, registered_fd, sock, n):
        try:
            data = sock.recv(n)
        except (BlockingIOError, InterruptedError):
            ...

    def run_until_complete(self, future):
        """Run until the Future is done."""
        self.run_forever()

    def run_forever(self):
        """Run until stop() is called."""
        while True:
            self._run_once()
            if self._stopping:
                break

    def _run_once(self):
        """Run one full iteration of the event loop."""
        # Wait until at least one registered file descriptor is ready.
        event_list = self._selector.select(None)
        self._process_events(event_list)
        # Run the callbacks that became ready.
        ntodo = len(self._ready)
        for i in range(ntodo):
            handle = self._ready.popleft()
            handle._run()
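The same mechanism can be tried without asyncio. The sketch below is not the CPython implementation: it uses a socket pair as a stand-in for a backend connection, selectors.DefaultSelector rather than EpollSelector specifically, and a made-up on_readable callback, to show non-blocking mode plus one pass of a select-and-dispatch loop.

```python
import selectors
import socket

# A connected pair of sockets stands in for a client/backend connection.
reader, writer = socket.socketpair()
reader.setblocking(False)

# In non-blocking mode, recv() raises instead of waiting when no data is ready.
try:
    reader.recv(1024)
    raised = False
except BlockingIOError:
    raised = True

received = []


def on_readable(sock):
    # Callback run once the selector reports the socket as readable.
    received.append(sock.recv(1024))


selector = selectors.DefaultSelector()
selector.register(reader, selectors.EVENT_READ, on_readable)

writer.send(b"pong")  # the "backend" answers

# One iteration of a toy event loop: wait for ready I/O, then run callbacks.
for key, _mask in selector.select(timeout=1):
    key.data(key.fileobj)

selector.unregister(reader)
selector.close()
reader.close()
writer.close()
print(raised, received)
```

The callback stored as the selector key's data plays the role of the handle in the simplified source: select() reports which sockets are ready and the loop then runs their callbacks.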
ASYNCIO ADOPTION IN INSTAGRAM
Asyncio adoption in Instagram is just like decorating some trees in a forest.
• Instagram started using Django and launched in 2010.
• Large repo and many developers.
ASYNCIO ADOPTION CHALLENGES
• scale: collaboration in a large code repo with a lot of developers
• usability: asyncio utilities and bug fixes
• prioritization: too many blocking calls to migrate
• automation: reduce repeated manual effort
• efficiency: asyncio CPU overhead is very high
BACKEND CLIENT LIBRARIES ASYNCIO SUPPORT
• Thrift
  • fbthrift py3 and py.asyncio namespaces
• HTTP
  • aiohttp replaces requests
• Other backends
  • https://github.com/aio-libs
MAKE ASYNCIO EASIER
• wait_for

import asyncio

def wait_for(coro):
    # Block the calling synchronous code until the coroutine finishes.
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(coro)

result = wait_for(async_func())

• async_test

import functools
import unittest

def async_test(func):
    # Let unittest run an async test method as a regular one.
    @functools.wraps(func)
    def inner(*args, **kwargs):
        return wait_for(
            func(*args, **kwargs)
        )
    return inner

class TestAsyncMethods(unittest.TestCase):
    @async_test
    async def test_async_method(self):
        obj = Cls()
        self.assertTrue(await obj.async_func())
ASYNC STACK MIGRATION

def func():
    blocking_thrift_call()

# after migrating to async

async def func():
    await async_thrift_call()
IDENTIFY BLOCKING CALLS
Blocking Call Finder
• Figure out the blocking call stacks and prioritize among tons of stacks
• Prioritize stacks by latency and call count
• Implementation:
  • use profile to collect runtime stack traces
  • use pygraphviz to render a graph view

def f():
    blocking_thrift_call()

def g():
    h()

def h():
    blocking_http_call()

def api():
    f()
    g()

[Call graph: api → f → blocking_thrift_call, and api → g → h → blocking_http_call. The graph is annotated with latency and call counts: 20ms / 50k calls, 10ms / 10k calls, 9ms / 9k calls, 9ms / 9k calls, 20ms / 50k calls.]
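A rough sketch of the stack-collection step follows. It is not Instagram's tool: it assumes sys.setprofile as the collection mechanism, uses a hypothetical BLOCKING set of function names, counts only calls (no latency), and leaves out the pygraphviz rendering.

```python
import sys
from collections import Counter

# Hypothetical names of the calls considered blocking.
BLOCKING = {"blocking_thrift_call", "blocking_http_call"}


def blocking_thrift_call():
    pass


def blocking_http_call():
    pass


def f():
    blocking_thrift_call()


def h():
    blocking_http_call()


def g():
    h()


def api():
    f()
    g()


def collect_blocking_stacks(entry):
    """Run entry() and count the call stack seen at each blocking call."""
    stacks = Counter()
    stack = []

    def profiler(frame, event, arg):
        if event == "call":
            stack.append(frame.f_code.co_name)
            if frame.f_code.co_name in BLOCKING:
                stacks[tuple(stack)] += 1
        elif event == "return" and stack:
            stack.pop()

    sys.setprofile(profiler)
    try:
        entry()
    finally:
        sys.setprofile(None)
    return stacks


stacks = collect_blocking_stacks(api)
for call_stack, count in stacks.most_common():
    print(" -> ".join(call_stack), count)
```

Sorting the collected stacks by count (or by measured latency, if it were recorded) gives the prioritized list that the slide describes.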
WHEN THERE ARE TOO MANY DEPENDENCIES IN THE STACK
• Use a sync wrapper
SYNC
func = sync(async_func)
• Provides async and non-async versions of a given function.
• Supports classmethod, staticmethod, etc.
• Remove the sync wrapper line after migrating all call sites to async.

import asyncio
import functools

def sync(async_func):
    is_classmethod = False
    is_staticmethod = False
    # Unwrap classmethod/staticmethod objects to reach the underlying function.
    if isinstance(async_func, classmethod):
        async_func = async_func.__func__
        is_classmethod = True
    elif isinstance(async_func, staticmethod):
        async_func = async_func.__func__
        is_staticmethod = True
    if not asyncio.iscoroutinefunction(async_func):
        # asyncio.coroutine was removed in Python 3.11.
        async_func = asyncio.coroutine(async_func)

    @functools.wraps(async_func)
    def _no_profile_sync(*args, **kwargs):
        return wait_for(async_func(*args, **kwargs))

    # Re-wrap so the sync version binds the same way as the async one.
    if is_classmethod:
        return classmethod(_no_profile_sync)
    elif is_staticmethod:
        return staticmethod(_no_profile_sync)
    else:
        return _no_profile_sync

func = sync(async_func)
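For experimenting on current Python versions, a stripped-down variant of the wrapper may be easier to run. This sketch is an assumption-laden simplification: it handles plain functions only, uses asyncio.run (a fresh event loop per call) instead of the slides' wait_for, and async_fetch_user is a hypothetical function.

```python
import asyncio
import functools


def sync(async_func):
    """Return a blocking version of a coroutine function."""
    @functools.wraps(async_func)
    def wrapper(*args, **kwargs):
        # asyncio.run() creates a new event loop for each call, so this
        # version must not be called from code already running in a loop.
        return asyncio.run(async_func(*args, **kwargs))
    return wrapper


async def async_fetch_user(user_id):  # hypothetical async function
    await asyncio.sleep(0)
    return {"id": user_id}


# Keep both versions available while call sites are being migrated.
fetch_user = sync(async_fetch_user)

print(fetch_user(42))
```

Already migrated callers use await async_fetch_user(...), while callers that are still synchronous keep calling fetch_user(...) until they are migrated.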
NESTED EVENT LOOP
RuntimeError: This event loop is already running
[Call stack: run_until_complete() → async def f() → def g() → def h() → run_until_complete() → async def i()]
• Use a new event loop when the loop is already running.
• Keep a loop pool for reusing event loops.
• Set the current event loop and the running loop when the loop is already running.
• Restore the event loop after run_until_complete finishes.

import asyncio
from contextlib import contextmanager

def wait_for(coro):
    with get_event_loop() as loop:
        return loop.run_until_complete(coro)

@contextmanager
def get_event_loop():
    loop = asyncio.get_event_loop()
    if not loop.is_running():
        yield loop
    else:
        # Nested call: borrow a separate loop instead of re-entering the running one.
        new_loop = loop_pool.borrow_loop()
        asyncio.set_event_loop(new_loop)
        # Clear the running-loop marker so that new_loop is allowed to run.
        running_loop = asyncio.events._get_running_loop()
        asyncio.events._set_running_loop(None)
        try:
            yield new_loop
        finally:
            # Restore the outer loop state.
            loop_pool.return_loop(new_loop)
            asyncio.set_event_loop(loop)
            asyncio.events._set_running_loop(running_loop)
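The error that motivates this workaround is easy to reproduce. The sketch below uses made-up names (inner, sync_helper, outer) and the default asyncio event loop: a synchronous helper in the middle of the stack tries to drive a coroutine on the loop that is already running its caller.

```python
import asyncio


async def inner():
    return "inner result"


def sync_helper():
    # Synchronous code called from a coroutine tries to run another
    # coroutine on the loop that is already running.
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        return loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"


async def outer():
    return sync_helper()


message = asyncio.run(outer())
print(message)
```

The nested run_until_complete call is rejected because the loop is already running, which is why the slides' wait_for switches to a borrowed loop in that case.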
RUNTIME ERROR: EVENT LOOP STOPPED BEFORE FUTURE COMPLETED.

def test_run_until_complete_loop_orphan_future_close_loop(self):
    class ShowStopper(BaseException):
        pass

    async def foo(delay):
        await asyncio.sleep(delay, loop=self.loop)

    def throw():
        raise ShowStopper

    self.loop._process_events = mock.Mock()
    self.loop.call_soon(throw)
    try:
        self.loop.run_until_complete(foo(0.1))
    except ShowStopper:
        pass

    # This call fails if run_until_complete does not clean up
    # done-callback for the previous future.
    self.loop.run_until_complete(foo(0.2))

Fix in run_until_complete(): https://github.com/python/cpython/pull/1688
GLOBAL VARIABLE ISSUE
• Execution order is not guaranteed. A shared mutable global variable may cause unexpected results.

var = Container()

async def f():
    var.val = await read_from_db1()
    await write_to_db1(var)

async def g():
    var.val = await read_from_db2()
    await write_to_db2(var)

async def run():
    await asyncio.gather(f(), g())

• Context variables were added in Python 3.7

import contextvars
var = contextvars.ContextVar('var')

async def f():
    var.set(await read_from_db1())
    await write_to_db1(var.get())

async def g():
    var.set(await read_from_db2())
    await write_to_db2(var.get())

async def run():
    await asyncio.gather(f(), g())
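The difference between the two versions can be observed directly. In this sketch the database calls are replaced by asyncio.sleep, and the names shared, isolated, use_global, and use_contextvar are invented for the illustration.

```python
import asyncio
import contextvars


class Container:
    val = None


shared = Container()                          # module-level mutable state
isolated = contextvars.ContextVar("isolated")


async def use_global(value, delay):
    shared.val = value
    await asyncio.sleep(delay)                # other coroutines run here
    return shared.val                         # may have been overwritten


async def use_contextvar(value, delay):
    isolated.set(value)
    await asyncio.sleep(delay)
    return isolated.get()                     # each task sees its own value


async def main():
    global_results = await asyncio.gather(
        use_global("db1", 0.02), use_global("db2", 0.01)
    )
    context_results = await asyncio.gather(
        use_contextvar("db1", 0.02), use_contextvar("db2", 0.01)
    )
    return global_results, context_results


global_results, context_results = asyncio.run(main())
print(global_results)    # both coroutines read the last value written
print(context_results)   # each coroutine reads back its own value
```

gather() wraps each coroutine in a task, and every task runs in its own copy of the current context, so a ContextVar set in one task is not visible in the other.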
GATHER DESIGN PATTERN
• To achieve the maximum concurrency, replace sequential awaits:

async def run():
    await f()
    var1 = None
    if a is True:
        var1 = await g()

    var2 = None
    if b is True:
        var2 = await h()

• with a single gather, using a placeholder coroutine for the branches that are skipped:

async def identity(value):
    return value

async def run():
    awaitables = [
        f(),
        g() if a is True else identity(None),
        h() if b is True else identity(None),
    ]
    _, var1, var2 = await asyncio.gather(*awaitables)
LINT
Provide guidance for writing better asyncio code
• Rules:
1. async functions should be named with an async_ prefix
  • e.g. async_func() vs func()
2. use gather instead of await in a loop
3. warn when new blocking calls are added
• implemented with ast + flake8

for data in data_list:
    await async_func(data)

# use gather to run faster
await asyncio.gather(*[async_func(data) for data in data_list])
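Rule 2 can be approximated with the standard ast module. This is a minimal sketch, not Instagram's linter: find_await_in_loop is a hypothetical helper, the flake8 plugin wiring is omitted, and the check simply reports every await found in the body of a for or while loop.

```python
import ast

SOURCE = '''
async def load_all(data_list):
    for data in data_list:
        await async_func(data)

async def load_one(data):
    return await async_func(data)
'''


def find_await_in_loop(source):
    """Return line numbers of await expressions inside for/while loop bodies."""
    findings = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.While)):
            for stmt in node.body:
                for child in ast.walk(stmt):
                    if isinstance(child, ast.Await):
                        findings.add(child.lineno)
    return sorted(findings)


print(find_await_in_loop(SOURCE))
```

A real rule would need exceptions, for example loops whose iterations depend on each other and therefore cannot be gathered.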
AUTOMATION
• Many asyncio changes are simple and repetitive
• Smart code modifier for asyncio adoption:
  • collect caller-callee pairs from runtime profiling and offline pyan static analysis
  • modify the source code AST
    • change blocking calls to async calls
    • add await
  • auto-format code using isort and black
[Pipeline: source code → ast → code modifier → change set → pull request]
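The "change blocking call to async call" and "add await" steps can be sketched with ast.NodeTransformer. This is an illustration under assumptions, not the tool from the talk: ASYNC_REPLACEMENTS is a hypothetical mapping, callers of the rewritten function are not updated, nested functions are not treated separately, and ast.unparse requires Python 3.9 or later.

```python
import ast

# Hypothetical mapping from blocking calls to their async replacements.
ASYNC_REPLACEMENTS = {"blocking_thrift_call": "async_thrift_call"}


class AsyncRewriter(ast.NodeTransformer):
    def visit_Call(self, node):
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id in ASYNC_REPLACEMENTS:
            # Swap in the async callee and wrap the call in an await.
            node.func = ast.Name(id=ASYNC_REPLACEMENTS[node.func.id], ctx=ast.Load())
            return ast.Await(value=node)
        return node

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        # A function whose body now contains an await must become async.
        if not any(isinstance(n, ast.Await) for n in ast.walk(node)):
            return node
        fields = {name: getattr(node, name) for name in node._fields}
        return ast.copy_location(ast.AsyncFunctionDef(**fields), node)


def to_async(source):
    tree = AsyncRewriter().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(to_async("def func():\n    blocking_thrift_call()\n"))
```

In a full pipeline, the caller-callee data would be used to propagate the change upward, since every caller of a newly async function must await it as well.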
CPU OVERHEAD
• Adopting asyncio could cost ~20% more CPU instructions on Instagram servers.
• CPython asyncio was slow because the event loop and helpers were implemented in Python.
• Optimization strategies:
  • simplify the code and remove redundant computation
  • Cython
  • C API
• Available optimizations:
  • uvloop: libuv + Cython binding for the event loop
  • CPython 3.6 implements Future and Task in C
  • CPython 3.7 implements get_event_loop() in C. Future and gather() also become faster.
CUSTOM OPTIMIZATION
• Example: gather() -> ensure_future() -> isfuture/iscoroutine/isawaitable
  • Reorder: check iscoroutine first
  • gather deduplicates coroutines using a dict. Remove that assumption.
  • Implement all helper functions with the C API
• Optimization result: reduced the overall asyncio CPU overhead by 2X (to ~10%)
CURRENT RESULTS
• API latency became 30% faster on the server side
• Better user engagement
  • more media views
  • more time spent
• Next Steps
  • 100% asyncio
  • concurrent request handling
Q&A
jimmylai@instagram.com

 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 

Recently uploaded (20)

Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 

ASYNCIO ADOPTION IN INSTAGRAM

  • 1. THE JOURNEY OF ASYNCIO ADOPTION IN INSTAGRAM Jimmy Lai in PyCon TW 2018
  • 2. OUTLINE 1. What's asyncio? 2. Asyncio Adoption in Instagram 3. Q&A
  • 3. ABOUT ME - JIMMY LAI • Software Engineer in Instagram Infrastructure • I like Python • Recent interests: Python efficiency • profiling • Cython • asyncio
  • 4. INSTAGRAM BACKEND • Python + Django • Serving with uwsgi • Data fetching from backends • No. of processes > No. of CPUs [Diagram: a server whose CPU runs uwsgi with many Django processes and shared memory, fetching data from memcached, cassandra, and thrift services; see https://instagram-engineering.com/]
  • 5. BLOCKING I/O PROBLEMS • Slow API: the API takes longer to finish, which is a bad user experience. • CPU idle: context switches between processes come with overhead. • Harakiri: long requests get their process terminated (uwsgi Harakiri), and restarting a process has high overhead.
  • 6. WHAT'S ASYNCIO • Asynchronous I/O • Running I/O concurrently • Analogy: a simultaneous exhibition (https://rarehistoricalphotos.com/samuel-reshevsky-age-8-france-1920/) • Blocking IO mode • Async IO mode [Timeline diagrams of CPU and I/O segments over time for each mode]
  • 7. ASYNCIO AS SOLUTION • Slow API: the API runs faster and users get a better experience. • CPU idle: in-thread context switches instead of process context switches. • Harakiri: just cancel the pending async call; no need to kill the process.
  • 8. MYTHS ABOUT ASYNCIO 1. Myth: asyncio is multi-process or parallel computing. In fact it is single-threaded. • Only one function can execute at a time. • Only I/O can run concurrently. 2. Myth: asyncio is always faster in terms of CPU and latency. • The overhead of the event loop and context switches can be significant.
  • 9. CPYTHON ASYNCIO • The asyncio module became available in CPython 3.4 • Instagram used Python 2.7 for a long time and migrated to 3.5 in 2017
  • 10. ASYNC SYNTAX • async def, await, coroutine
      In [1]: import asyncio
         ...:
         ...: async def sleep_and_return(sec):
         ...:     await asyncio.sleep(sec)
         ...:     return sec
         ...:

      In [2]: sleep_and_return(1)  # calling only creates a coroutine object
      Out[2]: <coroutine object sleep_and_return at 0x10556ae60>
  • 11. ASYNC SYNTAX • async def, await, coroutine • run async function in event loop
      In [1]: import asyncio
         ...:
         ...: async def sleep_and_return(sec):
         ...:     await asyncio.sleep(sec)
         ...:     return sec
         ...:

      In [2]: sleep_and_return(1)  # calling only creates a coroutine object
      Out[2]: <coroutine object sleep_and_return at 0x10556ae60>

      In [3]: asyncio.get_event_loop().run_until_complete(sleep_and_return(1))
      Out[3]: 1
  • 12. ASYNC SYNTAX • async def, await, coroutine • run async function in event loop • gather async functions to run IO concurrently
      In [1]: import asyncio
         ...:
         ...: async def sleep_and_return(sec):
         ...:     await asyncio.sleep(sec)
         ...:     return sec
         ...:

      In [2]: sleep_and_return(1)  # calling only creates a coroutine object
      Out[2]: <coroutine object sleep_and_return at 0x10556ae60>

      In [3]: asyncio.get_event_loop().run_until_complete(sleep_and_return(1))
      Out[3]: 1

      In [4]: async def run():
         ...:     results = await asyncio.gather(
         ...:         sleep_and_return(1),
         ...:         sleep_and_return(1),
         ...:         sleep_and_return(2),
         ...:     )
         ...:     print(results)
         ...:

      In [5]: %timeit -r 1 asyncio.get_event_loop().run_until_complete(run())
      [1, 1, 2]
      [1, 1, 2]
      2 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
  • 13. ASYNC SYNTAX • async def, await, coroutine • run async function in event loop • gather async functions to run IO concurrently • gather() is the key to the latency win!
      In [1]: import asyncio
         ...:
         ...: async def sleep_and_return(sec):
         ...:     await asyncio.sleep(sec)
         ...:     return sec
         ...:

      In [2]: sleep_and_return(1)  # calling only creates a coroutine object
      Out[2]: <coroutine object sleep_and_return at 0x10556ae60>

      In [3]: asyncio.get_event_loop().run_until_complete(sleep_and_return(1))
      Out[3]: 1

      In [4]: async def run():
         ...:     results = await asyncio.gather(
         ...:         sleep_and_return(1),
         ...:         sleep_and_return(1),
         ...:         sleep_and_return(2),
         ...:     )
         ...:     print(results)
         ...:

      In [5]: %timeit -r 1 asyncio.get_event_loop().run_until_complete(run())
      [1, 1, 2]
      [1, 1, 2]
      2 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)
  • 14. HOW DOES ASYNCIO WORK? • Nonblocking I/O mode: socket.setblocking(False) • Register I/O with EpollSelector and wait until I/O is ready via select( ) • Source code is simplified for explanation purposes.
      class BaseSelectorEventLoop:
          async def sock_recv(self, sock, n):
              """Receive data from the socket."""
              fut = self.create_future()
              fd = sock.fileno()
              # `args` is left undefined in the simplified slide code.
              handle = events.Handle(
                  self._sock_recv, args, self, None
              )
              # Ask the selector to report when the socket is readable.
              self._selector.register(
                  fd, selectors.EVENT_READ, (handle, None)
              )
              return await fut

          def _sock_recv(self, fut, registered_fd, sock, n):
              try:
                  data = sock.recv(n)
              except (BlockingIOError, InterruptedError):
                  ...

          def run_until_complete(self, future):
              """Run until the Future is done."""
              self.run_forever()

          def run_forever(self):
              """Run until stop() is called."""
              while True:
                  self._run_once()
                  if self._stopping:
                      break

          def _run_once(self):
              """Run one full iteration of the event loop."""
              # Block until at least one registered I/O event is ready.
              event_list = self._selector.select(None)
              self._process_events(event_list)
              ntodo = len(self._ready)
              for i in range(ntodo):
                  handle = self._ready.popleft()
                  handle._run()
  • 15. ASYNCIO ADOPTION IN INSTAGRAM
  • 16. ASYNCIO ADOPTION IN INSTAGRAM is just like decorating some trees in a forest. Instagram started using Django and launched in 2010. Large repo and many developers.
  • 17. ASYNCIO ADOPTION CHALLENGES • scale: collaboration in a large code repo with a lot of developers • usability: asyncio utilities and bug fixes • prioritization: too many blocking calls to migrate • automation: reduce repeated manual effort • efficiency: asyncio CPU overhead is very high
  • 18. BACKEND CLIENT LIBRARIES ASYNCIO SUPPORT • Thrift • fbthrift py3 and py.asyncio namespaces • HTTP • aiohttp replaces requests • Other backends • https://github.com/aio-libs
  • 19. MAKE ASYNCIO EASIER • wait_for • async_test
      import asyncio
      import unittest

      def wait_for(coro):
          # Run a coroutine to completion from synchronous code.
          loop = asyncio.get_event_loop()
          return loop.run_until_complete(coro)

      result = wait_for(async_func())

      def async_test(func):
          # Decorator that lets unittest run an async test method.
          def inner(*args, **kwargs):
              return wait_for(
                  func(*args, **kwargs)
              )
          return inner

      class TestAsyncMethods(unittest.TestCase):
          @async_test
          async def test_async_method(self):
              obj = Cls()
              self.assertTrue(await obj.async_func())
  • 20. ASYNC STACK MIGRATION
      def func():
          blocking_thrift_call()

      ## after migrating to async

      async def func():
          await async_thrift_call()
  • 21. IDENTIFY BLOCKING CALLS Blocking Call Finder • Figure out the blocking call stacks and prioritize among tons of stacks • Prioritize stacks by latency/call count • Implementation: • use profile to collect runtime stack traces • use pygraphviz to render a graph view
      def f():
          blocking_thrift_call()

      def g():
          h()

      def h():
          blocking_http_call()

      def api():
          f()
          g()
    [Call graph: api → f → blocking_thrift_call; api → g → h → blocking_http_call; edges annotated with latency and call count (20ms / 50k calls, 10ms / 10k calls, 9ms / 9k calls)]
  • 22. WHEN THERE ARE TOO MANY DEPENDENCIES IN THE STACK • Use a sync wrapper
  • 23. SYNC: func = sync(async_func) • Provides both async and non-async versions of a given function. • Supports classmethod, staticmethod, etc. • Remove the sync wrapper line after migrating all call sites to async.
      import asyncio
      import functools

      def sync(async_func):
          is_classmethod = False
          if isinstance(async_func, classmethod):
              async_func = async_func.__func__
              is_classmethod = True
          elif isinstance(async_func, staticmethod):
              async_func = async_func.__func__
          if not asyncio.iscoroutinefunction(async_func):
              # Note: asyncio.coroutine was removed in Python 3.11.
              async_func = asyncio.coroutine(async_func)

          @functools.wraps(async_func)
          def _no_profile_sync(*args, **kwargs):
              return wait_for(async_func(*args, **kwargs))

          if is_classmethod:
              return classmethod(_no_profile_sync)
          else:
              return _no_profile_sync

      func = sync(async_func)
  • 24. NESTED EVENT LOOP • RuntimeError: This event loop is already running • Call stack: run_until_complete( ) → async def f( ) → def g( ) → def h( ) → run_until_complete( ) → async def i( ) • Use a new event loop when the loop is already running. • Loop pool for reusing event loops • Set the current event loop and running loop when the loop is already running. • Restore the event loop after run_until_complete finishes.
      import asyncio
      from contextlib import contextmanager

      def wait_for(coro):
          with get_event_loop() as loop:
              return loop.run_until_complete(coro)

      @contextmanager
      def get_event_loop():
          loop = asyncio.get_event_loop()
          if not loop.is_running():
              yield loop
          else:
              # loop_pool (a pool of reusable loops) is not defined on the slide.
              new_loop = loop_pool.borrow_loop()
              asyncio.set_event_loop(new_loop)
              running_loop = asyncio.events._get_running_loop()
              asyncio.events._set_running_loop(None)
              try:
                  yield new_loop
              finally:
                  # Restore the outer loop once the nested call finishes.
                  loop_pool.return_loop(new_loop)
                  asyncio.set_event_loop(loop)
                  asyncio.events._set_running_loop(running_loop)
  • 25. RUNTIME ERROR: EVENT LOOP STOPPED BEFORE FUTURE COMPLETED. • Fix in run_until_complete( ): https://github.com/python/cpython/pull/1688
      def test_run_until_complete_loop_orphan_future_close_loop(self):
          class ShowStopper(BaseException):
              pass

          async def foo(delay):
              await asyncio.sleep(delay, loop=self.loop)

          def throw():
              raise ShowStopper

          self.loop._process_events = mock.Mock()
          self.loop.call_soon(throw)
          try:
              self.loop.run_until_complete(foo(0.1))
          except ShowStopper:
              pass

          # This call fails if run_until_complete does not clean up
          # done-callback for the previous future.
          self.loop.run_until_complete(foo(0.2))
  • 26. GLOBAL VARIABLE ISSUE • Execution order is not guaranteed. A shared mutable global variable may cause unexpected results. • Context variables (contextvars) were added in Python 3.7.
    Problem: f() and g() share one global, so either may overwrite var.val before the other writes it back.
      var = Container()

      async def f():
          var.val = await read_from_db1()
          await write_to_db1(var)

      async def g():
          var.val = await read_from_db2()
          await write_to_db2(var)

      async def run():
          await asyncio.gather(f(), g())
    With a context variable:
      import contextvars
      var = contextvars.ContextVar('var')

      async def f():
          var.set(await read_from_db1())
          await write_to_db1(var.get())

      async def g():
          var.set(await read_from_db2())
          await write_to_db2(var.get())

      async def run():
          await asyncio.gather(f(), g())
  • 27. GATHER DESIGN PATTERN • To achieve the maximum concurrency
    With gather:
      async def identity(value):
          return value

      async def run():
          awaitables = [
              f(),
              g() if a is True else identity(None),
              h() if b is True else identity(None),
          ]
          _, var1, var2 = await asyncio.gather(*awaitables)
    Sequential version:
      async def run():
          await f()
          var1 = None
          if a is True:
              var1 = await g()

          var2 = None
          if b is True:
              var2 = await h()
  • 28. LINT Provide guidance for writing better asyncio code • Rules: 1. An async function should be named with an async_ prefix • e.g. async_func( ) vs func( ) 2. Use gather instead of await inside a loop 3. Warn when new blocking calls are added • Implemented with ast + flake8
      for data in data_list:
          await async_func(data)

      # use gather to run faster
      await asyncio.gather(*[async_func(data) for data in data_list])
  • 29. AUTOMATION • Many asyncio changes are simple and repetitive • Smart code modifier for asyncio adoption: • collect caller-callee pairs from runtime profiling and offline pyan static analysis • modify the source code AST • change blocking calls to async calls • add await • auto-format code using isort and black [Pipeline: source code → AST → code modifier → change set → pull request]
  • 30. CPU OVERHEAD • Adopting asyncio could cost ~20% more CPU instructions on Instagram servers. • CPython asyncio was slow because the event loop and helpers were implemented in Python. • Optimization strategies: • simplify the code and remove redundant computation • Cython • C API • Available optimizations: • uvloop: libuv + Cython binding for the event loop • CPython 3.6 implements Future and Task in C • CPython 3.7 implements get_event_loop( ) in C. Future and gather( ) also become faster.
  • 31. CUSTOM OPTIMIZATION • Example: gather( ) -> ensure_future( ) -> isfuture/iscoroutine/isawaitable • Reorder: check iscoroutine first • gather( ) deduplicates coroutines using a dict. Remove that assumption. • Implement all helper functions with the C API • Optimization result: reduced the overall asyncio CPU overhead by 2X (to ~10%)
  • 32. CURRENT RESULTS • API latency became 30% faster on the server side • Better user engagement • more media views • more time spent • Next Steps • 100% asyncio • concurrent request handling
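
The latency win from gather() (slides 12-13 and 27) can be reproduced with a small standalone script. This is only a sketch under assumptions: sleep_and_return stands in for a real thrift or HTTP call, the helper names run_sequential, run_gathered, and timed are hypothetical, and asyncio.run requires Python 3.7+, which is newer than the 3.5 runtime mentioned in the talk.

```python
import asyncio
import time


async def sleep_and_return(sec):
    # Stand-in for an I/O-bound backend call (assumption for illustration).
    await asyncio.sleep(sec)
    return sec


async def run_sequential(delays):
    # Each await completes before the next call starts,
    # so total time is roughly sum(delays).
    return [await sleep_and_return(d) for d in delays]


async def run_gathered(delays):
    # All coroutines are scheduled together,
    # so total time is roughly max(delays).
    return await asyncio.gather(*(sleep_and_return(d) for d in delays))


def timed(coro):
    # Run a coroutine in a fresh event loop and measure wall-clock time.
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.1, 0.1, 0.2]
    print("sequential:", timed(run_sequential(delays)))
    print("gathered:  ", timed(run_gathered(delays)))
```

Both versions return the same values in the same order; only the elapsed time differs, because the sleeps overlap under gather(). The single-threaded caveat from slide 8 still applies: CPU-bound work inside the coroutines would not overlap.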