JIT compilation for CPython
Dmitry Alimov
2019
SPb Python
Outline
JIT compilation and JIT history
My experience with JIT in CPython
Python projects that use JIT and projects for JIT
What is JIT compilation
JIT
Just-in-time compilation (aka dynamic translation, run-time compilation)
The earliest JIT compiler is generally attributed to John McCarthy's work on LISP in 1960
In 1968 Ken Thompson used JIT compilation for regular expressions in the QED text editor
LC²
Smalltalk
Self
Popularized by Java, with James Gosling using the term from 1993
The term is borrowed from just-in-time manufacturing, also known as just-in-time production or the Toyota Production System (TPS)
My experience with JIT in CPython
Example
def fibonacci(n):
    """Returns n-th Fibonacci number"""
    a = 0
    b = 1
    if n < 1:
        return a
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a
Fibonacci Sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
Let’s JIT it
1) Convert function to machine code at run-time
2) Execute this machine code
Let’s JIT it
@jit
def fibonacci(n):
    """Returns n-th Fibonacci number"""
    a = 0
    b = 1
    if n < 1:
        return a
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a
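The two steps (convert once at run time, then execute the result) map naturally onto a decorator. The skeleton below is only a sketch of that wiring, not the implementation from the talk: `compile_to_native` is a hypothetical placeholder that returns the original function, standing in for the AST, IL ASM, ASM, and machine code stages shown on the following slides.

```python
import functools


def jit(func):
    """Illustrative skeleton: compile on the first call, then reuse the result."""
    compiled = None  # cache for the compiled callable

    def compile_to_native(f):
        # Hypothetical placeholder for the real pipeline
        # (AST -> IL ASM -> ASM -> machine code); it just returns f unchanged.
        return f

    @functools.wraps(func)
    def wrapper(*args):
        nonlocal compiled
        if compiled is None:       # step 1: convert at run time, only once
            compiled = compile_to_native(func)
        return compiled(*args)     # step 2: execute the compiled code

    return wrapper


@jit
def fibonacci(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print(fibonacci(10))
```

Compiling lazily inside the wrapper keeps import time cheap; the cost is paid on the first call only.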
Convert function to AST
import ast
import inspect
lines = inspect.getsource(func)
node = ast.parse(lines)
visitor = Visitor()
visitor.visit(node)
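A minimal, self-contained sketch of the parsing step. It assumes the source is already available as a string (the slide obtains it with `inspect.getsource`), and it uses a shortened function body without the docstring and early return. On Python 3.8 and later, literals appear as `Constant` nodes rather than the `Num` and `Str` nodes shown in the dump on the next slide.

```python
import ast

# Source text of the function; the slides get this via inspect.getsource(func).
source = '''
def fibonacci(n):
    a = 0
    b = 1
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a
'''

tree = ast.parse(source)
func_def = tree.body[0]            # the FunctionDef node

# Names of the top-level statements inside the function body.
node_types = [type(node).__name__ for node in func_def.body]
print(func_def.name, node_types)

# Full dump of the first statement (a = 0).
print(ast.dump(func_def.body[0]))
```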
AST
Module(body=[
    FunctionDef(name='fibonacci', args=arguments(args=[Name(id='n', ctx=Param())],
            vararg=None, kwarg=None, defaults=[]), body=[
        Expr(value=Str(s='Returns n-th Fibonacci number')),
        Assign(targets=[Name(id='a', ctx=Store())], value=Num(n=0)),
        Assign(targets=[Name(id='b', ctx=Store())], value=Num(n=1)),
        If(test=Compare(left=Name(id='n', ctx=Load()), ops=[Lt()], comparators=[Num(n=1)]), body=[
            Return(value=Name(id='a', ctx=Load()))
        ], orelse=[]),
        Assign(targets=[Name(id='i', ctx=Store())], value=Num(n=0)),
        While(test=Compare(left=Name(id='i', ctx=Load()), ops=[Lt()], comparators=[Name(id='n', ctx=Load())]), body=[
            Assign(targets=[Name(id='temp', ctx=Store())], value=Name(id='a', ctx=Load())),
            Assign(targets=[Name(id='a', ctx=Store())], value=Name(id='b', ctx=Load())),
            Assign(targets=[Name(id='b', ctx=Store())], value=BinOp(
                left=Name(id='temp', ctx=Load()), op=Add(), right=Name(id='b', ctx=Load()))),
            AugAssign(target=Name(id='i', ctx=Store()), op=Add(), value=Num(n=1))
        ], orelse=[]),
        Return(value=Name(id='a', ctx=Load()))
    ], decorator_list=[Name(id='jit', ctx=Load())])
])
AST to IL ASM
class Visitor(ast.NodeVisitor):
    def __init__(self):
        self.ops = []  # emitted IL ASM instructions, in order
        ...
    ...
    def visit_Assign(self, node):
        if isinstance(node.value, ast.Num):
            # constant on the right-hand side: MOV <target>, value
            self.ops.append('MOV <{}>, {}'.format(node.targets[0].id, node.value.n))
        elif isinstance(node.value, ast.Name):
            # variable on the right-hand side: MOV <target>, <source>
            self.ops.append('MOV <{}>, <{}>'.format(node.targets[0].id, node.value.id))
        elif isinstance(node.value, ast.BinOp):
            # emit the operation first, then move its result (left operand) to the target
            self.ops.extend(self.visit_BinOp(node.value))
            self.ops.append('MOV <{}>, <{}>'.format(node.targets[0].id, node.value.left.id))
    ...
Input AST nodes:
...
Assign(
    targets=[Name(id='i', ctx=Store())],
    value=Num(n=0)
),
Assign(
    targets=[Name(id='a', ctx=Store())],
    value=Name(id='b', ctx=Load())
),
...
Emitted IL ASM:
...
MOV <i>, 0
MOV <a>, <b>
...
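A runnable sketch of the same idea for current Python versions, covering only the two assignment forms from the example above. It is not the talk's visitor: the class name `AssignVisitor` is made up for illustration, and `ast.Constant` is used because `ast.Num` is deprecated since Python 3.8.

```python
import ast


class AssignVisitor(ast.NodeVisitor):
    """Collects IL 'MOV' lines for simple assignments (illustrative subset)."""

    def __init__(self):
        self.ops = []

    def visit_Assign(self, node):
        target = node.targets[0].id
        value = node.value
        if isinstance(value, ast.Constant):      # e.g. i = 0
            self.ops.append('MOV <{}>, {}'.format(target, value.value))
        elif isinstance(value, ast.Name):        # e.g. a = b
            self.ops.append('MOV <{}>, <{}>'.format(target, value.id))
        # Other right-hand sides (BinOp, calls, ...) are ignored in this sketch.


visitor = AssignVisitor()
visitor.visit(ast.parse('i = 0\na = b\n'))
print(visitor.ops)  # ['MOV <i>, 0', 'MOV <a>, <b>']
```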
IL ASM to ASM
MOV <a>, 0
MOV <b>, 1
CMP <n>, 1
JNL label0
RET
label0:
MOV <i>, 0
loop0:
MOV <temp>, <a>
MOV <a>, <b>
ADD <temp>, <b>
MOV <b>, <temp>
INC <i>
CMP <i>, <n>
JL loop0
RET
# for x64 system
args_registers = ['rdi', 'rsi', 'rdx', ...]
registers = ['rax', 'rbx', 'rcx', ...]
# return register: rax
def fibonacci(n):    # n ⇔ rdi
    ...
    return a         # a ⇔ rax
IL ASM to ASM
MOV rax, 0
MOV rbx, 1
CMP rdi, 1
JNL label0
RET
label0:
MOV rcx, 0
loop0:
MOV rdx, rax
MOV rax, rbx
ADD rdx, rbx
MOV rbx, rdx
INC rcx
CMP rcx, rdi
JL loop0
RET
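One way to get from the `<name>` placeholders to real registers is a first-come, first-served substitution. The sketch below is an assumption about how such a pass could look, not the talk's code; `allocate_registers` and the register lists are illustrative, and there is no spilling when registers run out.

```python
import re

ARG_REGISTERS = ['rdi', 'rsi', 'rdx']           # argument registers, in order
FREE_REGISTERS = ['rax', 'rbx', 'rcx', 'rdx']   # rax also carries the return value


def allocate_registers(il_lines, arg_names, return_name):
    """Replace <name> placeholders with registers in order of first use."""
    mapping = dict(zip(arg_names, ARG_REGISTERS))   # arguments arrive in rdi, rsi, ...
    mapping[return_name] = 'rax'                    # the returned variable lives in rax
    free = [r for r in FREE_REGISTERS if r not in mapping.values()]

    def substitute(match):
        name = match.group(1)
        if name not in mapping:
            mapping[name] = free.pop(0)   # raises IndexError when out of registers
        return mapping[name]

    return [re.sub(r'<(\w+)>', substitute, line) for line in il_lines]


il = ['MOV <a>, 0', 'MOV <b>, 1', 'CMP <n>, 1',
      'MOV <i>, 0', 'MOV <temp>, <a>', 'ADD <temp>, <b>']
asm_lines = allocate_registers(il, arg_names=['n'], return_name='a')
print('\n'.join(asm_lines))
```

With `n` as the only argument and `a` as the returned variable, this reproduces the mapping used on the slide: `b` in rbx, `i` in rcx, `temp` in rdx.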
ASM to machine code
MOV rax, 0
MOV rbx, 1
CMP rdi, 1
JNL label0
RET
label0:
MOV rcx, 0
loop0:
MOV rdx, rax
MOV rax, rbx
ADD rdx, rbx
MOV rbx, rdx
INC rcx
CMP rcx, rdi
JL loop0
RET
from pwnlib.asm import asm
code = asm(asm_code, arch='amd64')
\x48\xc7\xc0\x00\x00\x00\x00
\x48\xc7\xc3\x01\x00\x00\x00
\x48\x83\xff\x01\x7d\x01\xc3
\x48\xc7\xc1\x00\x00\x00\x00
\x48\x89\xc2\x48\x89\xd8\x48
\x01\xda\x48\x89\xd3\x48\xff
\xc1\x48\x39\xf9\x7c\xec\xc3
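To see where these bytes come from, here is a hand-rolled encoder for just two instruction forms. It is a sketch for illustration only and does not replace an assembler such as pwntools; the helper names and the four-register table are assumptions.

```python
import struct

# Register numbers used in the x86-64 ModRM byte (low registers only).
REG_NUM = {'rax': 0, 'rcx': 1, 'rdx': 2, 'rbx': 3}


def encode_mov_imm32(reg, value):
    """Encode `MOV reg, imm32` as REX.W + C7 /0 + 32-bit immediate."""
    rex_w = 0x48                    # REX prefix with W=1: 64-bit operand size
    opcode = 0xC7
    modrm = 0xC0 | REG_NUM[reg]     # mod=11 (register direct), reg field 0, rm=reg
    return bytes([rex_w, opcode, modrm]) + struct.pack('<i', value)


def encode_mov_reg(dst, src):
    """Encode `MOV dst, src` as REX.W + 89 /r."""
    modrm = 0xC0 | (REG_NUM[src] << 3) | REG_NUM[dst]
    return bytes([0x48, 0x89, modrm])


print(encode_mov_imm32('rax', 0).hex())    # first line of the listing above
print(encode_mov_imm32('rbx', 1).hex())    # second line
print(encode_mov_reg('rdx', 'rax').hex())  # MOV rdx, rax
```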
Create function in memory
1) Allocate memory
2) Copy machine code to allocated memory
3) Mark the memory as executable
Linux: mmap, mprotect
Windows: VirtualAlloc, VirtualProtect
Signatures in C/C++
Linux:
void *mmap(void *addr, size_t length, int prot, int flags,
           int fd, off_t offset);
int mprotect(void *addr, size_t len, int prot);
void *memcpy(void *dest, const void *src, size_t n);
int munmap(void *addr, size_t length);
Windows:
LPVOID VirtualAlloc(LPVOID lpAddress, SIZE_T dwSize,
                    DWORD flAllocationType, DWORD flProtect);
BOOL VirtualProtect(LPVOID lpAddress, SIZE_T dwSize,
                    DWORD flNewProtect, PDWORD lpflOldProtect);
void *memcpy(void *dest, const void *src, size_t count);
BOOL VirtualFree(LPVOID lpAddress, SIZE_T dwSize, DWORD dwFreeType);
Create function in memory
import ctypes
# Linux
libc = ctypes.CDLL('libc.so.6')
libc.mmap
libc.mprotect
libc.memcpy
libc.munmap
# Windows
ctypes.windll.kernel32.VirtualAlloc
ctypes.windll.kernel32.VirtualProtect
ctypes.cdll.msvcrt.memcpy
ctypes.windll.kernel32.VirtualFree
Create function in memory
mmap_func = libc.mmap
mmap_func.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]  # last argument is off_t (long on 64-bit Linux)
mmap_func.restype = ctypes.c_void_p
memcpy_func = libc.memcpy
memcpy_func.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
memcpy_func.restype = ctypes.c_void_p  # memcpy returns the destination pointer
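Declaring `argtypes` and `restype` matters because ctypes assumes an `int` return value by default, which truncates 64-bit pointers. The sketch below exercises only `memcpy`, on ordinary ctypes buffers, so nothing is executed as machine code; the library lookup through `ctypes.util.find_library` is an assumption made for portability and is not taken from the slides.

```python
import ctypes
import ctypes.util
import sys

if sys.platform == 'win32':
    libc = ctypes.cdll.msvcrt
else:
    # find_library may return None; CDLL(None) then exposes the symbols
    # already loaded into the current process, which include memcpy.
    libc = ctypes.CDLL(ctypes.util.find_library('c'))

memcpy = libc.memcpy
memcpy.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
memcpy.restype = ctypes.c_void_p      # pointer-sized return value

src = ctypes.create_string_buffer(b'\x48\xc7\xc0', 3)   # three bytes of code
dst = ctypes.create_string_buffer(3)                    # zero-filled destination

returned = memcpy(dst, src, 3)

print(dst.raw == src.raw)                    # bytes were copied
print(returned == ctypes.addressof(dst))     # memcpy returns the destination address
```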
Create function in memory
import mmap  # provides the PROT_* and MAP_* constants on Unix
machine_code = (b'\x48\xc7\xc0\x00\x00\x00\x00\x48\xc7\xc3\x01\x00\x00\x00\x48'
                b'\x83\xff\x01\x7d\x01\xc3\x48\xc7\xc1\x00\x00\x00\x00\x48\x89\xc2\x48\x89\xd8'
                b'\x48\x01\xda\x48\x89\xd3\x48\xff\xc1\x48\x39\xf9\x7c\xec\xc3')
machine_code_size = len(machine_code)
addr = mmap_func(None, machine_code_size,
                 mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC,
                 mmap.MAP_ANONYMOUS | mmap.MAP_PRIVATE, -1, 0)
memcpy_func(addr, machine_code, machine_code_size)
# prototype: uint64 func(uint64); the argument is passed in rdi, the result comes back in rax
func = ctypes.CFUNCTYPE(ctypes.c_uint64, ctypes.c_uint64)(addr)
Benchmarks
for _ in range(1000000):
    fibonacci(n)
Python 2.7
n     No JIT (s)   JIT (s)
0     0.153        0.882
10    1.001        0.878
20    1.805        0.942
30    2.658        0.955
60    4.800        0.928
90    7.117        0.922
500   50.611       1.251
Python 3.7
n     No JIT (s)   JIT (s)
0     0.150        1.079
10    1.093        0.971
20    2.206        1.135
30    3.313        1.204
60    6.815        1.198
90    10.458       1.270
500   63.949       1.652
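A measurement loop like the one above can be wrapped with `timeit`. This is a sketch of one possible harness, not the one used for the numbers in the tables; `benchmark` is an illustrative helper, the call count is reduced so it finishes quickly, and the lambda adds a small constant overhead per call.

```python
import timeit


def fibonacci(n):
    """Pure-Python version from the example slide."""
    a = 0
    b = 1
    if n < 1:
        return a
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a


def benchmark(func, n, calls=1000000):
    """Total seconds spent in `calls` invocations of func(n)."""
    return timeit.timeit(lambda: func(n), number=calls)


for n in (0, 10, 20):
    # 10,000 calls instead of 1,000,000 to keep the demo short.
    print(n, round(benchmark(fibonacci, n, calls=10000), 4))
```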
Python 2.7 vs 3.7
fibonacci(n=93)
No JIT: 10.524 s
JIT: 1.185 s
JIT ~8.5 times faster
JIT compilation time: ~0.08 s
vs
fibonacci(n=93)
No JIT: 7.942 s
JIT: 0.887 s
JIT ~8.5 times faster
JIT compilation time: ~0.07 s
* fibonacci(n=92) = 0x68a3dd8e61eccfbd
  fibonacci(n=93) = 0xa94fad42221f2702
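Both footnote values fit in 64 bits. A plausible reading, which the slides do not state and is therefore an assumption here, is that n=93 was chosen because it is the largest index whose result still fits in a 64-bit register such as rax. The arithmetic can be checked in pure Python, where integers do not overflow:

```python
def fibonacci(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


UINT64_MAX = 2**64 - 1

print(hex(fibonacci(92)))
print(hex(fibonacci(93)))
print(fibonacci(93) <= UINT64_MAX)   # still fits in 64 bits
print(fibonacci(94) <= UINT64_MAX)   # no longer fits

# What a 64-bit register would hold for n=94 after wrap-around.
print(hex(fibonacci(94) & UINT64_MAX))
```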
No JIT vs JIT
    0 LOAD_CONST         1 (0)
    3 STORE_FAST         1 (a)
    6 LOAD_CONST         2 (1)
    9 STORE_FAST         2 (b)
   12 LOAD_FAST          0 (n)
   15 LOAD_CONST         2 (1)
   18 COMPARE_OP         0 (<)
   21 POP_JUMP_IF_FALSE  28
   24 LOAD_FAST          1 (a)
   27 RETURN_VALUE
>> 28 LOAD_CONST         1 (0)
   31 STORE_FAST         3 (i)
   34 SETUP_LOOP         48 (to 85)
>> 37 LOAD_FAST          3 (i)
   40 LOAD_FAST          0 (n)
   43 COMPARE_OP         0 (<)
   46 POP_JUMP_IF_FALSE  84
   49 LOAD_FAST          1 (a)
   52 STORE_FAST         4 (temp)
   55 LOAD_FAST          2 (b)
   58 STORE_FAST         1 (a)
   61 LOAD_FAST          4 (temp)
   64 LOAD_FAST          2 (b)
   67 BINARY_ADD
   68 STORE_FAST         2 (b)
   71 LOAD_FAST          3 (i)
   74 LOAD_CONST         2 (1)
   77 INPLACE_ADD
   78 STORE_FAST         3 (i)
   81 JUMP_ABSOLUTE      37
>> 84 POP_BLOCK
>> 85 LOAD_FAST          1 (a)
   88 RETURN_VALUE
vs
MOV rax, 0
MOV rbx, 1
CMP rdi, 1
JNL label0
RET
label0:
MOV rcx, 0
loop0:
MOV rdx, rax
MOV rax, rbx
ADD rdx, rbx
MOV rbx, rdx
INC rcx
CMP rcx, rdi
JL loop0
RET
33 (VM opcodes) vs 14 (real machine instructions)
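The opcode count on the bytecode side can be reproduced with the standard `dis` module. This is a minimal sketch; the exact number depends on the interpreter version, so it will generally differ from the 33 opcodes of the Python 2.7 listing above.

```python
import dis


def fibonacci(n):
    a = 0
    b = 1
    if n < 1:
        return a
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a


instructions = list(dis.get_instructions(fibonacci))
print(len(instructions), 'VM opcodes on this interpreter')

# Show the first few instructions: offset, opcode name, human-readable argument.
for ins in instructions[:5]:
    print(ins.offset, ins.opname, ins.argrepr)
```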
Projects
Numba
Numba makes Python code fast
Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code
- Parallelization
- SIMD Vectorization
- GPU Acceleration
from numba import jit
import numpy as np
@jit(nopython=True)  # Set "nopython" mode for best performance, equivalent to @njit
def go_fast(a):  # Function is compiled to machine code when called the first time
    trace = 0
    for i in range(a.shape[0]):  # Numba likes loops
        trace += np.tanh(a[i, i])  # Numba likes NumPy functions
    return a + trace  # Numba likes NumPy broadcasting
from numba import cuda
@cuda.jit
def matmul(A, B, C):
    """Perform square matrix multiplication of C = A * B
    """
    i, j = cuda.grid(2)
    if i < C.shape[0] and j < C.shape[1]:
        tmp = 0.
        for k in range(A.shape[1]):
            tmp += A[i, k] * B[k, j]
        C[i, j] = tmp
LLVM
LLVM: compiler infrastructure project
Tutorial “Building a JIT: Starting out with KaleidoscopeJIT”
LLVMPy: Python bindings for LLVM
LLVMLite project by Numba team: lightweight LLVM Python binding for writing JIT compilers
PeachPy
x86-64 assembler embedded in Python
Portable Efficient Assembly Code-generator in Higher-level Python
from peachpy.x86_64 import *
ADD(eax, 5).encode()
# bytearray(b'\x83\xc0\x05')
MOVAPS(xmm0, xmm1).encode_options()
# [bytearray(b'\x0f(\xc1'), bytearray(b'\x0f)\xc8')]
VPSLLVD(ymm0, ymm1, [rsi + 8]).encode_length_options()
# {6: bytearray(b'\xc4\xe2uGF\x08'),
#  7: bytearray(b'\xc4\xe2uGD&\x08'),
#  9: bytearray(b'\xc4\xe2uG\x86\x08\x00\x00\x00')}
PyPy
PyPy is a fast, compliant alternative implementation of the Python language
Python programs often run faster on PyPy thanks to its Just-in-Time compiler
PyPy works best when executing long-running programs where a significant
fraction of the time is spent executing Python code
“If you want your code to run faster, you should probably just use PyPy”
— Guido van Rossum (creator of Python)
Other projects
Pyjion — A JIT for Python based upon CoreCLR
Pyston — built using LLVM and modern JIT techniques
Psyco — extension module which can greatly speed up the execution of code
The first just-in-time compiler for Python, now unmaintained and dead
Unladen Swallow — was an attempt to make LLVM be a JIT compiler for CPython
References
1. https://en.wikipedia.org/wiki/Just-in-time_compilation
2. John Aycock: A Brief History of Just-In-Time. ACM Computing Surveys (CSUR), volume 35, issue 2, pages 97-113, June 2003, DOI: 10.1145/857076.857077
3. https://eli.thegreenplace.net/2013/11/05/how-to-jit-an-introduction
4. https://medium.com/starschema-blog/jit-fast-supercharge-tensor-processing-in-python-with-jit-compilation-47598de6ee96
5. https://github.com/Gallopsled/pwntools
6. https://numba.pydata.org
7. https://llvm.org/docs/tutorial/BuildingAJIT1.html
8. https://llvmlite.readthedocs.io/en/latest/
9. http://www.llvmpy.org
10. https://github.com/Maratyszcza/PeachPy
11. https://github.com/microsoft/Pyjion
12. https://blog.pyston.org
Thank you

More Related Content

What's hot

PyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtimePyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtime
National Cheng Kung University
 
Qemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System EmulationQemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System Emulation
National Cheng Kung University
 
May2010 hex-core-opt
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-optJeff Larkin
 
[GEG1] 10.camera-centric engine design for multithreaded rendering
[GEG1] 10.camera-centric engine design for multithreaded rendering[GEG1] 10.camera-centric engine design for multithreaded rendering
[GEG1] 10.camera-centric engine design for multithreaded rendering종빈 오
 
DWARF Data Representation
DWARF Data RepresentationDWARF Data Representation
DWARF Data Representation
Wang Hsiangkai
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Works
Zhen Wei
 
Design and Implementation of GCC Register Allocation
Design and Implementation of GCC Register AllocationDesign and Implementation of GCC Register Allocation
Design and Implementation of GCC Register Allocation
Kito Cheng
 
LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)
Wang Hsiangkai
 
Implementing Lightweight Networking
Implementing Lightweight NetworkingImplementing Lightweight Networking
Implementing Lightweight Networking
guest6972eaf
 
Modern C++
Modern C++Modern C++
Modern C++
Michael Clark
 
Software transactional memory. pure functional approach
Software transactional memory. pure functional approachSoftware transactional memory. pure functional approach
Software transactional memory. pure functional approach
Alexander Granin
 
C++11 talk
C++11 talkC++11 talk
C++11 talk
vpoliboyina
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Languagemspline
 
Virtual Machine Constructions for Dummies
Virtual Machine Constructions for DummiesVirtual Machine Constructions for Dummies
Virtual Machine Constructions for Dummies
National Cheng Kung University
 
Boost.Python - domesticating the snake
Boost.Python - domesticating the snakeBoost.Python - domesticating the snake
Boost.Python - domesticating the snake
Sławomir Zborowski
 
Gentle introduction to modern C++
Gentle introduction to modern C++Gentle introduction to modern C++
Gentle introduction to modern C++
Mihai Todor
 
About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014
About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014
About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014
Fantix King 王川
 
Designing C++ portable SIMD support
Designing C++ portable SIMD supportDesigning C++ portable SIMD support
Designing C++ portable SIMD support
Joel Falcou
 
Threads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonThreads and Callbacks for Embedded Python
Threads and Callbacks for Embedded Python
Yi-Lung Tsai
 

What's hot (20)

PyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtimePyPy's approach to construct domain-specific language runtime
PyPy's approach to construct domain-specific language runtime
 
Qemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System EmulationQemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System Emulation
 
May2010 hex-core-opt
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-opt
 
[GEG1] 10.camera-centric engine design for multithreaded rendering
[GEG1] 10.camera-centric engine design for multithreaded rendering[GEG1] 10.camera-centric engine design for multithreaded rendering
[GEG1] 10.camera-centric engine design for multithreaded rendering
 
DWARF Data Representation
DWARF Data RepresentationDWARF Data Representation
DWARF Data Representation
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Works
 
Design and Implementation of GCC Register Allocation
Design and Implementation of GCC Register AllocationDesign and Implementation of GCC Register Allocation
Design and Implementation of GCC Register Allocation
 
LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)LLVM Register Allocation (2nd Version)
LLVM Register Allocation (2nd Version)
 
Implementing Lightweight Networking
Implementing Lightweight NetworkingImplementing Lightweight Networking
Implementing Lightweight Networking
 
OpenMP
OpenMPOpenMP
OpenMP
 
Modern C++
Modern C++Modern C++
Modern C++
 
Software transactional memory. pure functional approach
Software transactional memory. pure functional approachSoftware transactional memory. pure functional approach
Software transactional memory. pure functional approach
 
C++11 talk
C++11 talkC++11 talk
C++11 talk
 
C++11: Feel the New Language
C++11: Feel the New LanguageC++11: Feel the New Language
C++11: Feel the New Language
 
Virtual Machine Constructions for Dummies
Virtual Machine Constructions for DummiesVirtual Machine Constructions for Dummies
Virtual Machine Constructions for Dummies
 
Boost.Python - domesticating the snake
Boost.Python - domesticating the snakeBoost.Python - domesticating the snake
Boost.Python - domesticating the snake
 
Gentle introduction to modern C++
Gentle introduction to modern C++Gentle introduction to modern C++
Gentle introduction to modern C++
 
About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014
About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014
About Those Python Async Concurrent Frameworks - Fantix @ OSTC 2014
 
Designing C++ portable SIMD support
Designing C++ portable SIMD supportDesigning C++ portable SIMD support
Designing C++ portable SIMD support
 
Threads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonThreads and Callbacks for Embedded Python
Threads and Callbacks for Embedded Python
 

Similar to JIT compilation for CPython

PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
Andrey Karpov
 
12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine
ChengHui Weng
 
PVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications developmentPVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications development
OOO "Program Verification Systems"
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
PVS-Studio
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
Andrey Karpov
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
Travis Oliphant
 
CorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. ProgrammingCorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. Programming
Slide_N
 
Pascal script maxbox_ekon_14_2
Pascal script maxbox_ekon_14_2Pascal script maxbox_ekon_14_2
Pascal script maxbox_ekon_14_2Max Kleiner
 
Demystify eBPF JIT Compiler
Demystify eBPF JIT CompilerDemystify eBPF JIT Compiler
Demystify eBPF JIT Compiler
Netronome
 
Quiz 9
Quiz 9Quiz 9
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
eugeniadean34240
 
NativeBoost
NativeBoostNativeBoost
NativeBoost
ESUG
 
Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...
Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...
Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...
Stefan Marr
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
John Lee
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic Analysis
Fastly
 
Zope component architechture
Zope component architechtureZope component architechture
Zope component architechture
Anatoly Bubenkov
 
Simple ETL in python 3.5+ with Bonobo - PyParis 2017
Simple ETL in python 3.5+ with Bonobo - PyParis 2017Simple ETL in python 3.5+ with Bonobo - PyParis 2017
Simple ETL in python 3.5+ with Bonobo - PyParis 2017
Romain Dorgueil
 
Simple ETL in python 3.5+ with Bonobo, Romain Dorgueil
Simple ETL in python 3.5+ with Bonobo, Romain DorgueilSimple ETL in python 3.5+ with Bonobo, Romain Dorgueil
Simple ETL in python 3.5+ with Bonobo, Romain Dorgueil
Pôle Systematic Paris-Region
 
How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...
Codemotion
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Marina Kolpakova
 

Similar to JIT compilation for CPython (20)

PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
PVS-Studio 5.00, a solution for developers of modern resource-intensive appl...
 
12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine12 Monkeys Inside JS Engine
12 Monkeys Inside JS Engine
 
PVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications developmentPVS-Studio, a solution for resource intensive applications development
PVS-Studio, a solution for resource intensive applications development
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
 
Static analysis of C++ source code
Static analysis of C++ source codeStatic analysis of C++ source code
Static analysis of C++ source code
 
Numba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPyNumba: Array-oriented Python Compiler for NumPy
Numba: Array-oriented Python Compiler for NumPy
 
CorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. ProgrammingCorePy High-Productivity CellB.E. Programming
CorePy High-Productivity CellB.E. Programming
 
Pascal script maxbox_ekon_14_2
Pascal script maxbox_ekon_14_2Pascal script maxbox_ekon_14_2
Pascal script maxbox_ekon_14_2
 
Demystify eBPF JIT Compiler
Demystify eBPF JIT CompilerDemystify eBPF JIT Compiler
Demystify eBPF JIT Compiler
 
Quiz 9
Quiz 9Quiz 9
Quiz 9
 
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
20145-5SumII_CSC407_assign1.htmlCSC 407 Computer Systems II.docx
 
NativeBoost
NativeBoostNativeBoost
NativeBoost
 
Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...
Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...
Metaprogramming, Metaobject Protocols, Gradual Type Checks: Optimizing the "U...
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
 
Beyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic AnalysisBeyond Breakpoints: A Tour of Dynamic Analysis
Beyond Breakpoints: A Tour of Dynamic Analysis
 
Zope component architechture
Zope component architechtureZope component architechture
Zope component architechture
 
Simple ETL in python 3.5+ with Bonobo - PyParis 2017
Simple ETL in python 3.5+ with Bonobo - PyParis 2017Simple ETL in python 3.5+ with Bonobo - PyParis 2017
Simple ETL in python 3.5+ with Bonobo - PyParis 2017
 
Simple ETL in python 3.5+ with Bonobo, Romain Dorgueil
Simple ETL in python 3.5+ with Bonobo, Romain DorgueilSimple ETL in python 3.5+ with Bonobo, Romain Dorgueil
Simple ETL in python 3.5+ with Bonobo, Romain Dorgueil
 
How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...
 
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the CompilerPragmatic Optimization in Modern Programming - Demystifying the Compiler
Pragmatic Optimization in Modern Programming - Demystifying the Compiler
 

More from delimitry

Python Hashlib & A True Story of One Bug
Python Hashlib & A True Story of One BugPython Hashlib & A True Story of One Bug
Python Hashlib & A True Story of One Bug
delimitry
 
Data storage systems
Data storage systemsData storage systems
Data storage systems
delimitry
 
Fuzzing python modules
Fuzzing python modulesFuzzing python modules
Fuzzing python modules
delimitry
 
Writing file system in CPython
Writing file system in CPythonWriting file system in CPython
Writing file system in CPython
delimitry
 
CPython logo
CPython logoCPython logo
CPython logo
delimitry
 
Contribute to CPython
Contribute to CPythonContribute to CPython
Contribute to CPython
delimitry
 
Buzzword poem generator in Python
Buzzword poem generator in PythonBuzzword poem generator in Python
Buzzword poem generator in Python
delimitry
 
True stories on the analysis of network activity using Python
True stories on the analysis of network activity using PythonTrue stories on the analysis of network activity using Python
True stories on the analysis of network activity using Python
delimitry
 
Numbers obfuscation in Python
Numbers obfuscation in PythonNumbers obfuscation in Python
Numbers obfuscation in Python
delimitry
 
ITGM #9 - Коварный CodeType, или от segfault'а к работающему коду
ITGM #9 - Коварный CodeType, или от segfault'а к работающему кодуITGM #9 - Коварный CodeType, или от segfault'а к работающему коду
ITGM #9 - Коварный CodeType, или от segfault'а к работающему коду
delimitry
 
Python dictionary : past, present, future
Python dictionary: past, present, futurePython dictionary: past, present, future
Python dictionary : past, present, future
delimitry
 
Python dict: прошлое, настоящее, будущее
Python dict: прошлое, настоящее, будущееPython dict: прошлое, настоящее, будущее
Python dict: прошлое, настоящее, будущее
delimitry
 
Разработка фреймворка на Python для автоматизации тестирования STB боксов
Разработка фреймворка на Python для автоматизации тестирования STB боксовРазработка фреймворка на Python для автоматизации тестирования STB боксов
Разработка фреймворка на Python для автоматизации тестирования STB боксов
delimitry
 
SchoolCTF 2012 - Tpircsavaj
SchoolCTF 2012 - TpircsavajSchoolCTF 2012 - Tpircsavaj
SchoolCTF 2012 - Tpircsavaj
delimitry
 
SchoolCTF 2012 - See Shark
SchoolCTF 2012 - See SharkSchoolCTF 2012 - See Shark
SchoolCTF 2012 - See Shark
delimitry
 
SchoolCTF 2012 - Rings
SchoolCTF 2012 - RingsSchoolCTF 2012 - Rings
SchoolCTF 2012 - Rings
delimitry
 
SchoolCTF 2012 - Bin Pix
SchoolCTF 2012 - Bin PixSchoolCTF 2012 - Bin Pix
SchoolCTF 2012 - Bin Pix
delimitry
 
SchoolCTF 2012 - Acid
SchoolCTF 2012 - AcidSchoolCTF 2012 - Acid
SchoolCTF 2012 - Acid
delimitry
 
Python GC
Python GCPython GC
Python GC
delimitry
 

JIT compilation for CPython

  • 1. JIT compilation for CPython Dmitry Alimov 2019 SPb Python
  • 2. Outline
      - JIT compilation and JIT history
      - My experience with JIT in CPython
      - Python projects that use JIT and projects for JIT
  • 3. What is JIT compilation
  • 11. JIT: just-in-time compilation (aka dynamic translation, run-time compilation)
      - The earliest JIT compiler: LISP, by John McCarthy in 1960
      - Ken Thompson used it in 1968 for regex in the text editor QED
      - LC2
      - Smalltalk
      - Self
      - Popularized by Java, with James Gosling using the term from 1993
      - Just-in-time manufacturing, also known as just-in-time production or the Toyota Production System (TPS)
  • 13. Example
      def fibonacci(n):
          """Returns n-th Fibonacci number"""
          a = 0
          b = 1
          if n < 1:
              return a
          i = 0
          while i < n:
              temp = a
              a = b
              b = temp + b
              i += 1
          return a
      Fibonacci Sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, ...
  • 15. Let’s JIT it
      1) Convert function to machine code at run-time
      2) Execute this machine code
  • 16. Let’s JIT it
      @jit
      def fibonacci(n):
          """Returns n-th Fibonacci number"""
          a = 0
          b = 1
          if n < 1:
              return a
          i = 0
          while i < n:
              temp = a
              a = b
              b = temp + b
              i += 1
          return a
  • 17. Convert function to AST
      import ast
      import inspect
      lines = inspect.getsource(func)
      node = ast.parse(lines)
      visitor = Visitor()
      visitor.visit(node)
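To make the parsing step concrete, here is a minimal sketch that parses the Fibonacci source from a string and lists the statement node types the visitor will encounter. The names `SOURCE` and `statement_types` are illustrative and not from the slides; the source is parsed from a string rather than via `inspect.getsource` so the snippet runs anywhere.

```python
import ast

# Illustrative copy of the function from the slides, held as a string.
SOURCE = '''
def fibonacci(n):
    """Returns n-th Fibonacci number"""
    a = 0
    b = 1
    if n < 1:
        return a
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a
'''

tree = ast.parse(SOURCE)      # Module node
func_def = tree.body[0]       # FunctionDef node for fibonacci

# Names of the top-level statement nodes in the function body.
statement_types = [type(stmt).__name__ for stmt in func_def.body]
print(func_def.name, statement_types)

# The exact dump format differs between Python versions.
print(ast.dump(func_def.body[1]))
```

Each entry in `statement_types` corresponds to a `visit_<NodeType>` method that an `ast.NodeVisitor` subclass can implement.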
  • 18. AST
      Module(body=[
          FunctionDef(name='fibonacci',
              args=arguments(args=[Name(id='n', ctx=Param())], vararg=None, kwarg=None, defaults=[]),
              body=[
                  Expr(value=Str(s='Returns n-th Fibonacci number')),
                  Assign(targets=[Name(id='a', ctx=Store())], value=Num(n=0)),
                  Assign(targets=[Name(id='b', ctx=Store())], value=Num(n=1)),
                  If(test=Compare(left=Name(id='n', ctx=Load()), ops=[Lt()], comparators=[Num(n=1)]),
                      body=[
                          Return(value=Name(id='a', ctx=Load()))
                      ], orelse=[]),
                  Assign(targets=[Name(id='i', ctx=Store())], value=Num(n=0)),
                  While(test=Compare(left=Name(id='i', ctx=Load()), ops=[Lt()], comparators=[Name(id='n', ctx=Load())]),
                      body=[
                          Assign(targets=[Name(id='temp', ctx=Store())], value=Name(id='a', ctx=Load())),
                          Assign(targets=[Name(id='a', ctx=Store())], value=Name(id='b', ctx=Load())),
                          Assign(targets=[Name(id='b', ctx=Store())], value=BinOp(
                              left=Name(id='temp', ctx=Load()), op=Add(), right=Name(id='b', ctx=Load()))),
                          AugAssign(target=Name(id='i', ctx=Store()), op=Add(), value=Num(n=1))
                      ], orelse=[]),
                  Return(value=Name(id='a', ctx=Load()))
              ],
              decorator_list=[Name(id='jit', ctx=Load())])
      ])
  • 21. AST to IL ASM
      class Visitor(ast.NodeVisitor):
          def __init__(self):
              self.ops = []
              ...
          ...
          def visit_Assign(self, node):
              if isinstance(node.value, ast.Num):
                  self.ops.append('MOV <{}>, {}'.format(node.targets[0].id, node.value.n))
              elif isinstance(node.value, ast.Name):
                  self.ops.append('MOV <{}>, <{}>'.format(node.targets[0].id, node.value.id))
              elif isinstance(node.value, ast.BinOp):
                  self.ops.extend(self.visit_BinOp(node.value))
                  self.ops.append('MOV <{}>, <{}>'.format(node.targets[0].id, node.value.left.id))
          ...
      AST input:
          ...
          Assign(
              targets=[Name(id='i', ctx=Store())],
              value=Num(n=0)
          ),
          Assign(
              targets=[Name(id='a', ctx=Store())],
              value=Name(id='b', ctx=Load())
          ),
          ...
      IL ASM output:
          ...
          MOV <i>, 0
          MOV <a>, <b>
          ...
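The visitor above targets the Python 2.7/3.7 AST, where literals are `ast.Num` nodes. As a hedged sketch of the same idea on Python 3.8 and later (where literals parse to `ast.Constant`), the class below handles only assignments. `AssignVisitor` is a hypothetical name, and the inline `ADD` handling stands in for the slide's elided `visit_BinOp`.

```python
import ast


class AssignVisitor(ast.NodeVisitor):
    """Translate simple assignments into the IL ASM strings used in the slides."""

    def __init__(self):
        self.ops = []

    def visit_Assign(self, node):
        target = node.targets[0].id
        value = node.value
        if isinstance(value, ast.Constant):
            # x = 5  ->  MOV <x>, 5
            self.ops.append("MOV <{}>, {}".format(target, value.value))
        elif isinstance(value, ast.Name):
            # x = y  ->  MOV <x>, <y>
            self.ops.append("MOV <{}>, <{}>".format(target, value.id))
        elif isinstance(value, ast.BinOp) and isinstance(value.op, ast.Add):
            # x = l + r  ->  ADD <l>, <r>; MOV <x>, <l>
            # This overwrites the left operand, which is only safe when it is
            # not read again afterwards (true for temp in fibonacci).
            self.ops.append("ADD <{}>, <{}>".format(value.left.id, value.right.id))
            self.ops.append("MOV <{}>, <{}>".format(target, value.left.id))


visitor = AssignVisitor()
visitor.visit(ast.parse("i = 0\na = b\nb = temp + b\n"))
for op in visitor.ops:
    print(op)
```

The default `generic_visit` walks from the `Module` node down to each statement, so only the node types of interest need a `visit_*` method.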
  • 23. IL ASM to ASM
          MOV <a>, 0
          MOV <b>, 1
          CMP <n>, 1
          JNL label0
          RET
      label0:
          MOV <i>, 0
      loop0:
          MOV <temp>, <a>
          MOV <a>, <b>
          ADD <temp>, <b>
          MOV <b>, <temp>
          INC <i>
          CMP <i>, <n>
          JL loop0
          RET
      # for x64 system
      args_registers = ['rdi', 'rsi', 'rdx', ...]
      registers = ['rax', 'rbx', 'rcx', ...]
      # return register: rax
      def fibonacci(n):    # n ⇔ rdi
          ...
          return a         # a ⇔ rax
  • 24. IL ASM to ASM
      IL ASM                   ASM
          MOV <a>, 0               MOV rax, 0
          MOV <b>, 1               MOV rbx, 1
          CMP <n>, 1               CMP rdi, 1
          JNL label0               JNL label0
          RET                      RET
      label0:                  label0:
          MOV <i>, 0               MOV rcx, 0
      loop0:                   loop0:
          MOV <temp>, <a>          MOV rdx, rax
          MOV <a>, <b>             MOV rax, rbx
          ADD <temp>, <b>          ADD rdx, rbx
          MOV <b>, <temp>          MOV rbx, rdx
          INC <i>                  INC rcx
          CMP <i>, <n>             CMP rcx, rdi
          JL loop0                 JL loop0
          RET                      RET
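The mapping from IL placeholders to registers can be sketched as a simple substitution pass. This is an assumption about how such a step might look, not the author's implementation: `assign_registers`, `ARG_REGISTERS`, and `GENERAL_REGISTERS` are hypothetical names, and there is no handling for running out of registers.

```python
import re

# Registers as listed on the slide, truncated to what this example needs.
ARG_REGISTERS = ["rdi", "rsi", "rdx"]
GENERAL_REGISTERS = ["rax", "rbx", "rcx", "rdx"]


def assign_registers(il_lines, arg_names, return_name):
    """Replace <name> placeholders with registers, first come first served."""
    # Arguments arrive in the argument registers; the result leaves in rax.
    mapping = {name: ARG_REGISTERS[i] for i, name in enumerate(arg_names)}
    mapping[return_name] = "rax"
    free = [reg for reg in GENERAL_REGISTERS if reg not in mapping.values()]

    def substitute(match):
        name = match.group(1)
        if name not in mapping:
            mapping[name] = free.pop(0)  # raises IndexError if registers run out
        return mapping[name]

    asm_lines = [re.sub(r"<(\w+)>", substitute, line) for line in il_lines]
    return asm_lines, mapping


IL = """MOV <a>, 0
MOV <b>, 1
CMP <n>, 1
JNL label0
RET
label0:
MOV <i>, 0
loop0:
MOV <temp>, <a>
MOV <a>, <b>
ADD <temp>, <b>
MOV <b>, <temp>
INC <i>
CMP <i>, <n>
JL loop0
RET""".splitlines()

asm_lines, mapping = assign_registers(IL, arg_names=["n"], return_name="a")
print("\n".join(asm_lines))
```

With the Fibonacci IL this reproduces the register choices shown on the slide (a in rax, b in rbx, i in rcx, temp in rdx, n in rdi).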
  • 26. ASM to machine code
      from pwnlib.asm import asm
      code = asm(asm_code, arch='amd64')  # asm_code holds the ASM listing from slide 24
  • 27. ASM to machine code
          MOV rax, 0
          MOV rbx, 1
          CMP rdi, 1
          JNL label0
          RET
      label0:
          MOV rcx, 0
      loop0:
          MOV rdx, rax
          MOV rax, rbx
          ADD rdx, rbx
          MOV rbx, rdx
          INC rcx
          CMP rcx, rdi
          JL loop0
          RET
      Machine code:
          \x48\xc7\xc0\x00\x00\x00\x00
          \x48\xc7\xc3\x01\x00\x00\x00
          \x48\x83\xff\x01\x7d\x01\xc3
          \x48\xc7\xc1\x00\x00\x00\x00
          \x48\x89\xc2\x48\x89\xd8\x48
          \x01\xda\x48\x89\xd3\x48\xff
          \xc1\x48\x39\xf9\x7c\xec\xc3
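One way to see how the label `loop0` ends up in the byte stream: the `JL loop0` instruction is encoded as `7c ec`, a short jump with a signed 8-bit displacement measured from the end of the jump instruction. The sketch below splits the slide's bytes per instruction (the per-instruction comments are my own reading of the encoding) and checks that the displacement lands on `loop0`.

```python
import struct

# The slide's machine code, one instruction per line.
machine_code = (
    b"\x48\xc7\xc0\x00\x00\x00\x00"  # mov rax, 0
    b"\x48\xc7\xc3\x01\x00\x00\x00"  # mov rbx, 1
    b"\x48\x83\xff\x01"              # cmp rdi, 1
    b"\x7d\x01"                      # jnl label0 (skip the 1-byte ret)
    b"\xc3"                          # ret
    b"\x48\xc7\xc1\x00\x00\x00\x00"  # label0: mov rcx, 0
    b"\x48\x89\xc2"                  # loop0: mov rdx, rax
    b"\x48\x89\xd8"                  # mov rax, rbx
    b"\x48\x01\xda"                  # add rdx, rbx
    b"\x48\x89\xd3"                  # mov rbx, rdx
    b"\x48\xff\xc1"                  # inc rcx
    b"\x48\x39\xf9"                  # cmp rcx, rdi
    b"\x7c\xec"                      # jl loop0
    b"\xc3"                          # ret
)

loop0_offset = 7 + 7 + 4 + 2 + 1 + 7   # bytes before the loop0 label
jl_offset = len(machine_code) - 3      # position of the 0x7c opcode

# The displacement is a signed byte relative to the next instruction.
rel8 = struct.unpack("b", machine_code[jl_offset + 1:jl_offset + 2])[0]
jump_target = jl_offset + 2 + rel8

print(len(machine_code), rel8, jump_target, loop0_offset)
```

The loop body is 20 bytes long including the jump itself, which is why the displacement is -20 (0xec as an unsigned byte).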
  • 31. Create function in memory
      1) Allocate memory
      2) Copy machine code to allocated memory
      3) Mark the memory as executable
      Linux: mmap, mprotect
      Windows: VirtualAlloc, VirtualProtect
  • 32. Signatures in C/C++
      Linux:
          void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
          int mprotect(void *addr, size_t len, int prot);
          void *memcpy(void *dest, const void *src, size_t n);
          int munmap(void *addr, size_t length);
      Windows:
          LPVOID VirtualAlloc(LPVOID lpAddress, SIZE_T dwSize, DWORD flAllocationType, DWORD flProtect);
          BOOL VirtualProtect(LPVOID lpAddress, SIZE_T dwSize, DWORD flNewProtect, PDWORD lpflOldProtect);
          void *memcpy(void *dest, const void *src, size_t count);
          BOOL VirtualFree(LPVOID lpAddress, SIZE_T dwSize, DWORD dwFreeType);
  • 33. Create function in memory
      import ctypes
      # Linux
      libc = ctypes.CDLL('libc.so.6')
      libc.mmap
      libc.mprotect
      libc.memcpy
      libc.munmap
      # Windows
      ctypes.windll.kernel32.VirtualAlloc
      ctypes.windll.kernel32.VirtualProtect
      ctypes.cdll.msvcrt.memcpy
      ctypes.windll.kernel32.VirtualFree
  • 34. Create function in memory
      mmap_func = libc.mmap
      # ctypes reads the attribute "argtypes"; a misspelled "argtype" is silently ignored
      mmap_func.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                            ctypes.c_int, ctypes.c_int,
                            ctypes.c_int, ctypes.c_size_t]
      mmap_func.restype = ctypes.c_void_p
      memcpy_func = libc.memcpy
      memcpy_func.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
      # memcpy returns void *, so c_void_p avoids ctypes reading the result as a C string
      memcpy_func.restype = ctypes.c_void_p
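  • 35. Create function in memory
      # protection and mapping flags come from the standard mmap module
      from mmap import PROT_READ, PROT_WRITE, PROT_EXEC, MAP_ANONYMOUS, MAP_PRIVATE
      # bytes literal, so the same code works on Python 2.7 and 3.x
      machine_code = (b'\x48\xc7\xc0\x00\x00\x00\x00\x48\xc7\xc3\x01\x00\x00\x00\x48'
                      b'\x83\xff\x01\x7d\x01\xc3\x48\xc7\xc1\x00\x00\x00\x00\x48\x89\xc2\x48\x89\xd8'
                      b'\x48\x01\xda\x48\x89\xd3\x48\xff\xc1\x48\x39\xf9\x7c\xec\xc3')
      machine_code_size = len(machine_code)
      addr = mmap_func(None, machine_code_size,
                       PROT_READ | PROT_WRITE | PROT_EXEC,
                       MAP_ANONYMOUS | MAP_PRIVATE, -1, 0)
      memcpy_func(addr, machine_code, machine_code_size)
      func = ctypes.CFUNCTYPE(ctypes.c_uint64)(addr)
      func.argtypes = [ctypes.c_uint32]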
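As an alternative sketch of the same three steps, the standard `mmap` module can allocate the executable region without declaring libc prototypes. Everything here is an assumption layered on the slides: `load_native`, `can_run_native`, and `_live_mappings` are hypothetical names, the code only attempts to run on Linux x86-64, and the byte sequence is a variant of the slide's code that keeps `b` in `rsi` instead of `rbx`, because `rbx` is callee-saved under the System V AMD64 calling convention.

```python
import ctypes
import mmap
import platform
import sys

# Variant of the slide's machine code with rsi in place of rbx (my own encoding).
MACHINE_CODE = (
    b"\x48\xc7\xc0\x00\x00\x00\x00"  # mov rax, 0
    b"\x48\xc7\xc6\x01\x00\x00\x00"  # mov rsi, 1
    b"\x48\x83\xff\x01"              # cmp rdi, 1
    b"\x7d\x01"                      # jnl label0
    b"\xc3"                          # ret
    b"\x48\xc7\xc1\x00\x00\x00\x00"  # label0: mov rcx, 0
    b"\x48\x89\xc2"                  # loop0: mov rdx, rax
    b"\x48\x89\xf0"                  # mov rax, rsi
    b"\x48\x01\xf2"                  # add rdx, rsi
    b"\x48\x89\xd6"                  # mov rsi, rdx
    b"\x48\xff\xc1"                  # inc rcx
    b"\x48\x39\xf9"                  # cmp rcx, rdi
    b"\x7c\xec"                      # jl loop0
    b"\xc3"                          # ret
)

_live_mappings = []  # keeps mappings referenced so the code is never unmapped


def can_run_native():
    """True only where the x86-64 System V byte sequence above is valid."""
    return sys.platform.startswith("linux") and platform.machine() in ("x86_64", "AMD64")


def load_native(code):
    """Copy machine code into an executable anonymous mapping and wrap it."""
    buf = mmap.mmap(-1, len(code),
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    buf.write(code)
    _live_mappings.append(buf)
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    # uint64 fibonacci(uint64 n)
    return ctypes.CFUNCTYPE(ctypes.c_uint64, ctypes.c_uint64)(addr)


if can_run_native():
    native_fibonacci = load_native(MACHINE_CODE)
    print(native_fibonacci(10))
```

Mapping memory as writable and executable at once mirrors the slide; a stricter variant would map it read-write, copy the code, and then switch it to read-execute with `mprotect`.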
  • 37. Python 2.7
      for _ in range(1000000):
          fibonacci(n)
      n     No JIT (s)   JIT (s)
      0     0.153        0.882
      10    1.001        0.878
      20    1.805        0.942
      30    2.658        0.955
      60    4.800        0.928
      90    7.117        0.922
      500   50.611       1.251
  • 38. Python 3.7
      for _ in range(1000000):
          fibonacci(n)
      n     No JIT (s)   JIT (s)
      0     0.150        1.079
      10    1.093        0.971
      20    2.206        1.135
      30    3.313        1.204
      60    6.815        1.198
      90    10.458       1.270
      500   63.949       1.652
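A measurement loop like the one in the tables can be reproduced with the standard `timeit` module. This is a sketch of one possible harness, not the author's benchmark script: `benchmark` is a hypothetical helper, and the call count is reduced from 1,000,000 so it finishes quickly.

```python
import timeit


def fibonacci(n):
    """Returns n-th Fibonacci number"""
    a = 0
    b = 1
    if n < 1:
        return a
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a


def benchmark(func, n, calls=10000):
    """Total wall-clock seconds for `calls` invocations of func(n)."""
    return timeit.timeit(lambda: func(n), number=calls)


for n in (0, 10, 90):
    print(n, round(benchmark(fibonacci, n), 4))
```

Passing a JIT-compiled callable as `func` gives the second column; for small `n` the fixed cost of crossing from Python into native code dominates, which matches the n = 0 rows where the JIT version is slower.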
  • 39. Python 2.7 vs 3.7
      fibonacci(n=93) *
          No JIT: 10.524 s
          JIT: 1.185 s
          JIT ~8.5 times faster
          JIT compilation time: ~0.08 s
      VS
      fibonacci(n=93) *
          No JIT: 7.942 s
          JIT: 0.887 s
          JIT ~8.5 times faster
          JIT compilation time: ~0.07 s
      * fibonacci(n=92) = 0x68a3dd8e61eccfbd
        fibonacci(n=93) = 0xa94fad42221f2702
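A plausible reading of the footnote is that n = 93 is the largest input whose result still fits in a 64-bit register: the native code works on fixed-width registers, whereas Python integers grow without bound. The short check below illustrates that, using a compact `fibonacci` of my own rather than the slide's version.

```python
def fibonacci(n):
    """Returns n-th Fibonacci number using Python's arbitrary-precision ints."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


UINT64_MAX = 2**64 - 1
INT64_MAX = 2**63 - 1

print(hex(fibonacci(92)), fibonacci(92) <= INT64_MAX)    # fits in a signed 64-bit value
print(hex(fibonacci(93)), fibonacci(93) <= UINT64_MAX)   # fits only as unsigned 64-bit
print(fibonacci(94) <= UINT64_MAX)                       # no longer fits in 64 bits
```

Beyond n = 93 the machine-code version would wrap around modulo 2**64, so its results would stop matching the pure Python function.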
  • 40. No JIT vs JIT
      No JIT (CPython bytecode):
              0 LOAD_CONST               1 (0)
              3 STORE_FAST               1 (a)
              6 LOAD_CONST               2 (1)
              9 STORE_FAST               2 (b)
             12 LOAD_FAST                0 (n)
             15 LOAD_CONST               2 (1)
             18 COMPARE_OP               0 (<)
             21 POP_JUMP_IF_FALSE       28
             24 LOAD_FAST                1 (a)
             27 RETURN_VALUE
         >>  28 LOAD_CONST               1 (0)
             31 STORE_FAST               3 (i)
             34 SETUP_LOOP              48 (to 85)
         >>  37 LOAD_FAST                3 (i)
             40 LOAD_FAST                0 (n)
             43 COMPARE_OP               0 (<)
             46 POP_JUMP_IF_FALSE       84
             49 LOAD_FAST                1 (a)
             52 STORE_FAST               4 (temp)
             55 LOAD_FAST                2 (b)
             58 STORE_FAST               1 (a)
             61 LOAD_FAST                4 (temp)
             64 LOAD_FAST                2 (b)
             67 BINARY_ADD
             68 STORE_FAST               2 (b)
             71 LOAD_FAST                3 (i)
             74 LOAD_CONST               2 (1)
             77 INPLACE_ADD
             78 STORE_FAST               3 (i)
             81 JUMP_ABSOLUTE           37
         >>  84 POP_BLOCK
         >>  85 LOAD_FAST                1 (a)
             88 RETURN_VALUE
      JIT (machine instructions):
          MOV rax, 0
          MOV rbx, 1
          CMP rdi, 1
          JNL label0
          RET
      label0:
          MOV rcx, 0
      loop0:
          MOV rdx, rax
          MOV rax, rbx
          ADD rdx, rbx
          MOV rbx, rdx
          INC rcx
          CMP rcx, rdi
          JL loop0
          RET
      33 (VM opcodes) vs 14 (real machine instructions)
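The bytecode side of this comparison can be reproduced with the standard `dis` module. The sketch below counts the instructions for the same function; the exact opcodes and their number depend on the CPython version, so no specific count is assumed here.

```python
import dis


def fibonacci(n):
    """Returns n-th Fibonacci number"""
    a = 0
    b = 1
    if n < 1:
        return a
    i = 0
    while i < n:
        temp = a
        a = b
        b = temp + b
        i += 1
    return a


instructions = list(dis.get_instructions(fibonacci))
print(len(instructions), "bytecode instructions on this interpreter")

# Print offset and opcode name for the first few instructions.
for instr in instructions[:5]:
    print(instr.offset, instr.opname, instr.argrepr)
```

Each of these bytecode instructions is dispatched by the interpreter loop and operates on boxed Python objects, so the gap in work per iteration is larger than the raw instruction counts suggest.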
  • 42. Numba
      Numba makes Python code fast
      Numba is an open source JIT compiler that translates a subset of Python and NumPy code into fast machine code
      - Parallelization
      - SIMD Vectorization
      - GPU Acceleration
  • 43. Numba
      from numba import jit
      import numpy as np

      @jit(nopython=True)  # Set "nopython" mode for best performance, equivalent to @njit
      def go_fast(a):  # Function is compiled to machine code when called the first time
          trace = 0
          for i in range(a.shape[0]):    # Numba likes loops
              trace += np.tanh(a[i, i])  # Numba likes NumPy functions
          return a + trace               # Numba likes NumPy broadcasting

      from numba import cuda  # needed for @cuda.jit below

      @cuda.jit
      def matmul(A, B, C):
          """Perform square matrix multiplication of C = A * B
          """
          i, j = cuda.grid(2)
          if i < C.shape[0] and j < C.shape[1]:
              tmp = 0.
              for k in range(A.shape[1]):
                  tmp += A[i, k] * B[k, j]
              C[i, j] = tmp
  • 44. LLVM
      - LLVM: compiler infrastructure project
      - Tutorial “Building a JIT: Starting out with KaleidoscopeJIT”
      - LLVMPy: Python bindings for LLVM
      - LLVMLite, a project by the Numba team: lightweight LLVM Python binding for writing JIT compilers
  • 45. PeachPy
      x86-64 assembler embedded in Python
      Portable Efficient Assembly Code-generator in Higher-level Python
      from peachpy.x86_64 import *
      ADD(eax, 5).encode()
      # bytearray(b'\x83\xc0\x05')
      MOVAPS(xmm0, xmm1).encode_options()
      # [bytearray(b'\x0f(\xc1'), bytearray(b'\x0f)\xc8')]
      VPSLLVD(ymm0, ymm1, [rsi + 8]).encode_length_options()
      # {6: bytearray(b'\xc4\xe2uGF\x08'),
      #  7: bytearray(b'\xc4\xe2uGD&\x08'),
      #  9: bytearray(b'\xc4\xe2uG\x86\x08\x00\x00\x00')}
  • 46. PyPy
      - PyPy is a fast, compliant alternative implementation of the Python language
      - Python programs often run faster on PyPy thanks to its Just-in-Time compiler
      - PyPy works best when executing long-running programs where a significant fraction of the time is spent executing Python code
      “If you want your code to run faster, you should probably just use PyPy” (Guido van Rossum, creator of Python)
  • 47. Other projects
      - Pyjion: a JIT for Python based upon CoreCLR
      - Pyston: built using LLVM and modern JIT techniques
      - Psyco: extension module which can greatly speed up the execution of code. The first just-in-time compiler for Python, now unmaintained and dead
      - Unladen Swallow: an attempt to make LLVM a JIT compiler for CPython
  • 48. References
      1. https://en.wikipedia.org/wiki/Just-in-time_compilation
      2. John Aycock: A Brief History of Just-In-Time. ACM Computing Surveys (CSUR), volume 35, issue 2, pages 97-113, June 2003, DOI: 10.1145/857076.857077
      3. https://eli.thegreenplace.net/2013/11/05/how-to-jit-an-introduction
      4. https://medium.com/starschema-blog/jit-fast-supercharge-tensor-processing-in-python-with-jit-compilation-47598de6ee96
      5. https://github.com/Gallopsled/pwntools
      6. https://numba.pydata.org
      7. https://llvm.org/docs/tutorial/BuildingAJIT1.html
      8. https://llvmlite.readthedocs.io/en/latest/
      9. http://www.llvmpy.org
      10. https://github.com/Maratyszcza/PeachPy
      11. https://github.com/microsoft/Pyjion
      12. https://blog.pyston.org