Optimizing tail recursion
in Python using bytecode
manipulations.
Allison Kaptur
Paul Tagliamonte
Liuda Nikolaeva
(all errors are my own)
Problem:
Python has a limit on recursion depth:
    def factorial(n, accum):
        if n <= 1:
            return accum
        else:
            return factorial(n-1, accum*n)
    >>> factorial(1000, 1)
    RuntimeError: maximum recursion depth exceeded
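The limit can be observed directly. The sketch below is not from the slides: the helper name `overflows` and the default `accum=1` are illustrative assumptions. On Python 3.5+ the error is reported as RecursionError, which is a subclass of RuntimeError.

```python
import sys

def factorial(n, accum=1):
    """Tail-recursive factorial as on the slide (accum defaults to 1 here)."""
    if n <= 1:
        return accum
    return factorial(n - 1, accum * n)

def overflows(depth):
    """Return True if recursing `depth` levels deep hits the interpreter's limit."""
    try:
        factorial(depth)
    except RuntimeError:  # RecursionError (Python 3.5+) is a subclass of RuntimeError
        return True
    return False

limit = sys.getrecursionlimit()  # commonly 1000 unless it was changed
print(overflows(limit + 100))    # deeper than the limit allows
print(overflows(10))             # a shallow call is fine
```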
Challenge:
• Optimize recursive function calls so that they
don’t create new frames, thus avoiding stack
overflow.
• What we want: eliminate the recursive call;
instead, reset the variables and jump to the
beginning of the function.
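Written by hand, the target behaviour looks roughly like the loop below. This is only a sketch of what the rewritten bytecode should be equivalent to, not the authors' code; `factorial_loop` is a hypothetical name.

```python
def factorial_loop(n, accum=1):
    """Same result as the tail-recursive version, but in a single frame."""
    while True:                      # "jump to the beginning of the function"
        if n <= 1:
            return accum
        n, accum = n - 1, accum * n  # "reset the variables" to the call's arguments

print(factorial_loop(5))  # 120
```

Because no new frame is created per step, the depth limit no longer applies.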
Problem:
How do you change the insides of a function?
Solution:
Bytecode! (obviously)
Quick intro to bytecode.
    def f(n, accum):
        if n <= 1:
            return accum
        else:
            return f(n-1, accum*n)
    >>> f.__code__.co_code
    '|\x00\x00d\x01\x00k\x01\x00r\x10\x00|\x01\x00St\x00\x00|\x00\x00d\x01\x00\x18|\x01\x00|\x00\x00\x14\x83\x02\x00Sd\x00\x00S'
    >>> print [ord(b) for b in f.__code__.co_code]
    [124, 0, 0, 100, 1, 0, 107, 1, 0, 114, 16, 0, 124, 1, 0, 83,
     116, 0, 0, 124, 0, 0, 100, 1, 0, 24, 124, 1, 0, 124, 0, 0,
     20, 131, 2, 0, 83, 100, 0, 0, 83]
    def f(n, accum):
        if n <= 1:
            return accum
        else:
            return f(n-1, accum*n)
    >>> import dis
    >>> dis.dis(f)
      2           0 LOAD_FAST                0 (n)
                  3 LOAD_CONST               1 (1)
                  6 COMPARE_OP               1 (<=)
                  9 POP_JUMP_IF_FALSE       16
      3          12 LOAD_FAST                1 (accum)
                 15 RETURN_VALUE
      5     >>   16 LOAD_GLOBAL              0 (f)
                 19 LOAD_FAST                0 (n)
                 22 LOAD_CONST               1 (1)
                 25 BINARY_SUBTRACT
                 26 LOAD_FAST                1 (accum)
                 29 LOAD_FAST                0 (n)
                 32 BINARY_MULTIPLY
                 33 CALL_FUNCTION            2
                 36 RETURN_VALUE
                 37 LOAD_CONST               0 (None)
                 40 RETURN_VALUE
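The listing above comes from a Python 2 session. On Python 3 the opcode names, offsets and encoding differ between versions, so the sketch below avoids hard-coding them and uses `dis.get_instructions` (Python 3.4+) to decode the raw bytes. The helper `loads_itself` is a hypothetical name used only for illustration.

```python
import dis

def f(n, accum):
    if n <= 1:
        return accum
    else:
        return f(n - 1, accum * n)

raw = f.__code__.co_code  # bytes on Python 3
print(len(raw), "bytes of bytecode")

# Decode the raw bytes into named instructions; output varies by Python version.
for instr in dis.get_instructions(f):
    print(instr.offset, instr.opname, instr.argrepr)

def loads_itself(func):
    """True if func's bytecode loads func's own global name (a hint of recursion)."""
    return any(
        instr.opname == "LOAD_GLOBAL" and instr.argval == func.__name__
        for instr in dis.get_instructions(func)
    )

print(loads_itself(f))
```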
Before optimization:
         0 LOAD_FAST                0 (n)
         3 LOAD_CONST               1 (1)
         6 COMPARE_OP               1 (<=)
         9 POP_JUMP_IF_FALSE       16
        12 LOAD_FAST                1 (accum)
        15 RETURN_VALUE
    >>  16 LOAD_GLOBAL              0 (f)
        19 LOAD_FAST                0 (n)
        22 LOAD_CONST               1 (1)
        25 BINARY_SUBTRACT
        26 LOAD_FAST                1 (accum)
        29 LOAD_FAST                0 (n)
        32 BINARY_MULTIPLY
        33 CALL_FUNCTION            2
        36 RETURN_VALUE
After optimization:
    >>   0 LOAD_FAST                0 (n)
         3 LOAD_CONST               1 (1)
         6 COMPARE_OP               1 (<=)
         9 POP_JUMP_IF_FALSE       16
        12 LOAD_FAST                1 (accum)
        15 RETURN_VALUE
    >>  16 LOAD_FAST                0 (n)
        19 LOAD_CONST               1 (1)
        22 BINARY_SUBTRACT
        23 LOAD_FAST                1 (accum)
        26 LOAD_FAST                0 (n)
        29 BINARY_MULTIPLY
        30 STORE_FAST               1 (accum)
        33 STORE_FAST               0 (n)
        36 JUMP_ABSOLUTE            0
        39 RETURN_VALUE
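Simplified algorithm (pseudocode).
    def recursion_optimizer(f):
        new_bytecode = []
        # pseudocode: instructions() decodes co_code into opcode + argument
        for instr in instructions(f.__code__.co_code):
            if instr == 'LOAD_GLOBAL f':
                continue  # get rid of this instruction
            elif instr == 'CALL_FUNCTION':
                # replace it with resetting variables and jumping to 0
                for arg in reversed(args):  # last argument is on top of the stack
                    new_bytecode.append(store_new_val(arg))
                new_bytecode.append(jump_to_0)
            else:  # regular instruction
                new_bytecode.append(instr)
        # co_code is read-only: rebuild the code object around new_bytecode
        f.__code__ = code_with(f.__code__, new_bytecode)  # pseudocode helper
        return f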
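One practical detail: attributes of a code object are read-only, so the last step has to build a new code object and rebind it to the function. The sketch below illustrates only that mechanism, on a harmless constant swap rather than on real bytecode rewriting. It assumes Python 3.8+ for `CodeType.replace`, and `greet` is a made-up example function.

```python
def greet():
    return "hello"

code = greet.__code__

# Assigning to a code object's attribute raises AttributeError.
try:
    code.co_code = b""
    writable = True
except AttributeError:
    writable = False

# Build a modified copy instead, then rebind it to the function.
new_consts = tuple("goodbye" if c == "hello" else c for c in code.co_consts)
greet.__code__ = code.replace(co_consts=new_consts)

print(writable)
print(greet())
```

A real optimizer would pass a rewritten `co_code` (and keep jump targets consistent) in the same way.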
Not only does it work, it works FASTER
than the original function:
• Timed 10000 calls to fact(450).
Original fact: 1.7009999752
Optimized fact: 1.6970000267
• And faster than other ways of optimizing this.
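A measurement of this kind could be reproduced along the following lines. This is a sketch, not the authors' benchmark: `fact_loop` is a hand-written stand-in for the bytecode-optimized function, `time_both` is a hypothetical helper, and the numbers will vary by machine and Python version.

```python
import timeit

def fact(n, accum=1):
    if n <= 1:
        return accum
    return fact(n - 1, accum * n)

def fact_loop(n, accum=1):
    # Stand-in for the optimized function: same arithmetic, no recursive calls.
    while n > 1:
        n, accum = n - 1, accum * n
    return accum

def time_both(calls=10000, n=450):
    """Return (recursive_seconds, loop_seconds) for `calls` evaluations of n!."""
    recursive = timeit.timeit(lambda: fact(n), number=calls)
    looped = timeit.timeit(lambda: fact_loop(n), number=calls)
    return recursive, looped

recursive, looped = time_both(calls=1000)
print("Original fact:  %.4f s" % recursive)
print("Loop stand-in:  %.4f s" % looped)
```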
Here is the most interesting case so far:
If our function calls another function…
    def sq(x): return x*x

    @tailbytes_v1
    def sum_squares(n, accum):
        if n < 1:
            return accum
        else:
            return sum_squares(n-1, accum+sq(n))
• Our initial algorithm removed every function call, not
only the recursive ones, so this would break.
How do you battle this?
• We need to keep track of function calls and
remove only the recursive calls.
• Unfortunately, bytecode doesn’t know which
function it’s calling: it just calls whatever is on
the stack:
29 CALL_FUNCTION 2
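This can be checked with `dis`: the call instructions carry only an argument count, never the callee's name. The sketch below assumes Python 3.6+; the opcode is named CALL_FUNCTION on older versions and CALL on newer ones, so it matches on the common prefix. `call_instructions` is a hypothetical helper.

```python
import dis

def sq(x):
    return x * x

def sum_squares(n, accum):
    if n < 1:
        return accum
    else:
        return sum_squares(n - 1, accum + sq(n))

def call_instructions(func):
    """Return (opname, argument) for every call instruction in func's bytecode."""
    return [
        (instr.opname, instr.arg)
        for instr in dis.get_instructions(func)
        if instr.opname.startswith("CALL")
    ]

# Both calls look alike apart from the argument count.
for opname, arg in call_instructions(sum_squares):
    print(opname, arg)
```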
So we just need to keep track of the
stack…
• When we hit ‘LOAD_GLOBAL self’ (a load of the function's
own name), we start keeping track of the stack size
(stack_size = 0).
• From then on, we update the stack size with every
instruction.
• Once stack_size is back to 0, the current instruction is
the recursive call, so we remove it.
• This lets us keep calls to other functions (e.g., sq).
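The stack-tracking idea can be sketched with `dis.stack_effect`. This is an illustration under stated assumptions, not the authors' implementation. It assumes Python 3.6+ and straight-line code between the load and the call. It counts the items pushed by the LOAD_GLOBAL itself, so the recursive call is the one that leaves exactly one item (the result) on the tracked stack. `find_recursive_calls` is a hypothetical name.

```python
import dis

def sq(x):
    return x * x

def sum_squares(n, accum):
    if n < 1:
        return accum
    else:
        return sum_squares(n - 1, accum + sq(n))

def find_recursive_calls(func):
    """Return offsets of call instructions that invoke func through its own name."""
    found = []
    depth = None  # None: not tracking; otherwise items pushed since the self-load
    for instr in dis.get_instructions(func):
        if depth is None:
            if instr.opname == "LOAD_GLOBAL" and instr.argval == func.__name__:
                depth = dis.stack_effect(instr.opcode, instr.arg)
            continue
        depth += dis.stack_effect(instr.opcode, instr.arg)
        # Only the call that consumes the loaded function leaves just its result.
        if instr.opname.startswith("CALL") and depth == 1:
            found.append(instr.offset)
            depth = None
    return found

print(find_recursive_calls(sum_squares))  # one offset; the call to sq is skipped
```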
Road ahead:
• Make it harder to break.
• Translate “normal” (non-tail) recursion into
tail-recursion (possibly with ASTs)
• Handle mutual recursion
…And some crazy ideas:
https://github.com/lohmataja/recursion
Or: http://tinyurl.com/tailbytes
Liuda Nikolaeva
