Have you ever wondered what happens to all the precious RAM after running your 'simple' CPython code? Prepare yourself for a short introduction to CPython memory management! This presentation will try to answer some memory related questions you always wondered about. It will also discuss basic memory profiling tools and techniques.
Everything You Always Wanted to Know About Memory in Python - But Were Afraid...Piotr Przymus
Everything You Always Wanted to Know About Memory in Python But Were Afraid to Ask
1. Introduction Basic stuff Notes on memory model Memory profiling tools Summary References
Piotr Przymus
Nicolaus Copernicus University
EuroPython 2014, Berlin
P. Przymus 1/31
About Me
Piotr Przymus
PhD student / Research Assistant at Nicolaus Copernicus University.
Interests: databases, GPGPU computing, data mining.
8 years of Python experience.
Some of my Python projects:
Parts of the trading platform at turbineam.com (backtesting, trading algorithms).
Mussel bio-monitoring analysis and data mining software.
Simulator of a heterogeneous processing environment for evaluating database query scheduling algorithms.
Size of objects
Table: Size of different types in bytes
Type                               32 bit   64 bit   Variable part
int (py-2.7)                       12       24
long (py-2.7) / int (py-3.3)       14       30       + 2 · number of digits
float                              16       24
complex                            24       32
str (py-2.7) / bytes (py-3.3)      24       40       + 2 · length
unicode (py-2.7) / str (py-3.3)    28       52       + (2 or 4) · length
tuple                              24       64       + 4 · length (32 bit), + 8 · length (64 bit)
DIY – check size of objects
sys.getsizeof(obj)
From documentation
Since Python 2.6
Return the size of an object in bytes. The object can be any type.
All built-in objects will return correct results.
May not be true for third-party extensions, as it is implementation specific.
Calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
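A minimal sketch of the DIY check described above, written for Python 3. The helper name show_sizes is hypothetical, and the reported numbers depend on the interpreter version and platform, so none are listed here.

```python
import sys

def show_sizes(*objects):
    """Return (type name, shallow size in bytes) for each object."""
    return [(type(obj).__name__, sys.getsizeof(obj)) for obj in objects]

# Shallow sizes only: a container's size excludes the objects it refers to.
for name, size in show_sizes(0, 2**100, 1.0, 1j, b"", "", (), (1, 2, 3)):
    print(f"{name:>8}: {size} bytes")
```

Comparing the empty tuple with the three-element tuple shows the per-element pointer cost listed in the table.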
Objects interning – fun example
    a = [i % 257 for i in xrange(2**20)]
Listing 1: List of interned integers
    b = [1024 + i % 257 for i in xrange(2**20)]
Listing 2: List of integers
Is there any allocation difference between Listing 1 and Listing 2?
Results measured using psutil
Listing 1 – (resident=15.1M, virtual=2.3G)
Listing 2 – (resident=39.5M, virtual=2.4G)
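One way to see where the difference comes from without an external tool is to count distinct object identities in each list. This is a sketch for Python 3 (range instead of xrange) with a smaller list than in the listings; the helper distinct_objects is hypothetical, and the behaviour is CPython specific.

```python
N = 2**16  # smaller than the 2**20 used in the listings, to keep the run short

a = [i % 257 for i in range(N)]          # values 0..256
b = [1024 + i % 257 for i in range(N)]   # values 1024..1280

def distinct_objects(items):
    """Count distinct object identities (not distinct values)."""
    return len({id(item) for item in items})

print("distinct values in a: ", len(set(a)))
print("distinct objects in a:", distinct_objects(a))
print("distinct values in b: ", len(set(b)))
print("distinct objects in b:", distinct_objects(b))
```

Both lists hold the same number of distinct values. On CPython the first list is expected to reuse a small set of shared integer objects, while the second typically holds a separate object per element.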
Objects interning – explained
Objects and variables – general rule
Objects are allocated on assignment
Variables just point to objects (i.e. they do not hold the memory)
Interning of Objects
This is an exception to the general rule.
Python implementation specific (examples from CPython).
Frequently used objects are preallocated and shared instead of paying for a costly new allocation.
This is mainly a performance optimization.
    >>> a = 0
    >>> b = 0
    >>> a is b, a == b
    (True, True)
Listing 5: Interning of Objects
    >>> a = 1024
    >>> b = 1024
    >>> a is b, a == b
    (False, True)
Listing 6: Objects allocation
Objects interning – behind the scenes
Warning
This is Python implementation dependent.
This may change in the future.
This is not documented because of the above reasons.
For reference consult the source code.
CPython 2.7 - 3.4
Single instances for:
int – in range [−5, 257)
str / unicode – empty string and all length=1 strings
unicode / str – empty string and all length=1 strings for Latin-1
tuple – empty tuple
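The integer range above can be probed at run time. The sketch below is an assumption-laden experiment, not documented behaviour: the helper is_shared is hypothetical, and it builds each value twice from a string so that the compiler cannot share a constant.

```python
def is_shared(n):
    """True if two independently built ints with value n are the same object."""
    first = int(str(n))    # built at run time, so the compiler cannot share a constant
    second = int(str(n))
    return first is second

shared = [n for n in range(-10, 300) if is_shared(n)]
if shared:
    print("shared ints from", shared[0], "to", shared[-1])
else:
    print("no shared ints found on this implementation")
```

On CPython the printed bounds are expected to match the range quoted above, but this may differ on other implementations or future versions.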
String interning – example
    >>> # Python 2 session; in Python 3 use sys.intern and range
    >>> a, b = "strin", "string"
    >>> a + 'g' is b                   # returns False
    >>> intern(a + 'g') is intern(b)   # returns True
    >>> a = ["spam %d" % (i % 257)
    ...      for i in xrange(2**20)]
    >>> # memory usage (resident=57.6M, virtual=2.4G)
    >>> a = [intern("spam %d" % (i % 257))
    ...      for i in xrange(2**20)]
    >>> # memory usage (resident=14.9M, virtual=2.3G)
Listing 7: String interning
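A Python 3 sketch of the same experiment, using sys.intern and counting object identities instead of measuring process memory. The list size is reduced and the variable names are illustrative only.

```python
import sys

a, b = "strin", "string"
joined = a + "g"            # built at run time: equal to b, usually a separate object

print(joined == b)          # value comparison
print(joined is b)          # identity comparison, implementation dependent
print(sys.intern(joined) is sys.intern(b))  # interned copies are one object

plain = ["spam %d" % (i % 257) for i in range(2**12)]
interned = [sys.intern("spam %d" % (i % 257)) for i in range(2**12)]

print(len({id(s) for s in plain}), "distinct string objects without interning")
print(len({id(s) for s in interned}), "distinct string objects with interning")
```

With interning, the list holds one object per distinct value, which is where the lower resident memory in Listing 7 comes from.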
String interning – explained
String interning definition
String interning is a method of storing only one copy of each distinct string value, which must be immutable.
intern (py-2.x) / sys.intern (py-3.x)
From the CPython documentation:
Enter string in the table of "interned" strings.
Return the interned string (string or string copy).
Useful to gain a little performance on dictionary lookup (key comparisons after hashing can be done by a pointer compare instead of a string compare).
Names used in programs are automatically interned.
Dictionaries used to hold module, class or instance attributes have interned keys.
Mutable Containers Memory Allocation Strategy
Plan for growth and shrinkage
Slightly overallocate the memory needed by the container.
Leave room for growth.
Shrink when the overallocation threshold is reached.
Reduce the number of expensive function calls:
realloc()
memcpy()
Use optimal layout.
Lists, Sets, Dictionaries
List allocation – example
Figure: List growth example
List allocation strategy
Represented as fixed-length array of pointers.
Overallocation for list growth (by append)
List size growth: 4, 8, 16, 25, 35, 46, ...
For large lists, less than 12.5%.
Due to the memory actions involved, operations:
at the end of the list are cheap (realloc is rare),
in the middle or at the beginning require a memory copy or shift!
Note that for lists of 1, 2 or 5 elements, space is wasted.
List allocation size:
32 bits – 32 + (4 * length)
64 bits – 72 + (8 * length)
Shrinking only when list size < 1/2 of allocated space.
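The growth pattern can be observed by appending to a list and recording each point where sys.getsizeof changes. This is a sketch; the helper growth_points is hypothetical, and the exact steps depend on the CPython version and platform.

```python
import sys

def growth_points(n_appends):
    """Return (length, size in bytes) each time getsizeof changes during appends."""
    items = []
    last_size = sys.getsizeof(items)
    points = [(0, last_size)]
    for i in range(n_appends):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:      # a reallocation changed the allocated capacity
            points.append((len(items), size))
            last_size = size
    return points

for length, size in growth_points(64):
    print(f"len={length:3d} size={size} bytes")
```

Most appends leave the size unchanged, which is the overallocation at work: only a few of them trigger a reallocation.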
Overallocation of dictionaries/sets
Represented as fixed-length hash tables.
Overallocation for dict/sets – when 2/3 of capacity is reached.
if number of elements < 50000: quadruple the capacity
else: double the capacity
    // dict growth strategy
    (mp->ma_used > 50000 ? 2 : 4) * mp->ma_used;
    // set growth strategy
    so->used > 50000 ? so->used * 2 : so->used * 4;
Dict/Set growth/shrink code
    for (newsize = PyDict_MINSIZE;
         newsize <= minused && newsize > 0;
         newsize <<= 1);
Shrinkage happens if the dictionary/set fill (real and dummy elements) is much larger than the number of used elements (real elements), i.e. a lot of keys have been deleted.
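Resize points of a dict or set can be watched from Python in a similar way. The sketch below uses a hypothetical helper, resize_points; the growth factors it reveals belong to whichever CPython version runs it and may differ from the C excerpts above.

```python
import sys

def resize_points(container, add, n_items):
    """Return (item count, size in bytes) each time the container's size changes."""
    last_size = sys.getsizeof(container)
    points = [(0, last_size)]
    for item in range(n_items):
        add(container, item)
        size = sys.getsizeof(container)
        if size != last_size:          # the hash table was reallocated
            points.append((len(container), size))
            last_size = size
    return points

dict_points = resize_points({}, lambda d, k: d.__setitem__(k, None), 100_000)
set_points = resize_points(set(), lambda s, k: s.add(k), 100_000)

for label, points in (("dict", dict_points), ("set", set_points)):
    print(label, points)
```

The ratio between consecutive sizes gives a rough view of the growth factor at each resize.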
Various data representation
    # Fields: field1, field2, field3, ..., field8
    # Data: "foo 1", "foo 2", "foo 3", ..., "foo 8"
    class OldStyleClass:  # only py-2.x
        ...
    class NewStyleClass(object):  # default for py-3.x
        ...
    class NewStyleClassSlots(object):
        __slots__ = ('field1', 'field2', ...)
        ...
    import collections as c
    NamedTuple = c.namedtuple('nt', ['field1', ...])

    TupleData = ('value1', 'value2', ...)
    ListaData = ['value1', 'value2', ...]
    DictData = {'field1': 'value1', 'field2': 'value2', ...}
Listing 8: Various data representation
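A runnable Python 3 sketch of this comparison, using the standard tracemalloc module rather than the process-level measurement used in the talk. The class names RegularClass and SlotsClass and the helper allocated_bytes are hypothetical stand-ins, and old-style classes are omitted because they do not exist in Python 3.

```python
import collections
import tracemalloc

FIELDS = tuple(f"field{i}" for i in range(1, 9))

class RegularClass:
    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)

class SlotsClass:
    __slots__ = FIELDS

    def __init__(self, *values):
        for name, value in zip(FIELDS, values):
            setattr(self, name, value)

NamedTuple = collections.namedtuple("nt", FIELDS)

def allocated_bytes(factory, count=10_000):
    """Bytes still allocated after building `count` objects with 8 fields each."""
    values = [f"foo {i}" for i in range(1, 9)]  # created before tracing, so not counted
    tracemalloc.start()
    objects = [factory(*values) for _ in range(count)]
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del objects
    return current

factories = {
    "RegularClass": RegularClass,
    "SlotsClass": SlotsClass,
    "NamedTuple": NamedTuple,
    "TupleData": lambda *v: tuple(v),
    "ListaData": lambda *v: list(v),
    "DictData": lambda *v: dict(zip(FIELDS, v)),
}
for name, factory in factories.items():
    print(f"{name:>12}: {allocated_bytes(factory) / 2**20:.2f} MiB")
```

The field values are shared between all objects, so the figures reflect the cost of each representation rather than the cost of the strings.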
Various data representation – allocated memory
Figure: Allocated memory after creating 100000 objects with 8 fields each (bar chart, axis from 0 MB to 150 MB; series: Python 2.x and Python 3.x; bars: OldStyleClass, NewStyleClass, DictData, NamedTuple, TupleData, ListaData, NewStyleClassWithSlots)
Notes on garbage collector, reference count and cycles
Python garbage collector
Uses reference counting.
Offers cycle detection.
Objects garbage-collected when count goes to 0.
Reference increment, e.g.: object creation, additional aliases, passing the object to a function.
Reference decrement, e.g.: a local reference goes out of scope, an alias is destroyed, an alias is reassigned.
Warning – from documentation
Objects that have __del__() methods and are part of a reference cycle cause the entire reference cycle to be uncollectable!
Python doesn't collect such cycles automatically.
It is not possible for Python to guess a safe order in which to run the __del__() methods.
This applies to CPython before 3.4; since Python 3.4 (PEP 442), cycles containing objects with __del__() methods can be collected.
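A short Python 3 sketch of both mechanisms: reference counting observed through sys.getrefcount, and the cycle detector observed through a weak reference. The Node class is a hypothetical placeholder with no __del__ method.

```python
import gc
import sys
import weakref

class Node:
    """Plain object that can take part in a reference cycle."""

obj = Node()
before = sys.getrefcount(obj)
alias = obj                       # an additional alias increments the count
after = sys.getrefcount(obj)
print("refcount grew by", after - before)

# Build a two-object cycle, then drop every outside reference to it.
left, right = Node(), Node()
left.other, right.other = right, left
probe = weakref.ref(left)         # observes the object without keeping it alive
del left, right

print("alive before gc.collect():", probe() is not None)
gc.collect()                      # the cycle detector reclaims the unreachable pair
print("alive after gc.collect():", probe() is not None)
```

Reference counting alone cannot free the pair, because each object still holds a reference to the other; the explicit gc.collect() call makes the cycle detection step visible.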
Tools
psutil
memory_profiler
objgraph
Meliae (could be combined with runsnakerun)
Heapy
Tools – psutil
psutil – A cross-platform process and system utilities module for Python.
    import os
    import psutil
    ...
    p = psutil.Process(os.getpid())   # handle for the current process
    pinfo = p.as_dict()               # snapshot of the process attributes
    ...
    print(pinfo['memory_percent'],
          pinfo['memory_info'].rss, pinfo['memory_info'].vms)
Listing 9: Reading process memory usage with psutil
Tools – memory_profiler
memory_profiler – a module for monitoring the memory usage of a Python program.
Recommended dependency: psutil.
May work as:
Line-by-line profiler.
Memory usage monitoring (memory in time).
Debugger trigger – setting debugger breakpoints.
memory_profiler – Line-by-line profiler
Preparation
To track particular functions, use the @profile decorator.
Running
python -m memory_profiler
Line #    Mem usage    Increment   Line Contents
================================================
    45    9.512 MiB    0.000 MiB   @profile
    46                             def create_lot_of_stuff(times=10000, cl=OldStyleClass):
    47    9.516 MiB    0.004 MiB       ret = []
    48    9.516 MiB    0.000 MiB       t = "foo %d"
    49  156.449 MiB  146.934 MiB       for i in xrange(times):
    50  156.445 MiB   -0.004 MiB           l = [t % (j + i%8) for j in xrange(8)]
    51  156.449 MiB    0.004 MiB           c = cl(*l)
    52  156.449 MiB    0.000 MiB           ret.append(c)
    53  156.449 MiB    0.000 MiB       return ret
Listing 10: Results
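A minimal sketch of the preparation step follows, assuming memory_profiler is installed. The function is a simplified, hypothetical variant of the one profiled above, written for Python 3; the no-op fallback decorator is an addition for illustration so that the script still runs when the package is missing.

```python
try:
    # Importing the decorator explicitly lets the script run with plain
    # "python script.py" as well as with "python -m memory_profiler script.py".
    from memory_profiler import profile
except ImportError:
    # Fallback (illustration only): a decorator that leaves the function unchanged.
    def profile(func):
        return func


@profile
def create_lot_of_stuff(times=10000):
    ret = []
    t = "foo %d"
    for i in range(times):
        # Eight formatted strings per entry, as in the profiled listing.
        ret.append([t % (j + i % 8) for j in range(8)])
    return ret


if __name__ == "__main__":
    stuff = create_lot_of_stuff()
    print(len(stuff))
```

With the package available, calling the decorated function prints a per-line memory report in the same layout as the results above.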
memory_profiler – memory usage monitoring
Preparation
To track particular functions, use the @profile decorator.
Running and plotting
mprof run --python python uniwerse.py -f 100 100 -s 100 100 10
mprof plot
Figure: Results
memory_profiler – Debugger trigger
eror@eror-laptop:~$ python -m memory_profiler --pdb-mmem=10 uniwerse.py -s 100 100 10
Current memory 20.80 MiB exceeded the maximum of 10.00 MiB
Stepping into the debugger
> /home/eror/uniwerse.py(52)connect()
-> self.adj.append(n)
(Pdb)
Listing 11: Debugger trigger – setting debugger breakpoints.
Tools – objgraph
objgraph – draws Python object reference graphs with graphviz.
import objgraph
x = []
y = [x, [x], dict(x=x)]
objgraph.show_refs([y], filename='sample-graph.png')
objgraph.show_backrefs([x], filename='sample-backref-graph.png')
Listing 12: Tutorial example
Figure: Reference graph
Figure: Back reference graph
Tools – Heapy/Meliae
Heapy
The heap analysis toolset. It can be used to find information about the
objects in the heap and display the information in various ways.
Part of "Guppy-PE – A Python Programming Environment".
Meliae
Python Memory Usage Analyzer
"This project is similar to heapy (in the 'guppy' project), in its attempt to understand how memory has been allocated."
runsnakerun GUI support.
Tools – Heapy
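# Python 2 syntax, matching the results below
# (on Python 3, use list(range(i)) and range instead).
from guppy import hpy
hp = hpy()
h1 = hp.heap()                           # heap snapshot before the allocation
l = [range(i) for i in xrange(2**10)]
h2 = hp.heap()                           # heap snapshot after the allocation
print h2 - h1                            # objects allocated between the snapshots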
Listing 13: Heapy example
Partition of a set of 294937 objects. Total size = 11538088 bytes.
 Index   Count    %      Size    %  Cumulative    %  Kind (class / dict of class)
     0  293899  100   7053576   61     7053576   61  int
     1    1025    0   4481544   39    11535120  100  list
     2       6    0      1680    0    11536800  100  dict (no owner)
     3       2    0       560    0    11537360  100  dict of guppy.etc.Glue.Owner
     4       1    0       456    0    11537816  100  types.FrameType
     5       2    0       144    0    11537960  100  guppy.etc.Glue.Owner
     6       2    0       128    0    11538088  100  str
Listing 14: Results
Meliae and runsnakerun
from meliae import scanner
scanner.dump_all_objects("representation_meliae.dump")
# In shell: runsnakemem representation_meliae.dump
Listing 15: Meliae example
Figure: Meliae and runsnakerun
malloc() alternatives – libjemalloc and libtcmalloc
Pros:
In some cases, using a different malloc() implementation "may" help return memory from CPython back to the system.
Cons:
But it may equally work against you.
$ LD_PRELOAD="/usr/lib/libjemalloc.so.1" python int_float_alloc.py
$ LD_PRELOAD="/usr/lib/libtcmalloc_minimal.so.4" python int_float_alloc.py
Listing 16: Changing memory allocator
Other useful tools
Build Python in debug mode (./configure --with-pydebug ...).
Maintains a list of all active objects.
Upon exit (or after every statement in interactive mode), prints all existing references.
Tracks total allocation.
valgrind – a programming tool for memory debugging, leak detection, and profiling. Rather low level.
CPython can cooperate with valgrind (for >= py-2.7, py-3.2).
gdb-heap (gdb extension)
Low level, still experimental.
Can be attached to running processes.
May be used with a core file.
Web application memory leaks
dowser – a cherrypy application that displays sparklines of Python object counts.
dozer – a wsgi middleware version of the cherrypy memory leak debugger (works with any wsgi application).
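For the debug-build option mentioned above, one quick check is whether the running interpreter exposes sys.gettotalrefcount(), which is only present in builds configured with --with-pydebug. The two helper functions below are hypothetical names used for illustration.

```python
import sys


def is_debug_build():
    # sys.gettotalrefcount() exists only in debug builds of CPython.
    return hasattr(sys, "gettotalrefcount")


def total_refcount():
    # Total number of references held by the interpreter,
    # or None when not running a debug build.
    if is_debug_build():
        return sys.gettotalrefcount()
    return None


print("debug build:", is_debug_build())
print("total reference count:", total_refcount())
```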
Summary
Summary:
Try to better understand the underlying memory model.
Pay attention to hot spots.
Use profiling tools.
"Seek and destroy" – find the root cause of the memory leak and fix it ;)
Quick and sometimes dirty solutions:
Delegate memory-intensive work to another process.
Restart the process regularly.
Go for low-hanging fruit (e.g. __slots__, different allocators).
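Delegating memory-intensive work to another process can be sketched with the standard subprocess module: whatever the child allocates goes back to the operating system when the child exits, regardless of how the parent's allocator behaves. The helper name and the child code below are illustrative assumptions, not part of the original talk.

```python
import subprocess
import sys

# Hypothetical memory-hungry job: builds a large list, prints only the result.
CHILD_CODE = "data = [i * i for i in range(10**6)]; print(sum(data))"


def run_in_child(code):
    """Run Python code in a separate interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,      # raise CalledProcessError if the child fails
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The large list only ever exists inside the child process.
    print(run_in_child(CHILD_CODE))
```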
References
Wesley J. Chun, Principal CyberWeb Consulting, "Python 103... MMMM: Understanding Python's Memory Model, Mutability, Methods"
David Malcolm, Red Hat, "Dude – Where's My RAM?" A deep dive into how Python uses memory.
Evan Jones, Improving Python’s Memory Allocator
Alexander Slesarev, Memory reclaiming in Python
Source code of Python
Tools documentation