A presentation from the SPb Python Interest Group community meetup.
It covers dictionaries in Python: the dictionary implementation in CPython 2.x and CPython 3.x, and the recent changes in CPython 3.6. Dictionaries in alternative Python implementations such as PyPy, IronPython and Jython are reviewed as well.
3. >>> d = {} # the same as d = dict()
>>> d['a'] = 123
>>> d['b'] = 345
>>> d['c'] = 678
>>> d
{'a': 123, 'c': 678, 'b': 345}
>>> d['b']
345
>>> del d['c']
>>> d
{'a': 123, 'b': 345}
4. Dictionary keys must be hashable
An object is hashable if it has a hash value which never changes during its lifetime
>>> d[list()] = 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
>>> d[set()] = 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'set'
>>> d[dict()] = 3
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'
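Most of Python’s immutable built-in objects are hashable (immutable containers such as tuples and frozensets are hashable only if all of their elements are hashable)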
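The rule above can be checked directly. The following is a minimal sketch; `is_hashable` is a hypothetical helper introduced only for illustration, and the last lines show the usual workaround of converting an unhashable key to an immutable equivalent.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2)))          # tuple of ints
print(is_hashable((1, [2])))        # tuple that contains a list
print(is_hashable(frozenset([1])))  # immutable counterpart of set

# An unhashable key can usually be replaced by an immutable equivalent.
d = {}
d[tuple([1, 2, 3])] = 'list as tuple'
d[frozenset({'x', 'y'})] = 'set as frozenset'
```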
5. import random
class A(object):
    def __init__(self, index):
        self.index = index
    def __eq__(self, other):
        return True
    def __hash__(self):
        return random.randint(0, 3)
    def __repr__(self):
        return 'A%d' % self.index
d = {A(0): 0, A(1): 1, A(2): 2}
print('keys: %s' % d.keys())
print('values: %s' % d.values())
for k in d:
    print('%s = %s' % (k, d.get(k, 'not found')))
Random hash is a bad idea
Run 1
keys: [A1, A2, A0]
values: [1, 2, 0]
A1 = 1
A2 = not found
A0 = 0
Run 2
keys: [A1, A0]
values: [2, 0]
A1 = not found
A0 = not found
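For contrast with the random hash above, here is a minimal sketch of a well-behaved key type: the hash is computed from exactly the fields that take part in equality, so equal objects always get equal hashes. The `Point` class is a hypothetical example, not part of the presentation.

```python
class Point(object):
    """Key type whose hash is derived from the same fields as equality."""
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)
    def __ne__(self, other):  # required explicitly on Python 2
        return not self == other
    def __hash__(self):
        return hash((self.x, self.y))
    def __repr__(self):
        return 'Point(%d, %d)' % (self.x, self.y)

d = {Point(0, 0): 'origin', Point(1, 2): 'p'}
print(d[Point(1, 2)])  # a new, equal object finds the stored entry
```

The fields used by `__hash__` should not be modified while the object is used as a key, otherwise the entry can no longer be found.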
7. Three kinds of slots in the table:
1) Unused
2) Active
3) Dummy
typedef struct {
    Py_ssize_t me_hash;
    PyObject *me_key;
    PyObject *me_value;
} PyDictEntry;
- Hash table
- Open addressing collision resolution strategy
- Initial size = 8
- Load factor = 2/3
- Growth rate = 2 or 4 (depending on the number of cells used)
- “/Include/dictobject.h”, “/Objects/dictobject.c”, “/Objects/dictnotes.txt”
Dictionary in CPython >2.1
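The initial size, load factor and growth rate listed above can be combined into a small simulation. This is a sketch and not CPython code: `table_size_after` is a hypothetical helper, and both the 50000-item threshold between the growth rates 4 and 2 and the rounding up to a power of two are assumptions recalled from the CPython 2.7 sources. Deletions are ignored, so the number of filled slots equals the number of used ones.

```python
MINSIZE = 8  # initial table size

def table_size_after(n_inserts, big_threshold=50000):
    """Table size after inserting n_inserts distinct keys into an empty dict."""
    size, fill = MINSIZE, 0
    for _ in range(n_inserts):
        fill += 1
        if fill * 3 >= size * 2:              # load factor 2/3 reached
            factor = 2 if fill > big_threshold else 4
            minused = factor * fill
            new_size = MINSIZE
            while new_size <= minused:        # next power of two above minused
                new_size <<= 1
            size = new_size
    return size

for n in (5, 6, 21, 22):
    print(n, table_size_after(n))
```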
8. ma_fill – number of non-NULL keys (sum of Active and Dummy slots)
ma_used – number of Active items
ma_mask – table size minus 1 (initially PyDict_MINSIZE - 1)
ma_lookup – lookup function (lookdict_string by default)
#define PyDict_MINSIZE 8
typedef struct _dictobject PyDictObject;
struct _dictobject {
    PyObject_HEAD
    Py_ssize_t ma_fill;
    Py_ssize_t ma_used;
    Py_ssize_t ma_mask;
    PyDictEntry *ma_table;
    PyDictEntry *(*ma_lookup)(PyDictObject *mp, PyObject *key,
                              long hash);
    PyDictEntry ma_smalltable[PyDict_MINSIZE];
};
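The difference between Unused, Active and Dummy slots, and between the fill and used counters, is easier to see in a toy model. The sketch below is not the CPython algorithm: `ToyDict` is a hypothetical fixed-size table that never resizes and uses linear probing for brevity. It only illustrates why a deleted key is replaced by a dummy marker instead of being cleared.

```python
UNUSED = object()  # slot has never held a key
DUMMY = object()   # slot held a key that was deleted

class ToyDict(object):
    def __init__(self, size=8):
        self.mask = size - 1           # size must be a power of two
        self.keys = [UNUSED] * size
        self.values = [None] * size
        self.fill = 0                  # Active + Dummy slots
        self.used = 0                  # Active slots only

    def _find(self, key):
        """Return (index of key or None, first reusable index or None)."""
        i = hash(key) & self.mask
        free = None
        for _ in range(self.mask + 1):
            k = self.keys[i]
            if k is UNUSED:            # end of the probe chain
                return None, (i if free is None else free)
            if k is DUMMY:
                if free is None:
                    free = i           # reusable, but keep probing
            elif k == key:
                return i, free
            i = (i + 1) & self.mask
        return None, free

    def __setitem__(self, key, value):
        found, free = self._find(key)
        if found is not None:
            self.values[found] = value
            return
        if free is None:
            raise MemoryError('toy table is full')
        if self.keys[free] is UNUSED:
            self.fill += 1             # a Dummy slot is already counted
        self.keys[free], self.values[free] = key, value
        self.used += 1

    def __getitem__(self, key):
        found, _ = self._find(key)
        if found is None:
            raise KeyError(key)
        return self.values[found]

    def __delitem__(self, key):
        found, _ = self._find(key)
        if found is None:
            raise KeyError(key)
        self.keys[found] = DUMMY       # keeps later keys in the chain reachable
        self.values[found] = None
        self.used -= 1                 # fill stays the same

t = ToyDict()
t[1] = 'x'
t[9] = 'y'   # 9 & 7 == 1, collides with key 1
del t[1]
print(t[9], t.fill, t.used)
```

If the slot of key 1 were reset to Unused, the lookup of key 9 would stop at that slot and fail.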
9. Good hash functions are needed
>>> map(hash, [0, 1, 2, 3, 4])
[0, 1, 2, 3, 4]
>>> map(hash, ['abca', 'abcb', 'abcc', 'abcd', 'abce'])
[1540938117, 1540938118, 1540938119, 1540938112, 1540938113]
Modified FNV (Fowler–Noll–Vo) hash function for strings
“-R” option – turns on hash randomization, so that the __hash__() values of str,
bytes and datetime objects are “salted” with an unpredictable random value
>>> map(hash, ['abca', 'abcb', 'abcc', 'abcd', 'abce'])
[-218138032, -218138029, -218138030, -218138027, -218138028]
Hash functions
10. Collision resolution
A collision occurs when two distinct pieces of data have the same hash value.
Probing is a scheme for resolving collisions in hash tables: when the slot
computed for a key is already taken, other slots are examined in a fixed
sequence until the key or a free slot is found.
CPython uses pseudo-random probing:
PERTURB_SHIFT = 5
perturb = hash(key)
j = hash(key)  # the first probe is at j % 2**i
while True:
    j = (5 * j) + 1 + perturb
    perturb >>= PERTURB_SHIFT
    index = j % 2**i  # next slot to probe in a table of 2**i slots
See “/Objects/dictobject.c”
CPython before 2.2 used polynomial-based index computation
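The recurrence above can be run as a generator to see which slots are visited. This is a sketch under stated assumptions: `probe_sequence` is a hypothetical helper, the hash is treated as an unsigned 64-bit value, and the order of the update steps follows the pseudocode on the slide rather than any particular CPython release.

```python
from itertools import islice

PERTURB_SHIFT = 5

def probe_sequence(h, i):
    """Yield the slot indexes probed for hash h in a table of 2**i slots."""
    mask = (1 << i) - 1
    perturb = h & 0xFFFFFFFFFFFFFFFF  # assumption: unsigned 64-bit hash
    j = perturb
    yield j & mask                    # first probe
    while True:
        j = (5 * j) + 1 + perturb
        perturb >>= PERTURB_SHIFT
        yield j & mask

probes = list(islice(probe_sequence(123, 3), 12))
print(probes)
```

Once `perturb` has been shifted down to zero, the recurrence becomes `j = 5*j + 1` modulo the table size, which cycles through every slot, so a free slot is always found if one exists.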
11. >>> PyDict_MINSIZE = 8
>>> key = 123
>>> hash(key) % PyDict_MINSIZE
3
Index computing
>>> mask = PyDict_MINSIZE - 1
>>> hash(key) & mask
3
Instead of the modulo operation, a bitwise "AND" with the mask is used
Get the least significant bits of the hash:
2 ** i = PyDict_MINSIZE, hence i = 3, i.e. the three least significant bits are enough
hash(123) = 123 = 0b1111011
mask = PyDict_MINSIZE - 1 = 8 - 1 = 7 = 0b111
index = hash(123) & mask = 0b1111011 & 0b111 = 0b011 = 3
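The equivalence of the mask and the modulo operation holds only because the table size is a power of two. A short check, using Python integers (the sample hash values are arbitrary):

```python
size = 8           # must be a power of two
mask = size - 1

samples = (123, 0, 7, 8, 2**40 + 5, -3)
same = [h & mask == h % size for h in samples]
print(same)

# With a size that is not a power of two, the trick no longer works.
bad_size = 6
print(123 % bad_size, 123 & (bad_size - 1))
```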
26. >>> d = {'a': 1}
>>> for i in d:
...     d['new item'] = 123
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: dictionary changed size during iteration
Adding item during iteration
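A common way to avoid this error is to iterate over a snapshot of the keys, so that the dictionary itself is free to change. A minimal sketch:

```python
d = {'a': 1, 'b': 2}

# list(d) copies the keys first; the dict may then grow inside the loop.
for key in list(d):
    d[key + '_copy'] = d[key]
grown = dict(d)

# Deleting while iterating works the same way.
for key in list(d):
    if key.endswith('_copy'):
        del d[key]

print(sorted(grown), sorted(d))
```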
35. Dictionary in CPython 3.5
- PEP 412 - Key-Sharing Dictionary
- The DictObject can be in one of two forms: combined table or split table
- Initial size = 4 (split table) or 8 (combined table)
- Maximum dictionary load = (2*n+1)/3
- Growth rate = used*2 + capacity/2
- “/Objects/dict-common.h”, “/Include/dictobject.h”, “/Objects/dictobject.c”,
“/Objects/dictnotes.txt”
typedef struct {
    Py_hash_t me_hash;
    PyObject *me_key;
    PyObject *me_value; /* only meaningful for combined tables */
} PyDictKeyEntry;
struct _dictkeysobject {
    Py_ssize_t dk_refcnt;
    Py_ssize_t dk_size;
    dict_lookup_func dk_lookup;
    Py_ssize_t dk_usable;
    PyDictKeyEntry dk_entries[1];
};
typedef struct {
    PyObject_HEAD
    Py_ssize_t ma_used;
    PyDictKeysObject *ma_keys;
    PyObject **ma_values;
} PyDictObject;
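The two formulas for the maximum load and the growth rate can be evaluated for small tables. This is a sketch: `usable_fraction` and `grown_size` are hypothetical helpers, and rounding the result up to the next power of two is an assumption about how the resize picks the final size.

```python
MINSIZE_COMBINED = 8

def usable_fraction(n):
    """Maximum number of entries in a table of n slots: (2*n+1)/3."""
    return (2 * n + 1) // 3

def grown_size(used, capacity):
    """Table size after a resize triggered with `used` items stored."""
    minused = used * 2 + capacity // 2   # growth rate from the slide
    new_size = MINSIZE_COMBINED
    while new_size <= minused:           # assumption: next power of two
        new_size <<= 1
    return new_size

print(usable_fraction(4), usable_fraction(8))
print(grown_size(5, 8), grown_size(11, 16))
```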
36. Combined table vs split table
Combined table
- For explicit dictionaries (dict() and {})
- ma_values = NULL, dk_refcnt = 1
- Never becomes a split-table dictionary
Split table
- For attribute dictionaries (the __dict__ attribute of an object)
- ma_values != NULL, dk_refcnt >= 1
- Only string (unicode) keys are allowed
- Values are stored in the ma_values array
- When a split dictionary is resized, it is converted to a combined table (but if
the resize is the result of storing an instance attribute, and there is only
one instance of the class, the dictionary is re-split immediately)
- Lookup function = lookdict_split
37. Dictionary in CPython 3.5
A new kind of slot:
1) Unused
2) Active
3) Dummy
4) Pending (me_key != NULL, me_key != dummy and me_value == NULL)
typedef struct {
    Py_hash_t me_hash;
    PyObject *me_key;
    PyObject *me_value; /* only meaningful for combined tables */
} PyDictKeyEntry;
40. class A():
    def __init__(self):
        self.a = 1
        self.b = 2
        self.c = 3
a = A()
print(a.__dict__.__sizeof__())  # 72 (split table)
b = A()
setattr(a, 'd', 4)  # no re-split because of b
print(a.__dict__.__sizeof__())  # 456 (split table is converted to a combined table)
41. Key differences between this implementation and CPython 2.x:
- The table can be split into two parts – the keys and the values
- A new kind of slot
- No more ma_smalltable embedded in the dict
- General dictionaries are slightly larger
- All object dictionaries of a single class can share a single key-table, saving
about 60% memory for such cases (according to
https://github.com/python/cpython/blob/3.5/Objects/dictnotes.txt)
Bugs still happen: Unbounded memory growth resizing split-table dicts
(https://bugs.python.org/issue28147)
Summary
42. Hash functions in CPython 3.5
SipHash for strings and bytes (>= CPython 3.4)
- PEP 456 – Secure and interchangeable hash algorithm
- Resistant against hash flooding DoS attacks
- Successfully used in many other languages
Slightly modified hash function for float
- hash(float("+inf")) == 314159
- hash(float("-inf")) == -314159 (was -271828)
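The hash algorithm in use and the special float values can be inspected at runtime through `sys.hash_info` (Python 3.4 or newer). The algorithm name that is printed depends on the interpreter version and build.

```python
import sys

print(sys.hash_info.algorithm)  # name of the str/bytes hash algorithm
print(sys.hash_info.inf)        # hash value used for positive infinity
print(hash(float('inf')), hash(float('-inf')))
```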
43. OrderedDict in CPython 3.5
- Doubly-linked-list
- od_fast_nodes hash table that mirrors the od_dict table
- “/Include/odictobject.h”, “/Objects/odictobject.c”
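The combination of a doubly-linked list with a mirroring hash table can be sketched in pure Python. `MiniOrderedDict` below is a hypothetical illustration of the idea, not the C implementation: the linked list keeps the order, and a second dict maps each key to its list node so that deletion does not need to scan the list.

```python
class _Node(object):
    __slots__ = ('key', 'prev', 'next')

class MiniOrderedDict(object):
    def __init__(self):
        self._data = {}               # key -> value
        self._nodes = {}              # key -> list node, mirrors _data
        self._root = root = _Node()   # sentinel of a circular list
        root.key = None
        root.prev = root.next = root

    def __setitem__(self, key, value):
        if key not in self._data:     # new keys are appended at the end
            node = _Node()
            node.key = key
            last = self._root.prev
            node.prev, node.next = last, self._root
            last.next = node
            self._root.prev = node
            self._nodes[key] = node
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def __delitem__(self, key):
        del self._data[key]
        node = self._nodes.pop(key)   # O(1) access to the list node
        node.prev.next = node.next
        node.next.prev = node.prev

    def __iter__(self):
        node = self._root.next
        while node is not self._root:
            yield node.key
            node = node.next

od = MiniOrderedDict()
od['b'] = 1
od['a'] = 2
od['c'] = 3
del od['a']
od['a'] = 4
print(list(od))
```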
45. Dictionary in PyPy
- Starting from PyPy 2.5.0 – ordereddict is used by default
- Initial size = 16
- Load factor up to 2/3
- Growth rate = 4 (up to 30000 items) or 2
- If a lot of items are deleted, compaction is performed
- “/rpython/rtyper/lltypesystem/rordereddict.py”
struct dicttable {
    int num_live_items;
    int num_ever_used_items;
    int resize_counter;
    variable_int *indexes; // byte, short, int, long
    dictentry *entries;
    ...
}
struct dictentry {
    PyObject *key;
    PyObject *value;
    long hash;
    bool valid;
}
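The layout above, a sparse array of small integers that point into a dense array of entries, can be sketched as follows. `CompactDict` is a hypothetical toy: it does not resize, does not support deletion, and uses linear probing for brevity. Keys come out in insertion order because the entries list is only ever appended to.

```python
FREE = -1

class CompactDict(object):
    def __init__(self, size=16):
        self.indexes = [FREE] * size  # sparse: positions in self.entries
        self.entries = []             # dense: (hash, key, value) tuples

    def _slot(self, key, h):
        """Return (index slot, entry position or None if key is absent)."""
        mask = len(self.indexes) - 1
        i = h & mask
        while True:
            pos = self.indexes[i]
            if pos == FREE:
                return i, None
            entry_hash, entry_key, _ = self.entries[pos]
            if entry_hash == h and entry_key == key:
                return i, pos
            i = (i + 1) & mask        # linear probing, for brevity only

    def __setitem__(self, key, value):
        h = hash(key)
        i, pos = self._slot(key, h)
        if pos is None:
            # Keep the load at 2/3 or below so a FREE slot always exists.
            if len(self.entries) * 3 >= len(self.indexes) * 2:
                raise MemoryError('toy table is full')
            self.indexes[i] = len(self.entries)
            self.entries.append((h, key, value))
        else:
            self.entries[pos] = (h, key, value)

    def __getitem__(self, key):
        _, pos = self._slot(key, hash(key))
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][2]

    def keys(self):
        return [key for _, key, _ in self.entries]

cd = CompactDict()
for k in ('z', 'a', 'm'):
    cd[k] = k.upper()
cd['a'] = 'AA'
print(cd.keys(), cd['a'])
```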
47. PyDictionary in Jython
- Based on ConcurrentHashMap
- Separate chaining collision resolution
- Initial size = 16, load factor = 0.75, growth rate = 2
- Segments and thread safety
48. PythonDictionary in IronPython
- Based on Dictionary (.NET)
- Separate chaining collision resolution
- Initial size = 0, load factor = 1.0
- Rehashing if the number of collisions >= 100
- Growth rate = 2 (the size is doubled and then rounded up to the next prime number) from a set of
primes = {3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107,… , 4999559, 5999471, 7199369}
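Growth by primes can be sketched with a table lookup. The helper `expand` is hypothetical, and the sketch assumes that the new size is the smallest listed prime that is not below twice the old size. Only the primes visible on the slide are used; the elided part of the table is deliberately not filled in.

```python
from bisect import bisect_left

# Only the primes shown on the slide; the real table is longer.
PRIMES = [3, 7, 11, 17, 23, 29, 37, 47, 59, 71, 89, 107]

def expand(old_size):
    """Return the next table size for a table of old_size buckets."""
    target = 2 * old_size
    pos = bisect_left(PRIMES, target)   # first prime >= target
    if pos == len(PRIMES):
        raise ValueError('beyond the truncated prime table of this sketch')
    return PRIMES[pos]

size = 0
sizes = []
for _ in range(5):
    size = expand(size)
    sizes.append(size)
print(sizes)
```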
51. Dictionary in CPython 3.6
typedef struct {
    Py_hash_t me_hash;
    PyObject *me_key;
    PyObject *me_value; /* only meaningful for combined tables */
} PyDictKeyEntry;
typedef struct {
    PyObject_HEAD
    Py_ssize_t ma_used; /* number of items in the dictionary */
    uint64_t ma_version_tag; /* unique, changes when dict modified */
    PyDictKeysObject *ma_keys;
    PyObject **ma_values;
} PyDictObject;
- ma_version_tag is added (PEP 509 – Add a private version to dict)
- Initial size = 8 (for split table too)
- Maximum dictionary load = (2*n)/3
- Contributed by INADA Naoki in https://bugs.python.org/issue27350
Four kinds of slots in the table:
1) Unused (index == DKIX_EMPTY == -1)
2) Active (index >= 0 , me_key != NULL and me_value != NULL)
3) Dummy (index == DKIX_DUMMY == -2, only for combined table)
4) Pending (index >= 0 , me_key != NULL and me_value == NULL, only for split table)
52. Dictionary in CPython 3.6
- Added dk_nentries and dk_indices
struct _dictkeysobject {
    Py_ssize_t dk_refcnt;
    Py_ssize_t dk_size; /* Size of the hash table (dk_indices) */
    dict_lookup_func dk_lookup; /* Function to lookup in dk_indices */
    Py_ssize_t dk_usable; /* Number of usable entries in dk_entries */
    Py_ssize_t dk_nentries; /* Number of used entries in dk_entries */
    union {
        int8_t as_1[8];
        int16_t as_2[4];
        int32_t as_4[2];
#if SIZEOF_VOID_P > 4
        int64_t as_8[1];
#endif
    } dk_indices;
    PyDictKeyEntry dk_entries[dk_usable]; /* using DK_ENTRIES macro */
};
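The union in dk_indices means that the width of each index depends on the table size. A sketch of that choice follows; `index_width` is a hypothetical helper and the exact thresholds are an assumption. The reasoning behind them is that an index must be able to hold -1, -2 and any position in dk_entries, and a signed 8-bit value only reaches 127.

```python
def index_width(table_size):
    """Bytes per dk_indices element for a hash table of table_size slots."""
    if table_size <= 0xff:
        return 1        # int8_t
    if table_size <= 0xffff:
        return 2        # int16_t
    if table_size <= 0xffffffff:
        return 4        # int32_t
    return 8            # int64_t

for size in (8, 128, 256, 65536):
    print(size, index_width(size), 'bytes of index table:', size * index_width(size))
```

For the initial table of 8 slots the whole index table therefore takes 8 bytes, which is where most of the memory saving for small dictionaries comes from.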
54. Key differences between this implementation and CPython 3.5:
- Compact and ordered
- Added dk_indices with type, depending on the size of dictionary
- Added ma_version_tag (PEP 509)
- Initial size for split table is changed to 8
- Maximum dictionary load changed to (2*n)/3
- Deleting an item converts the dict to a combined table
- Preserving the order of **kwargs in a function (PEP 468) is implemented
- Preserving Class Attribute Definition Order (PEP 520) is implemented
- The memory usage of the new dict() is between 20% and 25% smaller compared
to Python 3.5 (https://docs.python.org/3.6/whatsnew/3.6.html#other-language-changes)
Summary
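The ordering properties from this summary are visible from Python code. In CPython 3.6 the ordering of a plain dict is an implementation detail; it became a language guarantee in Python 3.7. The function `collect` is a hypothetical helper.

```python
def collect(**kwargs):
    """Return keyword argument names in the order they were passed (PEP 468)."""
    return list(kwargs)

d = {}
for key in ('banana', 'apple', 'cherry'):
    d[key] = len(key)
del d['apple']
d['apple'] = 5          # a re-inserted key goes to the end

print(list(d))
print(collect(z=1, a=2, m=3))
```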
55. References
1. The implementation of a dictionary in Python 2.7 https://habrahabr.ru/post/247843/
2. Python hash calculation algorithms http://delimitry.blogspot.com/2014/07/python-hash-calculation-algorithms.html
3. PEP 412 - Key-Sharing Dictionary https://www.python.org/dev/peps/pep-0412/
4. PEP 456 - Secure and interchangeable hash algorithm https://www.python.org/dev/peps/pep-0456/
5. Mirror of the CPython repository https://github.com/python/cpython/
6. Faster, more memory efficient and more ordered dictionaries on PyPy https://morepypy.blogspot.com/2015/01/faster-more-memory-efficient-and-more.html
7. PyDictionary (Jython API documentation) http://www.jython.org/javadoc/org/python/core/PyDictionary.html
8. Jython repository https://bitbucket.org/jython/jython
9. Java theory and practice: Building a better HashMap http://www.ibm.com/developerworks/library/j-jtp08223/
10. Back to basics: Dictionary part 2, .NET implementation https://blog.markvincze.com/back-to-basics-dictionary-part-2-net-implementation/
11. http://referencesource.microsoft.com/mscorlib/system/collections/generic/dictionary.cs.html
12. https://github.com/IronLanguages/main/blob/ipy-2.7-maint/Languages/IronPython/IronPython/
13. https://bitbucket.org/pypy/pypy/
14. https://twitter.com/raymondh
15. PEP 509 - Add a private version to dict https://www.python.org/dev/peps/pep-0509/
16. Compact and ordered dict http://bugs.python.org/issue27350
17. What’s New In Python 3.6 https://docs.python.org/3.6/whatsnew/3.6.html
18. PEP 468 - Preserving the order of **kwargs in a function https://www.python.org/dev/peps/pep-0468/
19. PEP 520 - Preserving Class Attribute Definition Order https://www.python.org/dev/peps/pep-0520/
20. https://en.wikipedia.org/
Images from:
http://www.rcreptiles.com/blog/index.php/2008/06/28/read_the_operating_manual_first
http://kiwigamer450.deviantart.com/art/Back-to-The-Past-Logo-567858767
http://beyondplm.com/wp-content/uploads/2014/04/time-paradox-past-future-present.jpg
http://itband.ru/wp-content/uploads/2014/10/Future.jpg
https://en.wikipedia.org/wiki/Hash_table