Talk given at PyCon IL 2019.
Managing Python environment dependencies across an organization and between environments is always a pain. Pipenv makes it a little less painful by managing both the environment itself and its configuration.
'ImportError: No module named X'. Sound familiar? Probably all too familiar for most of you. Nearly every time you share pieces of code or send programs for remote execution, you risk running into this message. After all, we're human, and we don't always remember to update the requirements.txt file when installing a new package. So code may work perfectly on one machine but fail on another.
Pipenv to the rescue! With Pipenv, you can manage both your environment and its configuration with the same command. Each time you install a package using Pipenv, a configuration file is updated automatically, and it can then be used to set up the same environment on a different machine in a reliable manner.
In this talk we will introduce you to Pipenv, the problems it can help you solve, how to use it properly, and caveats from our personal experience.
2. twiggle.com
whoami
● Data Science TL @ Twiggle
● Past
○ Data Scientist / Security Researcher @ Akamai
○ Graduate student
○ Independent ventures
● Using Python since 2013
○ Academic and Industry
● Father of two
Why are we here?
● Environment reproducibility
○ Why it is important
○ Tools to achieve it
Package Installation Evolution
● The beginning
○ Find a package
○ Download the package tarball
○ Run setup.py to install in site-packages
● easy_install
○ Find, download and install in one go.
○ No “easy_uninstall” :(
pip
● Started in 2008; version 1.0 released in 2011.
● The de-facto standard today.
● Problem!
○ All programs share the same global installation
Conda
● Highly popular in the scientific community.
● A combination of
○ a package manager (like pip)
○ an environment manager (like virtualenv + virtualenvwrapper)
● Considered easier to use.
● Doesn’t support all the packages that pip does.
○ Try conda install beautifulsoup
Specifying requirements
● Common practice: pip install -r requirements.txt (circa 2011)
● Version specification examples
○ dask==1.2.1 (fixed)
○ dask>=1.2.1 (minimum)
○ dask==1.2.* (fixed minor, don’t care micro; a plain dask==1.2 matches only 1.2 itself)
○ dask~=1.2.1 (fixed minor, with at least this micro)
○ dask~=1.2 (fixed major, with at least this minor)
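The compatible-release operator (~=) is the least obvious of the specifiers above. Below is a minimal sketch of its rule for plain numeric versions only. The helper names parse and is_compatible are made up for illustration; real tools also handle pre-releases, epochs and local versions, which this sketch ignores.

```python
def parse(version):
    """Split a plain dotted version such as '1.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate, base):
    """Return True if `candidate` satisfies `~=base` (numeric versions only).

    `~=1.2.1` means: at least 1.2.1, and the same 1.2 prefix.
    `~=1.2`   means: at least 1.2,   and the same 1   prefix.
    """
    cand, spec = parse(candidate), parse(base)
    if len(spec) < 2:
        raise ValueError("~= needs at least two version components")
    prefix = spec[:-1]  # everything except the last component is fixed
    # Pad with zeros so that 1.2 and 1.2.0 compare as equal.
    width = max(len(cand), len(spec))
    cand_padded = cand + (0,) * (width - len(cand))
    spec_padded = spec + (0,) * (width - len(spec))
    return cand_padded >= spec_padded and cand_padded[: len(prefix)] == prefix


if __name__ == "__main__":
    for candidate, base in [("1.2.5", "1.2.1"), ("1.3.0", "1.2.1"), ("1.3.0", "1.2")]:
        print(f"{candidate} satisfies ~={base}: {is_compatible(candidate, base)}")
```

The design point is that the last component in the specifier is a floor, while everything before it is frozen.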
Q1: How to specify requirements.txt?
What your actual dependency is:
$ cat requirements.txt
pandas>=0.23
What the environment contains (pip freeze):
$ cat requirements.txt
numpy==1.16.4
pandas==0.24.2
python-dateutil==2.8.0
pytz==2019.1
six==1.12.0
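To see where the second, longer listing comes from, here is a rough sketch of a pip-freeze-like report built with the standard library. It assumes Python 3.8 or newer (for importlib.metadata). The function name freeze is illustrative, and the output format only approximates what pip freeze prints.

```python
from importlib.metadata import distributions


def freeze():
    """Return 'name==version' pins for every distribution in this environment."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    # Prints direct dependencies and everything they pulled in,
    # with no indication of which is which.
    print("\n".join(freeze()))
```

Note that the listing cannot tell a package you asked for apart from one that came along as a dependency, which is exactly the ambiguity this slide points at.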
Q2: Dependency Synchronization
● How do we keep requirements up to date?
● Possible problematic scenarios
○ Install a new package without updating requirements.txt
■ won’t work on other machines (or different venvs).
○ Install a package accidentally, then use pip freeze
■ now you’re stuck with it FOREVER.
○ Don’t update requirements for a specific package for a long time
■ might be old and contain vulnerabilities
○ Don’t specify exact version
■ Packages may introduce breaking changes
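The first scenario (a package is imported but never added to requirements.txt) can be caught with a small check. The sketch below is illustrative only: the function names are made up, and it assumes the import name equals the package name, which is not always true (for example, bs4 is installed as beautifulsoup4).

```python
import ast
import re
import sys


def imported_top_level(source):
    """Collect the top-level module names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module.split(".")[0])
    return names


def declared_names(requirements_text):
    """Extract package names from requirements.txt-style text."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def undeclared(source, requirements_text, ignore=None):
    """Return imported names that requirements.txt does not mention."""
    if ignore is None:
        # sys.stdlib_module_names exists on Python 3.10+; older versions ignore nothing.
        ignore = getattr(sys, "stdlib_module_names", frozenset())
    imported = {name.lower() for name in imported_top_level(source)}
    return imported - declared_names(requirements_text) - set(ignore)


if __name__ == "__main__":
    code = "import pandas as pd\nfrom dask import delayed\n"
    print(undeclared(code, "pandas>=0.23\n"))
```

A check like this only flags the problem after the fact; it does not keep the file in sync for you.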
Pipfile / Pipfile.lock
● Separate dependencies from environment specification
● Pipfile
○ Contains package dependencies
○ Human readable and editable (TOML)
○ Separate packages and dev-packages
● Pipfile.lock
○ Adopted from other languages
○ Contains complete env specs
○ Machine readable (JSON)
● Planned to be supported by pip
○ Not sure when...
$ cat Pipfile
[[source]]
name = "pypi"
url = "https://pypi.org/simple"
verify_ssl = true
[dev-packages]
[packages]
pandas = "*"
[requires]
python_version = "3.7"
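Because Pipfile.lock is plain JSON, the pinned environment can be inspected with a few lines of standard-library code. The sketch below uses a trimmed, hand-written sample lock (hashes and most metadata omitted for brevity, versions taken from the pip freeze listing earlier). The function name pinned is illustrative.

```python
import json

# Trimmed sample: a real Pipfile.lock also carries hashes and more metadata.
SAMPLE_LOCK = """
{
  "_meta": {"pipfile-spec": 6, "requires": {"python_version": "3.7"}},
  "default": {
    "numpy": {"version": "==1.16.4"},
    "pandas": {"version": "==0.24.2"}
  },
  "develop": {}
}
"""


def pinned(lock_text, section="default"):
    """Map package name to its pinned version for one section of a lock file.

    'default' holds the [packages] pins, 'develop' holds the [dev-packages] pins.
    """
    lock = json.loads(lock_text)
    return {
        name: entry["version"]
        for name, entry in lock.get(section, {}).items()
        if "version" in entry  # VCS or path entries may have no version
    }


if __name__ == "__main__":
    for name, version in sorted(pinned(SAMPLE_LOCK).items()):
        print(f"{name}{version}")
```

The Pipfile states what you want (pandas = "*"), while the lock records what was actually resolved, including transitive dependencies such as numpy.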
pipenv
● Also a combination of virtualenv+pip.
● Recommended by PyPA (Python Packaging Authority).
● Implements Pipfile + Pipfile.lock before their support by pip.
● The same command (pipenv install) updates both the environment and the Pipfile / Pipfile.lock pair.
● The environment is associated with your current directory.
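A typical sequence of Pipenv commands can be sketched as a small driver script. This is only an illustration: the package names (pandas, pytest) are example choices, run_workflow is a made-up helper, and by default it only reports the commands instead of running them.

```python
import shutil
import subprocess

# Example workflow; the package names are illustrative.
WORKFLOW = [
    ["pipenv", "install", "pandas"],           # add a runtime dependency
    ["pipenv", "install", "--dev", "pytest"],  # add a development-only dependency
    ["pipenv", "lock"],                        # regenerate Pipfile.lock
    ["pipenv", "sync"],                        # install exactly what the lock file pins
]


def run_workflow(commands=WORKFLOW, dry_run=True):
    """Return the command lines; execute them only when dry_run is False."""
    rendered = [" ".join(cmd) for cmd in commands]
    if dry_run or shutil.which("pipenv") is None:
        return rendered  # nothing is executed
    for cmd in commands:
        subprocess.run(cmd, check=True)  # stop at the first failing command
    return rendered


if __name__ == "__main__":
    for line in run_workflow():
        print("$", line)
```

On a second machine, only the last step is needed: with the Pipfile and Pipfile.lock checked in, pipenv sync recreates the locked environment.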
Downsides
● Can only be used for applications (don’t use for packages)
● Locking can be slow (keeps improving)
● By default, automatically updates ALL the dependencies every time a new package is installed
○ Solvable with --selective-upgrade
Pipenv Strong Points
● Installs packages and updates lock file in one command.
● Installs packages concurrently.
● Prevents dependency conflicts.
● Checks for security issues (pipenv check).
● Supports reading from requirements.txt.