Development environments are a necessary part of every developer's workflow. They can also be a great source of friction. What may begin as simply running python my_app.py eventually bloats as you add more apps, more databases, more testing frameworks, and more developers. We'll talk about the evolution of a typical development environment, how it lets us down, and how we try to make it better. We'll end with an introduction to Dusty, a new tool which uses Docker containers to take our development environments to the next level.
Originally presented at PyGotham 2015.
Dev Environments: The Next Generation
1. thieman
Travis Thieman
THE NEXT GENERATION
DEV
ENVS
Talk through the outline on this slide.
What is a dev environment?
Current state of the world (running on local and VMs)
Ways we can make this better (containers)
Whether we can realize this benefit with the tools available (Dusty)
2. What are development environments? It’s all of this crap.
From the code you’re running, to the filesystem, to the OS, to the hardware,
to the bits of electricity pinging around inside the machine, all of this affects
what you’re actually trying to do, which is write and run software.
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
Here’s a different model for thinking about all the crap inside our machine
that affects how we run and write software. There’s a *lot* going on.
Here’s yet another way of looking at a typical dev env that encompasses an
entire machine. Dev envs are inherently complex, and our tiny human brains
have no chance of being able to comprehend them. We need to create
tooling to help us do this.
3. ✗ Get our code to work at all
✗ Reproduce our results
✗ Share our results
✗ Test our code properly
✗ Change our dev environments
Getting development environments right is hugely important, and the
consequences of screwing it up affect almost everything we do.
“[T]he alternative [to testing with SQLite] is not
testing at all because there is limited time for
testing and setting up a proper database for this is
so much more trouble.
The choices are not good test vs bad test.
They are test-with-issues vs. no test.”
- jbb555
Hacker News, August 4th, 2015
We are failing at managing the complexity of our dev environments.
This is from a Hacker News thread on testing code against SQLite instead of
the database you actually run in production.
Clearly we have a lot of work to do here.
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
Let’s start thinking through the problem by imagining a simple Python app
and getting it in the right environment to run correctly. If we want to share
that code or reproduce our results on another machine, we need to make
sure that our circles of Hell are compatible all the way down.
Most of the time, in Python-land, we are dealing with the three layers at the
top. In this example…
4. Your Code
requests 2.6.0
Python 2.7
…
…
…
Let’s say our Python code needs a specific version of requests and Python to
run. If we have these set up, our code will probably run. So we give our
awesome script to our friend and tell her to install requests 2.6.0.
pip install
requests==2.6.0
This goes fine, but it turns out she already had requests 2.5.0 installed to run
some other important app on her machine. When we install 2.6.0…
pip install
requests==2.6.0
requests 2.5.0
…her old version gets unceremoniously uninstalled by pip.
5. Your Code + Friend’s Code
requests 2.5.0
Python 2.7
…
…
…
requests 2.6.0
So we have a conflict here. Because of how Python works, these two
versions cannot live harmoniously within the same set of Python libraries. So,
if they refuse to get along, we’ll have to separate them.
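To make the conflict concrete, here is a toy sketch in Python. It models one shared set of libraries as a plain dict with a single slot per package name. The `install` helper and the dict are illustrative assumptions only, not how pip is actually implemented.

```python
# Toy model only: a dict standing in for one shared set of installed libraries.
# This is NOT pip's implementation; it just mirrors the behavior in the story.

def install(site_packages, name, version):
    """Install name==version and return the version it replaced (or None)."""
    replaced = site_packages.get(name)
    # One slot per package name, so the previous version is simply overwritten.
    site_packages[name] = version
    return replaced


shared = {"requests": "2.5.0"}  # what our friend already had installed
replaced = install(shared, "requests", "2.6.0")

print("replaced:", replaced)
print("now installed:", shared)
```

With only one slot per name, there is no way for both apps to get the version they need from the same set of libraries.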
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
This is where we’re going to see the concept of isolation come into play.
What if we cut our circles of Hell above the system library level? We’ll keep
*one* total copy of everything below the cut, but we’ll let ourselves create
multiple, isolated versions of everything above the cut.
Userland
OS / Kernel
Hardware
Your Code
requests 2.6.0
Runtime
Friend’s Code
requests 2.5.0
Runtime
This means we can have a structure that looks like this. Two separate,
isolated stacks of Python components. In one, our awesome Riker code and
requests 2.6.0. In the other, our friend’s app.
6. Hey, guess what, this is actually a thing! Venvs give us scalable isolation over
our Python programs. This lets us resolve simple version conflicts, but
scalable isolation has a bigger, more general impact on our development
process as well.
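As a rough sketch of what this looks like in practice, the snippet below creates two isolated environments with the standard library's `venv` module. That module is the Python 3 counterpart of the virtualenv tool, so it is an assumption relative to the Python 2.7 example in this talk. The `make_isolated_envs` helper and the environment names are hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def make_isolated_envs(base_dir, names):
    """Create one virtual environment per name under base_dir."""
    envs = {}
    for name in names:
        env_dir = Path(base_dir) / name
        # with_pip=False keeps creation fast and offline; pass True to get pip inside.
        venv.create(env_dir, with_pip=False)
        envs[name] = env_dir
    return envs


base = tempfile.mkdtemp()
envs = make_isolated_envs(base, ["your-code", "friends-code"])
for name, path in envs.items():
    print(name, "->", path)
```

Each directory gets its own set of installed libraries, so requests 2.6.0 can live in one and requests 2.5.0 in the other.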
Isolation is also really useful for keeping the number of things we need to
think about smaller, which helps us approach and reason about complicated
tasks. Without isolation, we have to consider all parts of the system all at
once, all of the time.
If we have a problem in the middle of this mess, we’re going to have to wade
through the whole thing to fix it. This is really hard.
7. With scalable isolation (like venvs!) we can break the problem down into
smaller, isolated components.
Now, if we have a problem in a specific area…
…we can focus our cognitive effort entirely on that unit of isolation, which is
much smaller and easier to reason about. Scalable isolation helps us identify
problems and takes away a lot of the cognitive burden we otherwise face
when trying to solve them.
8. Venvs are not enough
• Need C extensions?
• Conflicting system libraries?
• A language other than Python?
• Run a database? Multiple databases?
However, virtualenvs don’t go far enough. We need a more general solution.
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
So we need our unit of isolation to penetrate deeper into the circles of Hell.
Let’s recall what this entire thing is equivalent to…
…a machine, right? And one machine is one big stacked Hell metaphor. Let’s
restrict ourselves to running on one physical machine.
9. Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
Beyond that restriction, let’s go wild. What if we use a unit of isolation that
encompasses everything BUT the hardware?
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
It’d look like this, right? We’re definitely going to get some great isolation out
of this.
Hardware
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
10. This is called a virtual machine. It is, also, totally a thing. There’s a big
ecosystem around these, because a lot of people use them for their dev
environments. Up until recently, they were the latest greatest thing in dev
environments.
VMs have one major flaw that leads to a whole host of problems: they’re big
and slow. Turns out, running a whole ‘nother kernel takes a lot of clever
engineering and the end result doesn’t scale well. As a result, we *can’t* run
a bunch of VMs. Maybe two, maybe three, probably not five, definitely not
100.
Hardware
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Host Machine Guest VM
We end up basically where we started. All of our stuff
lands in the same unit of isolation. Whether it’s on one Hell stack or in one
VM running in parallel with our host machine, it’s effectively the same. VMs
give us a way to jumpstart the process, but not much more. You end up with
most of the same problems you had running locally. We need something
better.
11. Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
Let’s review. Venvs are lightweight, pretty simple, but they don’t offer enough
isolation.
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
Virtual machines offer great isolation, but they’re so heavyweight that we
can’t run many of them. And, really, do we need a separate OS and kernel to
run a Python app, or even a database? Generally, no.
Your Code
Language Libraries
Language Runtime
Userland
OS / Kernel
Hardware
What if we learned something from Goldilocks and created a unit of isolation
above the OS?
12. OS / Kernel
Hardware
Your Code
Language Libraries
Language Runtime
Userland
Your Code
Language Libraries
Language Runtime
Userland
We know from VMs that running multiple kernels is really hard and expensive,
so what if we split the circles of Hell right above that and created a new unit
of isolation? This could let us run our two apps with different library
dependencies pretty easily.
OS / Kernel
Hardware
This isolation is incredibly powerful. Now we *only* need the dependencies in
that unit of isolation that let us run our code. So we can make them pretty
small and MUCH easier to reason about!
OS / Kernel
Hardware
Taking this approach to its natural conclusion, we can create a bunch of
these small units of isolation for everything we need to run as part of our
development environments.
There are numerous benefits here. Smaller, easier to reason about. Actually
scalable, unlike VMs.
13. =
This unit of isolation already exists, and it’s called a container!
If you’ve heard about the Docker project, that’s the leader in implementing
containers right now. Docker provides a core engine for managing
containers, and has done a lot of work to build an ecosystem around
all of this.
So we have this new shiny scalable unit of isolation called a container. What
can we do with it?
14. Get our code to work at all
Reproduce our results
Share our results
Test our code properly
Change our dev environments
?
?
?
?
?
To review, let’s go back to our scary checklist from earlier. This is what we
*want* our dev environments to help us do. Let’s see how containers can
achieve this.
First, we need to get our code to work at all. Because our containers only
need to have the minimum amount of stuff to run our app, this is usually
pretty simple! The isolation keeps the amount of stuff we need to think about
pretty low. Let’s just throw in exactly what we need.
OS / Kernel
Hardware
When we throw this on top of the non-isolated parts of our stack, we end up
with a running container! It’s small, isolated, and runs our code just fine.
15. Get our code to work at all
Reproduce our results
Share our results
Test our code properly
Change our dev environments
?
?
?
?
One of the biggest value adds of the Docker ecosystem is the ability to easily
reproduce and share containers. You can either…
My Awesome Python Container
PyGotham 2015
…take a snapshot of the container once you’ve got it the way you like. Quick
and dirty.
16. FROM debian:jessie
RUN apt-get update && apt-get install -y python python-pip
ADD . /my-python-app
RUN pip install -r /my-python-app/reqs.txt
CMD python /my-python-app/awesome-app.py
You can also formalize the instructions on how to build the container, which
makes it actually reproducible.
Get our code to work at all
Reproduce our results
Share our results
Test our code properly
Change our dev environments
?
?
OS / Kernel
Hardware
Containers enable an awesome testing flow. Let’s say we’re running our
Python app against a Postgres database. We take advantage of the scalable
isolation of containers to create a totally separate copy of the app container
running against a fresh Postgres container. We can even use a separate
container with our test requirements already installed. Afterwards we can
throw those temporary containers away.
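One way to picture this flow is as a short, ordered list of docker CLI commands. The sketch below only builds that list and does not execute anything. The `plan_test_run` helper, the container and network names, the placeholder password, and the `DATABASE_HOST` variable are all assumptions for illustration, and `docker network` is a subcommand of the current docker CLI rather than something shown in the talk.

```python
def plan_test_run(app_image, test_command, run_id, db_image="postgres"):
    """Return the docker CLI invocations for one throwaway test run, in order."""
    network = "test-net-" + run_id
    db_name = "test-db-" + run_id
    return [
        # Private network so this run cannot see containers from other runs.
        ["docker", "network", "create", network],
        # Fresh database container; the password is a placeholder value.
        ["docker", "run", "-d", "--name", db_name, "--network", network,
         "-e", "POSTGRES_PASSWORD=test", db_image],
        # Separate copy of the app container that runs the tests, removed on exit.
        # DATABASE_HOST is a hypothetical variable the app would have to read.
        ["docker", "run", "--rm", "--network", network,
         "-e", "DATABASE_HOST=" + db_name, app_image] + list(test_command),
        # Throw the temporary pieces away.
        ["docker", "rm", "-f", db_name],
        ["docker", "network", "rm", network],
    ]


commands = plan_test_run("my-python-app", ["python", "-m", "pytest"], "abc")
for command in commands:
    print(" ".join(command))
```

To actually run the plan, each list could be passed to `subprocess.run` on a machine with Docker installed.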
17. Get our code to work at all
Reproduce our results
Share our results
Test our code properly
Change our dev environments?
OS / Kernel
Hardware
Scalable isolation is what makes it easy for us to change our development
environments. It doesn’t matter how many containers we add, this Python
container will still work just fine, because it’s not really affected by external
forces. We could visit an apocalypse on all the other containers individually,
but our Python container won’t be affected by anything that happens inside
of them.
Get our code to work at all
Reproduce our results
Share our results
Test our code properly
Change our dev environments
18. Great, so everything is sunshine and rainbows, right? Talk over, goodbye.
NOW WE
GOT
PROBLEMS
Not quite. Using containers for everything is going to introduce some
problems we didn’t have before.
OS / Kernel
Hardware
Isolation makes normal everyday stuff like navigating through a space in a
shell, copying files around, or accessing processes through ports more
difficult. Maybe you’re starting out in this space over here, but your Python
app is over here in a separate unit of isolation, and your database is separate
from both of those.
19. OS / Kernel
Hardware
Wrangling your vast menagerie of containers can also get overwhelming,
even with relatively small stacks. When you add in test containers with short
lifecycles, the problem gets worse. The solution is still *simple*, but it’s also
*complicated* (as opposed to complex). Some tooling and automation could
go a long way here.
Another problem: how do you share your crazy setup with all these
containers floating around? How do you share how to run tests on them?
How do you share common tasks like a database migration that needs to be
aware that it’s running in a containerized environment?
Here’s another big issue…
20. …if you run Linux, you’re in luck. You can use containers out of the box. Did I
mention this whole concept is more accurately called “Linux containers”?
If you use something else like Mac or Windows, you’re going to need to run a
Linux VM in order to utilize containers. This adds another level of isolation/
abstraction that we constantly need to navigate. This adds a lot of cognitive
overhead unless we can automate it away via tooling.
Docker Machine Docker Compose
The Docker ecosystem provides us tools to solve some of these problems.
Machine gives us a way to get up and running on non-Linux systems.
Compose lets us wrangle sets of containers into a working stack in a
reproducible and shareable way. But these are ultimately fairly low-level tools,
and they can’t meet the specialized and complex needs of running a
development environment specifically.
21. • Usability
• Sharing code
• Restarting when
code changes
• Sharing tests
• Navigating through
containerland
• Mix and Match
Other dev env problems
Dusty is our attempt to use containers to solve the problems with dev
environments. It’s written in Python and uses Docker and existing tools from
the Docker ecosystem.
Dusty provides a solution which is:
1. Isolated. Dusty uses containers to achieve the right level of isolation, keeping things
simple.
2. Scalable. The 100th app you run in Dusty is as easy as the 1st.
3. Specialized. Dusty solves problems specific to a dev environment, like
running tests against isolated database containers.
• Usability
• Sharing code
• Restarting when
code changes
• Sharing tests
• Navigating through
containerland
• Mix and Match
Dusty allows us to define groups of containers which can be toggled on or off
separately. We can tell Dusty how to run hundreds of services, but if we only
need one right now, it only bothers running the containers for that one. This
mix and match lets us be efficient with our resources and scale out Dusty to
support stacks of any size.
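The mix-and-match idea can be sketched with a few lines of Python. The group names, container names, and the `containers_to_run` helper below are hypothetical and are not Dusty's real configuration format or API.

```python
# Hypothetical data model for illustration; not Dusty's real config format.
GROUPS = {
    "web": {"web-app", "postgres"},
    "search": {"search-api", "elasticsearch"},
    "billing": {"billing-app", "postgres"},
}


def containers_to_run(groups, active):
    """Return the union of containers needed by the groups that are toggled on."""
    needed = set()
    for name in active:
        needed |= groups[name]
    return needed


print(sorted(containers_to_run(GROUPS, ["web"])))
print(sorted(containers_to_run(GROUPS, ["web", "billing"])))
```

Groups that are switched off cost nothing, and a container shared by two active groups is only started once.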
22. When Dusty runs your containers, the plumbing needed to navigate the
layers of isolation between your normal operating space (your local
filesystem, etc) and the containers is set up for you. Your code gets
seamlessly mounted into the container. You can make a request in your
browser and Dusty will pipe it through the VM and into the container just the
way you need.
We’re developers, we think in code as often as we do in processes. To reflect
this, Dusty doesn’t just know about your containers, it also knows about your
repos. If you make a change in a repo, you can ask Dusty to make sure all
containers using that repo get restarted to pick up the new code. This is also
really easy to automate with a tool like watchdog.
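A minimal sketch of the repo-to-container lookup behind this idea follows. The mapping, the names, and the `containers_to_restart` helper are assumptions for illustration, not Dusty's actual implementation. A file watcher such as watchdog could call a function like this whenever it sees a change.

```python
def containers_to_restart(repos_by_container, changed_repo):
    """Return the containers that use changed_repo, in a stable sorted order."""
    return sorted(
        name
        for name, repos in repos_by_container.items()
        if changed_repo in repos
    )


# Hypothetical mapping from container name to the repos whose code it runs.
REPOS_BY_CONTAINER = {
    "web-app": {"web", "shared-lib"},
    "worker": {"jobs", "shared-lib"},
    "postgres": set(),
}

print(containers_to_restart(REPOS_BY_CONTAINER, "shared-lib"))
print(containers_to_restart(REPOS_BY_CONTAINER, "jobs"))
```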
Dusty has first class support for the awesome container testing flow we
talked about earlier. You can define a test script, give it a set of service
containers to run against, and Dusty will handle rigging up the test harness
for you. When it’s done, all the containers involved are cleaned up. Each
time, you start fresh. This whole process takes about 5 seconds right now,
and we’re still working to make it even faster.
23. Dusty, along with the other tools in the Docker ecosystem, is going to help us
manage away the pain of using containers and help us realize all the promise
we saw earlier in the talk.
Outro on technical pieces.
Containers give us an opportunity to address a lot of the problems with
existing local-only and VM-based dev environments. Docker and other
container ecosystems are emerging and maturing, making it easier to get to a
workable solution. Dusty and other projects are creating end-to-end
solutions to make the dream of non-Hellish dev environments a reality. But
we aren’t there yet.
The night is dark, we have problems and a lot of work to do, and the Python
community can help to solve this problem. Call to action for contributors and
for people to start their own projects trying to solve the problem.