Nisha introduces Tern, a utility for software package introspection in containers. This tool gives administrators the same level of confidence in what's in their containers as they currently have with VM images, including for compliance audits, bills of materials, and exploit detection. Nisha is the primary author of Tern.
https://github.com/vmware/tern
All container builders depend on the underlying Linux kernel’s storage drivers. A builder calls the storage driver’s API to implement a union mount of various filesystem layers.
An image starts with what is essentially a Linux filesystem consisting of bin, boot, etc, root, and so on
Then at runtime, a thin copy-on-write layer is created on top, and the container builder uses it to copy in files, untar filesystem bundles or invoke commands
Copy on write means that a file in a lower layer gets copied to the top layer only when you write to it. What gets stored in the new layer is the newly modified file.
Once files are installed, the copy-on-write layer becomes the new diff layer. With overlay2, the storage driver I am familiar with, the copy-on-write layer is just an empty directory. Once that directory gets populated, it is kept
Now that layer becomes the next layer and a new copy on write layer is created for the next addition of files
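The layering steps above can be sketched in a few lines of Python. This is only a toy model, assuming each layer is a dictionary from path to file contents; the names `merged_view` and `run_step` are made up for illustration and are not part of any container tool.

```python
from typing import Callable, Dict, List

Layer = Dict[str, str]  # hypothetical model: path -> file contents


def merged_view(layers: List[Layer]) -> Layer:
    """Union of all layers; later (upper) layers shadow earlier ones."""
    view: Layer = {}
    for layer in layers:
        view.update(layer)
    return view


def run_step(layers: List[Layer], step: Callable[[Layer], Layer]) -> List[Layer]:
    """Run one build step on top of the existing layers.

    `step` sees the merged view and returns only the files it created or
    modified. That diff starts out empty, gets populated, and is then kept
    as the next layer.
    """
    upper = step(merged_view(layers))
    return layers + [upper]


base = {"/etc/os-release": "debian", "/bin/sh": "dash"}
layers = [base]
# Step 1: add a new file. Only that file lands in the new layer.
layers = run_step(layers, lambda view: {"/app/main": "binary"})
# Step 2: modify a file from the base layer. The modified copy lands in
# the new layer; the base layer itself is untouched.
layers = run_step(
    layers, lambda view: {"/etc/os-release": view["/etc/os-release"] + "-patched"}
)

for number, layer in enumerate(layers):
    print(number, sorted(layer))
```

Each layer holds only its own diff, which is why the final image is just a stack of directory trees.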
When you’re done you have a container image, which is just a collection of directory trees
When you push this image to a registry, the directory trees get tarred and checksummed
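As a rough illustration of "tarred and checksummed", here is a minimal sketch using only the Python standard library. It assumes a SHA-256 digest over an uncompressed tar archive; real builders may also compress the archive and record more metadata, and the function name `tar_and_checksum` is hypothetical.

```python
import hashlib
import io
import os
import tarfile
import tempfile
from typing import Tuple


def tar_and_checksum(directory: str) -> Tuple[bytes, str]:
    """Tar one layer's directory tree in memory and return (archive, digest)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        # arcname="." stores paths relative to the layer root
        tar.add(directory, arcname=".")
    data = buf.getvalue()
    return data, "sha256:" + hashlib.sha256(data).hexdigest()


# Demo with a throwaway directory standing in for one layer
with tempfile.TemporaryDirectory() as layer_dir:
    with open(os.path.join(layer_dir, "hello.txt"), "w") as f:
        f.write("hello from a layer\n")
    archive, digest = tar_and_checksum(layer_dir)

print(digest)
```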
A container image is just files with some metadata on how to set them up
To containerize your app, you will download this image - which comes with all of these files (made by somebody else)
You will then copy your app in and run scripts to set it up
When you push this image you are pushing the whole thing - your changes and somebody else’s changes
In fact you can find this information, given enough time and effort searching through Dockerhub and git
This image, golang:1.11, is built on top of a buildpack-deps image, which is built on top of another buildpack-deps image, which is finally built on top of a debian:stretch image
I have included links to the Dockerfiles that created these images, but we are very lucky here, as most Dockerhub images do not have links to the Dockerfiles that created them
Even now OCI images have no way of tracking provenance of an image. This information is opaque
Right off the bat you can see that tracking where container images come from is hard
- Let’s assume we have some go application we would like to containerize
- This Dockerfile uses the builder pattern, which means using one container to build the golang code and another, smaller container to ship the golang binary
- The build container has golang dependencies, which include the golang standard library and whatever build dependencies are required
Since Go typically produces statically linked binaries, you need to be extra careful with your binaries’ dependencies: they are compiled into the binary itself.
Even though you are not distributing the golang image, you are copying a binary with statically compiled code into another container and possibly getting rid of the container you used to build it
For compliance purposes, you need to know the whole dependency chain. The Dockerfile is not enough
And this is why OSS compliance is much harder in containers
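To make the point about the builder pattern concrete, here is a small sketch that reads a multi-stage Dockerfile and lists every image that contributes to what is shipped. The Dockerfile text is an assumption made up for this example (only golang:1.11 appears in these notes), and the parsing is deliberately naive: it handles plain `FROM image AS name` and `COPY --from=name` lines only.

```python
import re
from typing import Dict, List

# Hypothetical builder-pattern Dockerfile, for illustration only
DOCKERFILE = """\
FROM golang:1.11 AS builder
WORKDIR /src
COPY . .
RUN go build -o /out/app .

FROM alpine:3.8
COPY --from=builder /out/app /usr/local/bin/app
"""


def parse_stages(dockerfile: str) -> List[Dict]:
    """Return one record per build stage: base image, name, and stages copied from."""
    stages: List[Dict] = []
    for raw in dockerfile.splitlines():
        line = raw.strip()
        m = re.match(r"FROM\s+(\S+)(?:\s+AS\s+(\S+))?", line, re.IGNORECASE)
        if m:
            stages.append({"image": m.group(1), "name": m.group(2), "copies_from": []})
            continue
        m = re.match(r"COPY\s+--from=(\S+)", line, re.IGNORECASE)
        if m and stages:
            stages[-1]["copies_from"].append(m.group(1))
    return stages


def contributing_images(stages: List[Dict]) -> List[str]:
    """Images whose contents can end up in the final, shipped stage."""
    by_name = {s["name"]: s["image"] for s in stages if s["name"]}
    final = stages[-1]
    return [final["image"]] + [by_name.get(n, n) for n in final["copies_from"]]


stages = parse_stages(DOCKERFILE)
print(contributing_images(stages))
```

Even this only finds the images named in the Dockerfile. It says nothing about what those images were built from, which is the part the Dockerfile cannot tell you.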
You can also do this using overlay
overlay2 requires kernel 4.0 or above, and the same kind of overlay mount is available to you through the mount API
Incidentally these are pretty much the same steps that are used when building a container
Only here, you are retracing your steps layer by layer
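Retracing the layers with an overlay mount could look roughly like the sketch below. It only builds the `mount` command line and does not run it, since mounting needs root. The directory paths and the helper name `overlay_mount_cmd` are assumptions for illustration. The one detail worth noting is that OverlayFS expects `lowerdir` entries ordered from topmost to bottommost, so a bottom-up layer list has to be reversed.

```python
from typing import List


def overlay_mount_cmd(
    layers_bottom_up: List[str], upper: str, work: str, target: str
) -> List[str]:
    """Build (but do not run) a mount command that stacks image layers.

    `layers_bottom_up` lists layer directories in the order they were
    created. `upper` and `work` are empty directories for the writable
    copy-on-write layer and OverlayFS's scratch space.
    """
    # In lowerdir=, the leftmost entry is the topmost layer
    lowerdir = ":".join(reversed(layers_bottom_up))
    opts = f"lowerdir={lowerdir},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", opts, target]


# Hypothetical paths to three unpacked layers
cmd = overlay_mount_cmd(
    ["/tmp/img/layer1", "/tmp/img/layer2", "/tmp/img/layer3"],
    upper="/tmp/img/upper",
    work="/tmp/img/work",
    target="/tmp/img/merged",
)
print(" ".join(cmd))
```

To retrace the build one layer at a time, the same helper can be called with only the first layer, then the first two, and so on.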
Tern analyzes the container layer by layer by running commands in a chroot environment. It retrieves those commands from a ‘command library’, which is essentially a list of binary names and corresponding scripts to recover package versions and licenses for whatever was installed using that binary
As a result, the architecture is extensible for OS package managers and language package managers. Even for stuff you download using git.
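The idea of a command library can be sketched as a plain lookup table. This is not Tern's actual data format or code; the structure, the key name `list_packages`, and the function `scripts_for` are all assumptions made for illustration. The sketch maps the binary that populated a layer to a shell snippet that would list installed packages with their versions when run inside the chroot.

```python
import shlex
from typing import Dict

# Hypothetical command library: binary name -> scripts to run in the chroot.
# A fuller version would carry further keys, for example for licenses.
COMMAND_LIBRARY: Dict[str, Dict[str, str]] = {
    "apt-get": {
        "list_packages": r"dpkg-query -W -f '${Package} ${Version}\n'",
    },
    "apk": {
        "list_packages": "apk info -v",
    },
}


def scripts_for(layer_command: str) -> Dict[str, Dict[str, str]]:
    """Find library entries for the binaries invoked by a layer's command."""
    found: Dict[str, Dict[str, str]] = {}
    for token in shlex.split(layer_command):
        if token in COMMAND_LIBRARY:
            found[token] = COMMAND_LIBRARY[token]
    return found


matches = scripts_for("apt-get update && apt-get install -y curl")
for binary, scripts in matches.items():
    print(binary, "->", scripts["list_packages"])
```

Supporting another package manager then means adding one more entry to the table rather than changing the analysis code.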
That means I am working on this on the weekdays and sometimes the weekends. It’s a young project so not many