Copyright(c)2020 NTT Corp. All Rights Reserved
The overview of lazypull with
containerd Remote Snapshotter & Stargz Snapshotter
Kohei Tokunaga, Akihiro Suda / NTT
Copyright(c)2020 NTT Corp. All Rights Reserved
Pull is time-consuming
pulling packages accounts for 76% of container start time,
but only 6.4% of that data is read [Harter et al. 2016]
[Harter et al. 2016] Tyler Harter, Brandon Salmon, Rose Liu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. "Slacker: Fast Distribution
with Lazy Docker Containers". 14th USENIX Conference on File and Storage Technologies (FAST ’16). February 22–25, 2016, Santa Clara, CA,
USA
Caching images
Minimizing image size
Cold start is still slow
Not all images are minimizable
Language runtimes, frameworks, etc.
Workarounds are known but not enough
NodeRegistry
Image Container
pull run
Copyright(c)2020 NTT Corp. All Rights Reserved
Normal image pulling
Extracting layers and
merging them as a rootfs
Node
pro
cess
containerDownloading manifest
manifest
GET /v2/<image>/blobs/
GET /v2/<image>/manifests/
Downloading tarball layers
specified by the manifest
Container registry
Container Runtime
l In OCI/Docker standards, an image is a set of tarball layers (+gzip compression)
Ø We need to download and scan the entire blob even for getting single file entry in the image
l During pull, container runtime downloads the entire image layers from registry
• => takes long for large images
e.g.
Copyright(c)2020 NTT Corp. All Rights Reserved
Skipping pull using Remote Snapshotter plugin
Arbitrary remote store
pro
cess
container
Remote
Snapshotter
NodeContainer registry
Downloading manifest
l A type of plugin introduced in containerd 1.4.0
l Extracted layers (snapshots) can be prepared by the plugin (e.g. by mounting remotely)
• Layers not stored in the remote store are pulled from registry
Remote Snapshotter implementations in community
l https://github.com/containerd/stargz-snapshotter
l https://github.com/cvmfs/containerd-remote-snapshotter
Providing snapshots as mount points
Copyright(c)2020 NTT Corp. All Rights Reserved
Lazy pull using Stargz Snapshotter plugin
l Non-core subproject of containerd: https://github.com/containerd/stargz-snapshotter
l Allows containerd to lazy pull stargz/eStargz image from standard registry
Ø Images are standard-compatible so still pullable & runnable by legacy runtimes
l Containerd doesn’t download the image on pull operation but fetches chunks on-demand
pro
cess
container
Node
Stargz
Snapshotter
Mounting rootfs as FUSE
Lazy
pull
Manifest
GET /v2/<image>/manifests/
GET /v2/<image>/blobs/
stargz
eStargz
Container registry
Downloading manifest
Copyright(c)2020 NTT Corp. All Rights Reserved
Stargz and eStargz layers
stargz eStargz
bin/ls
usr/bin/apt
entrypoint.sh
bin/bash
bin/ls
usr/bin/apt
entrypoint.sh
bin/bash Prioritized files
Fetched by a single Range Request
TOCEntriesTOCEntries
Files fetched on demand
also aggressively download in
background
l Stargz: an archive format proposed by Google CRFS project: https://github.com/google/crfs
Ø Seekable but still valid targz = usable as a valid OCI/Docker image layer
l eStargz enables to prefetch files that are likely accessed during runtime (= prioritized files)
Ø Locally accessible to prioritized files, avoiding NW-related overheads
gzip member
per regular file
Seekable
It can be extracted per-file
using HTTP Range Request
Time to take for container startup
0 5 10 15 20 25
estargz
stargz
legacy
python:3.7 (print “hello”)
pull create run [sec]
Waits for prefetch completion
Credit to Akihiro Suda (NTT) for discussion and experiment environment
Slide from the presentation at KubeCon EU 2020 Virtual
Measured on EC2 Oregon (m5.2xlarge, Ubuntu 20.04) & Docker Hub
https://www.slideshare.net/KoheiTokunaga/startup-containers-in-lightning-speed-with-lazy-image-distribution
Time to take for container startup
0 5 10 15 20 25 30
estargz
stargz
legacy
gcc:9.2.0 (compiles and runs printf(“hello”);)
pull create run [sec]
Credit to Akihiro Suda (NTT) for discussion and experiment environment
Slide from the presentation at KubeCon EU 2020 Virtual
Measured on EC2 Oregon (m5.2xlarge, Ubuntu 20.04) & Docker Hub
https://www.slideshare.net/KoheiTokunaga/startup-containers-in-lightning-speed-with-lazy-image-distribution
Time to take for container startup
0 5 10 15 20 25
estargz
stargz
legacy
glassfish:4.1-jdk8 (runs until “Running GlassFish” is printed)
pull create run [sec]
Credit to Akihiro Suda (NTT) for discussion and experiment environment
Slide from the presentation at KubeCon EU 2020 Virtual
Measured on EC2 Oregon (m5.2xlarge, Ubuntu 20.04) & Docker Hub
https://www.slideshare.net/KoheiTokunaga/startup-containers-in-lightning-speed-with-lazy-image-distribution
Copyright(c)2020 NTT Corp. All Rights Reserved
Discussion in community
l Builkit : Supports stargz snapshotter for lazy pulling base images
l https://github.com/moby/buildkit/pull/1402
l Still requires to spawn stargz snapshotter daemon along with buildkitd
l Discussion towards “built-in” support is in progress
l https://github.com/moby/buildkit/pull/1666
l Moby(Docker) : Discussion in progress towards remote snapshotter support
l https://github.com/moby/moby/pull/41002
l Needs more people joining the discussion
l Podman/CRI-O : Discussion in progress towards remote snapshotter support
l https://github.com/containers/podman/issues/4739
l CernVM-FS is working on integration with Podman
l https://github.com/cvmfs/cvmfs/pull/2582

The overview of lazypull with containerd Remote Snapshotter & Stargz Snapshotter

  • 1.
    Copyright(c)2020 NTT Corp.All Rights Reserved The overview of lazypull with containerd Remote Snapshotter & Stargz Snapshotter Kohei Tokunaga, Akihiro Suda / NTT
  • 2.
    Copyright(c)2020 NTT Corp.All Rights Reserved Pull is time-consuming pulling packages accounts for 76% of container start time, but only 6.4% of that data is read [Harter et al. 2016] [Harter et al. 2016] Tyler Harter, Brandon Salmon, Rose Liu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. "Slacker: Fast Distribution with Lazy Docker Containers". 14th USENIX Conference on File and Storage Technologies (FAST ’16). February 22–25, 2016, Santa Clara, CA, USA Caching images Minimizing image size Cold start is still slow Not all images are minimizable Language runtimes, frameworks, etc. Workarounds are known but not enough NodeRegistry Image Container pull run
  • 3.
    Copyright(c)2020 NTT Corp.All Rights Reserved Normal image pulling Extracting layers and merging them as a rootfs Node pro cess containerDownloading manifest manifest GET /v2/<image>/blobs/ GET /v2/<image>/manifests/ Downloading tarball layers specified by the manifest Container registry Container Runtime l In OCI/Docker standards, an image is a set of tarball layers (+gzip compression) Ø We need to download and scan the entire blob even for getting single file entry in the image l During pull, container runtime downloads the entire image layers from registry • => takes long for large images e.g.
  • 4.
    Copyright(c)2020 NTT Corp.All Rights Reserved Skipping pull using Remote Snapshotter plugin Arbitrary remote store pro cess container Remote Snapshotter NodeContainer registry Downloading manifest l A type of plugin introduced in containerd 1.4.0 l Extracted layers (snapshots) can be prepared by the plugin (e.g. by mounting remotely) • Layers not stored in the remote store are pulled from registry Remote Snapshotter implementations in community l https://github.com/containerd/stargz-snapshotter l https://github.com/cvmfs/containerd-remote-snapshotter Providing snapshots as mount points
  • 5.
    Copyright(c)2020 NTT Corp.All Rights Reserved Lazy pull using Stargz Snapshotter plugin l Non-core subproject of containerd: https://github.com/containerd/stargz-snapshotter l Allows containerd to lazy pull stargz/eStargz image from standard registry Ø Images are standard-compatible so still pullable & runnable by legacy runtimes l Containerd doesn’t download the image on pull operation but fetches chunks on-demand pro cess container Node Stargz Snapshotter Mounting rootfs as FUSE Lazy pull Manifest GET /v2/<image>/manifests/ GET /v2/<image>/blobs/ stargz eStargz Container registry Downloading manifest
  • 6.
    Copyright(c)2020 NTT Corp.All Rights Reserved Stargz and eStargz layers stargz eStargz bin/ls usr/bin/apt entrypoint.sh bin/bash bin/ls usr/bin/apt entrypoint.sh bin/bash Prioritized files Fetched by a single Range Request TOCEntriesTOCEntries Files fetched on demand also aggressively download in background l Stargz: an archive format proposed by Google CRFS project: https://github.com/google/crfs Ø Seekable but still valid targz = usable as a valid OCI/Docker image layer l eStargz enables to prefetch files that are likely accessed during runtime (= prioritized files) Ø Locally accessible to prioritized files, avoiding NW-related overheads gzip member per regular file Seekable It can be extracted per-file using HTTP Range Request
  • 7.
    Time to takefor container startup 0 5 10 15 20 25 estargz stargz legacy python:3.7 (print “hello”) pull create run [sec] Waits for prefetch completion Credit to Akihiro Suda (NTT) for discussion and experiment environment Slide from the presentation at KubeCon EU 2020 Virtual Measured on EC2 Oregon (m5.2xlarge, Ubuntu 20.04) & Docker Hub https://www.slideshare.net/KoheiTokunaga/startup-containers-in-lightning-speed-with-lazy-image-distribution
  • 8.
    Time to takefor container startup 0 5 10 15 20 25 30 estargz stargz legacy gcc:9.2.0 (compiles and runs printf(“hello”);) pull create run [sec] Credit to Akihiro Suda (NTT) for discussion and experiment environment Slide from the presentation at KubeCon EU 2020 Virtual Measured on EC2 Oregon (m5.2xlarge, Ubuntu 20.04) & Docker Hub https://www.slideshare.net/KoheiTokunaga/startup-containers-in-lightning-speed-with-lazy-image-distribution
  • 9.
    Time to takefor container startup 0 5 10 15 20 25 estargz stargz legacy glassfish:4.1-jdk8 (runs until “Running GlassFish” is printed) pull create run [sec] Credit to Akihiro Suda (NTT) for discussion and experiment environment Slide from the presentation at KubeCon EU 2020 Virtual Measured on EC2 Oregon (m5.2xlarge, Ubuntu 20.04) & Docker Hub https://www.slideshare.net/KoheiTokunaga/startup-containers-in-lightning-speed-with-lazy-image-distribution
  • 10.
    Copyright(c)2020 NTT Corp.All Rights Reserved Discussion in community l Builkit : Supports stargz snapshotter for lazy pulling base images l https://github.com/moby/buildkit/pull/1402 l Still requires to spawn stargz snapshotter daemon along with buildkitd l Discussion towards “built-in” support is in progress l https://github.com/moby/buildkit/pull/1666 l Moby(Docker) : Discussion in progress towards remote snapshotter support l https://github.com/moby/moby/pull/41002 l Needs more people joining the discussion l Podman/CRI-O : Discussion in progress towards remote snapshotter support l https://github.com/containers/podman/issues/4739 l CernVM-FS is working on integration with Podman l https://github.com/cvmfs/cvmfs/pull/2582