Images maintained by a reputable organization or an individual are often considered to be trustworthy; however, it is hard to deny the possibility that they might have silently injected malicious codes that are not present in the source repo. Also, even if they have no malicious intent, their images can still be compromised on an accidental leakage of registry credentials.
The latest release of BuildKit solves this supply chain security concern with reproducible builds. Reproducible builds is a technique to ensure that a bit-for-bit identical image can be reproduced from its source code, by anybody, at any time. When multiple actors can attest to an image's reproducibility, it signifies that the image contains no code of a secret origin.
Audiences of this talk will learn how they can and how sometimes they cannot make their images reproducible to improve their trust.
Faster Container Image Distribution on a Variety of Tools with Lazy PullingKohei Tokunaga
Talked at KubeCon + CloudNativeCon North America 2021 Virtual about lazy pulling of container images with eStargz and nydus (October 14, 2021).
https://kccncna2021.sched.com/event/lV2a
Faster Container Image Distribution on a Variety of Tools with Lazy PullingKohei Tokunaga
Talked at KubeCon + CloudNativeCon North America 2021 Virtual about lazy pulling of container images with eStargz and nydus (October 14, 2021).
https://kccncna2021.sched.com/event/lV2a
CloudNative Days Spring 2021 ONLINE キーノートでの発表資料です。
https://event.cloudnativedays.jp/cndo2021/talks/1071
本セッションでは、DockerとKubernetesのもつ基本的な機能の概要を、コンテナの仕組みをふまえつつイラストを用いて紹介していきます。一般にあまり焦点をあてて取り上げられることは多くありませんが、コンテナの作成や管理を担う低レベルなソフトウェア「コンテナランタイム」も本セッションの中心的なトピックのひとつです。
本セッションは、拙著「イラストで分かるDockerとKubernetes」(技術評論社)の内容を参考にしています。
https://www.amazon.co.jp/dp/4297118378
CloudNative Days Spring 2021 ONLINE キーノートでの発表資料です。
https://event.cloudnativedays.jp/cndo2021/talks/1071
本セッションでは、DockerとKubernetesのもつ基本的な機能の概要を、コンテナの仕組みをふまえつつイラストを用いて紹介していきます。一般にあまり焦点をあてて取り上げられることは多くありませんが、コンテナの作成や管理を担う低レベルなソフトウェア「コンテナランタイム」も本セッションの中心的なトピックのひとつです。
本セッションは、拙著「イラストで分かるDockerとKubernetes」(技術評論社)の内容を参考にしています。
https://www.amazon.co.jp/dp/4297118378
Introduction to Docker - Learning containerization XP conference 2016XP Conference India
Docker is an open-source platform which provides a great way to package and deploy applications. With its lightweight resource consumption pattern, it helps in making CI/CD environments faster and predictable. Learn how to setup Docker and deploy a basic web application.
Docker is an open-source project to easily create lightweight, portable, self-sufficient containers from any application. The same container that a developer builds and tests on a laptop can run at scale, in production, on VMs, bare metal, OpenStack clusters, public clouds and more.
Get hands-on with security features and best practices to protect your containerized services. Learn to push and verify signed images with Docker Content Trust, and collaborate with delegation roles. Intermediate to advanced level Docker experience recommended, participants will be building and pushing with Docker during the workshop.
Led By Docker Security Experts:
Riyaz Faizullabhoy
David Lawrence
Viktor Stanchev
Experience Level: Intermediate to advanced level Docker experience recommended
Preparation study for Docker Event
Mulodo Open Study Group (MOSG) @Ho chi minh, Vietnam
http://www.meetup.com/Open-Study-Group-Saigon/events/229781420/
Настройка окружения для кросскомпиляции проектов на основе docker'acorehard_by
Как быстро и легко настраивать/обновлять окружения для кросскомпиляции проектов под различные платформы(на основе docker), как быстро переключаться между ними, как используя эти кирпичики организовать CI и тестирование(на основе GitLab и Docker).
Overview of Docker 1.11 features(Covers Docker release summary till 1.11, runc/containerd, dns load balancing ipv6 service discovery, labels, macvlan/ipvlan)
JDO 2019: Tips and Tricks from Docker Captain - Łukasz LachPROIDEA
This session covers a bunch of tips and tricks for getting the most out of Docker. The tips were inspired by suggestions, blogs, and presentations and everyday challenges encountered by other Docker Captains but also the members of the Docker community. Come and see the unobvious and unexpected in terms of orchestration, image creation and management, also networking and volumes!
Build and Run Containers With Lazy Pulling - Adoption status of containerd St...Kohei Tokunaga
Talked about lazy pulling of container images with eStargz and Stargz Snapshotter at FOSDEM 2021.
Details: https://fosdem.org/2021/schedule/event/containers_lazy_pull/
Stargz Snapshotter: https://github.com/containerd/stargz-snapshotter
DCSF 19 Building Your Development Pipeline Docker, Inc.
Oliver Pomeroy, Docker & Laura Tacho, Cloudbees
Enterprises often want to provide automation and standardisation on top of their container platform, using a pipeline to build and deploy their containerized applications. However this opens up new challenges; Do I have to build a new CI/CD Stack? Can I build my CI/CD pipeline with Kubernetes orchestration? What should my build agents look like? How do I integrate my pipeline into my enterprise container registry? In this session full of examples and how-to's, Olly and Laura will guide you through common situations and decisions related to your pipelines. We'll cover building minimal images, scanning and signing images, and give examples on how to enforce compliance standards and best practices across your teams.
Introduction to InSpec and 1.0 release updateAlex Pop
Contains an introduction to infrastructure and compliance tests as code and how InSpec can be used for this.
Agenda:
* Why infrastructure tests as code
* What is InSpec and how it works
* Core and custom resources
* What's new in InSpec 1.0 (released Sept 26, 2016)
* Documentation and installation
* Integrations
* Demo
* Chef Community Summit
Similar to [DockerCon 2023] Reproducible builds with BuildKit for software supply chain security (20)
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...Akihiro Suda
Rootless mode is a technique to harden containers by running the container engine as a non-root user. The support for rootless mode has been merged into Docker since v19.03 (2019) and in Kubernetes since v1.22 (2021). However, setting up Rootless Kubernetes has been more challenging than setting up Rootless Docker due to its complexity. This session presents Usernetes Generation 2, a Kubernetes distribution that wraps Kubernetes in Rootless Docker for ease of setting up multi-node Rootless Kubernetes clusters. Unlike the original Usernetes (Generation 1) that was based on "Kubernetes The Hard Way", Usernetes Generation 2 supports kubeadm. Usernetes Generation 2 is similar to `kind` and `minikube`, however, unlike them Usernetes Generation 2 supports forming real multi-node clusters using Flannel (VXLAN) and it can be potentially used for production clusters. https://github.com/rootless-containers/usernetes
https://github.com/rootless-containers/usernetes
Usernetes (Gen2) deploys a Kubernetes cluster inside Rootless Docker, so as to mitigate potential container-breakout vulnerabilities.
Usernetes (Gen2) is similar to Rootless kind and Rootless minikube, but Usernetes (Gen 2) supports creating a cluster with multiple hosts.
The internals and the latest trends of container runtimesAkihiro Suda
Containers are a set of various lightweight methods to isolate filesystems, CPU resources, memory resources, system permissions, etc. Containers are similar to virtual machines in many senses, but they are more efficient and often less secure. This talk roughly consists of the following three parts:
1. Introduction to containers and how they spread in the last decade
2. Internals of container runtimes: namespaces, cgroups, capabilities, seccomp, etc.
3. Latest trends: Non-Docker containers, User Namespaces, Rootless Containers, Kata Containers, gVisor, WebAssembly, etc.
http://www.cce.i.kyoto-u.ac.jp/danwa23.html
[Container Plumbing Days 2023] Why was nerdctl made?Akihiro Suda
nerdctl (contaiNERD CTL) was made to facilitate development of new technologies in the containerd platform.
Such technologies include:
- Lazy-pulling with Stargz/Nydus/OverlayBD
- P2P image distribution with IPFS
- Image encryption with OCIcrypt
- Image signing with Cosign
- “Real” read-only mounts with mount_setattr
- Slirp-less rootless containers with bypass4netns
- Interactive debugging of Dockerfiles, with buildg
nerdctl is also useful for debugging Kubernetes nodes that are running containerd.
Through this session, the audiences will learn these functionalities of nerdctl, relevant projects, and the roadmap for the future.
https://containerplumbing.org/sessions/2023/why_was_nerdctl_
[KubeCon EU 2022] Running containerd and k3s on macOSAkihiro Suda
https://sched.co/ytpi
It has been very hard to use Mac for developing containerized apps. A typical way is to use Docker for Mac, but it is not FLOSS. Another option is to install Docker and/or Kubernetes into VirtualBox, often via minikube, but it doesn't propagate localhost ports, and VirtualBox also doesn't support the ARM architecture. This session will show how to run containerd and k3s on macOS, using Lima and Rancher Desktop. Lima wraps QEMU in a simple CLI, with neat features for container users, such as filesystem sharing and automatic localhost port forwarding, as well as DNS and proxy propagation for enterprise networks. Rancher Desktop wraps Lima with k3s integration and GUI.
[KubeCon EU 2021] Introduction and Deep Dive Into ContainerdAkihiro Suda
Join containerd maintainers and reviewers in a combined introduction and deep dive session. They will discuss the overview and the recent updates of containerd as well as how it is being used by Kubernetes, Docker and other container-based systems. The brief introduction about its architecture and service design will be included. The talk will also deep dive into how to leverage contained by extending and customizing it for your use case with low-level plugins like remote snapshotters, as well as by implementing your own containerd client. Upcoming features and recent discussion in containerd community will also be covered.
- - -
https://kccnceu2021.sched.com/event/iE6v/introduction-and-deep-dive-into-containerd-kohei-tokunaga-akihiro-suda-ntt-corporation?iframe=no
Large Language Models and the End of ProgrammingMatt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Atelier - Innover avec l’IA Générative et les graphes de connaissancesNeo4j
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Allez au-delà du battage médiatique autour de l’IA et découvrez des techniques pratiques pour utiliser l’IA de manière responsable à travers les données de votre organisation. Explorez comment utiliser les graphes de connaissances pour augmenter la précision, la transparence et la capacité d’explication dans les systèmes d’IA générative. Vous partirez avec une expérience pratique combinant les relations entre les données et les LLM pour apporter du contexte spécifique à votre domaine et améliorer votre raisonnement.
Amenez votre ordinateur portable et nous vous guiderons sur la mise en place de votre propre pile d’IA générative, en vous fournissant des exemples pratiques et codés pour démarrer en quelques minutes.
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
OpenMetadata Community Meeting - 5th June 2024OpenMetadata
The OpenMetadata Community Meeting was held on June 5th, 2024. In this meeting, we discussed about the data quality capabilities that are integrated with the Incident Manager, providing a complete solution to handle your data observability needs. Watch the end-to-end demo of the data quality features.
* How to run your own data quality framework
* What is the performance impact of running data quality frameworks
* How to run the test cases in your own ETL pipelines
* How the Incident Manager is integrated
* Get notified with alerts when test cases fail
Watch the meeting recording here - https://www.youtube.com/watch?v=UbNOje0kf6E
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
Code reviews are vital for ensuring good code quality. They serve as one of our last lines of defense against bugs and subpar code reaching production.
Yet, they often turn into annoying tasks riddled with frustration, hostility, unclear feedback and lack of standards. How can we improve this crucial process?
In this session we will cover:
- The Art of Effective Code Reviews
- Streamlining the Review Process
- Elevating Reviews with Automated Tools
By the end of this presentation, you'll have the knowledge on how to organize and improve your code review proces
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
GraphSummit Paris - The art of the possible with Graph TechnologyNeo4j
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Łukasz Chruściel
No one wants their application to drag like a car stuck in the slow lane! Yet it’s all too common to encounter bumpy, pothole-filled solutions that slow the speed of any application. Symfony apps are not an exception.
In this talk, I will take you for a spin around the performance racetrack. We’ll explore common pitfalls - those hidden potholes on your application that can cause unexpected slowdowns. Learn how to spot these performance bumps early, and more importantly, how to navigate around them to keep your application running at top speed.
We will focus in particular on tuning your engine at the application level, making the right adjustments to ensure that your system responds like a well-oiled, high-performance race car.
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Mind IT Systems
Healthcare providers often struggle with the complexities of chronic conditions and remote patient monitoring, as each patient requires personalized care and ongoing monitoring. Off-the-shelf solutions may not meet these diverse needs, leading to inefficiencies and gaps in care. It’s here, custom healthcare software offers a tailored solution, ensuring improved care and effectiveness.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Understanding Nidhi Software Pricing: A Quick Guide 🌟
Choosing the right software is vital for Nidhi companies to streamline operations. Our latest presentation covers Nidhi software pricing, key factors, costs, and negotiation tips.
📊 What You’ll Learn:
Key factors influencing Nidhi software price
Understanding the true cost beyond the initial price
Tips for negotiating the best deal
Affordable and customizable pricing options with Vector Nidhi Software
🔗 Learn more at: www.vectornidhisoftware.com/software-for-nidhi-company/
#NidhiSoftwarePrice #NidhiSoftware #VectorNidhi
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Software Engineering, Software Consulting, Tech Lead, Spring Boot, Spring Cloud, Spring Core, Spring JDBC, Spring Transaction, Spring MVC, OpenShift Cloud Platform, Kafka, REST, SOAP, LLD & HLD.
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows seven steps methods for delivering their services to their customers. They called it the Software development life cycle process (SDLC).
Requirement — Collecting the Requirements is the first Phase in the SSLC process.
Feasibility Study — after completing the requirement process they move to the design phase.
Design — in this phase, they start designing the software.
Coding — when designing is completed, the developers start coding for the software.
Testing — in this phase when the coding of the software is done the testing team will start testing.
Installation — after completion of testing, the application opens to the live server and launches!
Maintenance — after completing the software development, customers start using the software.
Empowering Growth with Best Software Development Company in Noida - Deuglo
[DockerCon 2023] Reproducible builds with BuildKit for software supply chain security
1. Reproducible builds with BuildKit
for software supply chain security
Akihiro Suda
Software Engineer | NTT
2. 2
Background
• Security assessment has been hard for Docker images, due to lack of
verifiability in the software supply chain
• Even when the source code (Dockerfile) is public, and the source code
appears to be harmless, it is hard to prove that the image is actually
buildable from the source code
• Reproducible builds help proving it
(But whether the source code is harmless is another topic)
3. 3
What are Reproducible Builds?
• Same source, same binary
• Attestable by anybody
• Attestable at anytime
Build
FROM debian
RUN apt-get install -y gcc make ...
COPY . .
RUN make
sha256:6ea7098583cb6c9470570df28c154
cfec58e122188382cd4a7ceab8a9a79cb67
sha256:6ea7098583cb6c9470570df28c154
cfec58e122188382cd4a7ceab8a9a79cb67
sha256:6ea7098583cb6c9470570df28c154
cfec58e122188382cd4a7ceab8a9a79cb67
4. 4
Why do we need reproducible builds?
Because non-reproducible builds cannot be proved to be
buildable from harmless sources
5. 5
Why do we need reproducible builds?
sha256:AAAAA…
Upstream build
docker pull some-image Pull
Because non-reproducible builds cannot be proved to be
buildable from harmless sources
6. 6
Because non-reproducible builds cannot be proved to be
buildable from harmless sources
Why do we need reproducible builds?
FROM debian
RUN apt-get install -y gcc make ...
COPY . .
RUN make
sha256:AAAAA…
Upstream build
docker pull some-image Pull
Find the source repo
7. 7
Why do we need reproducible builds?
Build
FROM debian
RUN apt-get install -y gcc make ...
COPY . .
RUN make
sha256:AAAAA…
sha256:BBBBB…
Upstream build
docker pull some-image Pull
Find the source repo
Because non-reproducible builds cannot be proved to be
buildable from harmless sources
Your own build
(Non-reproducible)
8. 8
Why do we need reproducible builds?
Build
FROM debian
RUN apt-get install -y gcc make ...
COPY . .
RUN make
sha256:AAAAA…
sha256:BBBBB…
Upstream build
docker pull some-image Pull
Find the source repo
Your own build
(Non-reproducible)
Is this image
really buildable from
the source repo?
Because non-reproducible builds cannot be proved to be
buildable from harmless sources
9. 9
Why do we need reproducible builds?
Build
FROM debian
RUN apt-get install -y gcc make ...
COPY . .
RUN make
sha256:AAAAA…
sha256:AAAAA…
Upstream build
docker pull some-image Pull
Find the source repo
Proved to be
buildable
from the source
Because non-reproducible builds cannot be proved to be
buildable from harmless sources
Your own build
(Reproducible)
10. 10
Why do we need reproducible builds?
• Reproducibility per se doesn’t prove any harmlessness
• Non-reproducibility doesn’t prove any harmfulness, either
11. 11
Why do we need reproducible builds?
• Reproducibility proves that the image is actually buildable from its source
• The source still has to be reviewed
• The source may still be malicious
• But at least the image contains no secret code that you can never review
15. 15
Docker Hub images are actually reproducible?
$ docker pull golang:1.21.1-alpine@
sha256:96634e55b363cb93d39f78fb18aa64abc7f96d372c176660d7b8b6118939d97b
$ DOCKER_BUILDKIT=0
docker build -t my-golang
"https://github.com/docker-library/golang.git#
585c8c1e705a7a458455f0629922a4f90628ce08:1.21/alpine3.18”
$ go install github.com/reproducible-containers/diffoci/cmd/diffoci@latest
$ diffoci diff docker://golang:1.21.1-alpine docker://my-golang
DOCKER_BUILDKIT=0 with Docker 20.10.23
corresponds to the current Docker Hub image
(Will change in the future)
16. 16
Docker Hub images are actually reproducible?
$ docker pull golang:1.21.1-alpine@
sha256:96634e55b363cb93d39f78fb18aa64abc7f96d372c176660d7b8b6118939d97b
$ DOCKER_BUILDKIT=0
docker build -t my-golang
"https://github.com/docker-library/golang.git#
585c8c1e705a7a458455f0629922a4f90628ce08:1.21/alpine3.18”
$ go install github.com/reproducible-containers/diffoci/cmd/diffoci@latest
$ diffoci diff docker://golang:1.21.1-alpine docker://my-golang
DiffOCI: diff for Open Container Initiative (OCI) images
https://github.com/reproducible-containers/diffoci
17. 17
Docker Hub images are actually reproducible?
$ diffoci diff docker://golang:1.21.1-alpine docker://my-golang
TYPE NAME INPUT-0 INPUT-1
Desc application/vnd.docker.distribution.manifest.v2+json b25862... 3c4eca0...
...
File etc/ssl/certs/3e45d192.0 2023-08-09 03:36:47 +0000 UTC 2023-09-21 08:35:31 +0000 UTC
...
(More than 14,000 lines)
...
File go/ 2023-09-06 18:31:40 +0000 UTC 2023-09-21 08:35:45 +0000 UTC
18. 18
Docker Hub images are actually reproducible?
$ diffoci --semantic diff docker://golang:1.21.1-alpine docker://my-golang
TYPE NAME INPUT-0 INPUT-1
Layer ctx:/layers-1/layer length mismatch (457 vs 454)
Layer ctx:/layers-1/layer name "usr/local/share/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "etc/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "usr/share/ca-certificates/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar eef110e... e9bfe18...
Layer ctx:/layers-2/layer length mismatch (13939 vs 13938)
Layer ctx:/layers-2/layer name "usr/local/go/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar 60e22bb... 67f2648...
Layer ctx:/layers-3/layer length mismatch (4 vs 3)
Layer ctx:/layers-3/layer name "go/.wh..wh..opq" only appears in input 0
The “--semantic” flag ignores ”boring” differences (timestamps, file ordering, etc.)
19. 19
Docker Hub images are actually reproducible?
$ diffoci --semantic diff docker://golang:1.21.1-alpine docker://my-golang
TYPE NAME INPUT-0 INPUT-1
Layer ctx:/layers-1/layer length mismatch (457 vs 454)
Layer ctx:/layers-1/layer name "usr/local/share/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "etc/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "usr/share/ca-certificates/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar eef110e... e9bfe18...
Layer ctx:/layers-2/layer length mismatch (13939 vs 13938)
Layer ctx:/layers-2/layer name "usr/local/go/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar 60e22bb... 67f2648...
Layer ctx:/layers-3/layer length mismatch (4 vs 3)
Layer ctx:/layers-3/layer name "go/.wh..wh..opq" only appears in input 0
“.wh..wh..opq” (AUFS whiteouts) are missing due to the filesystem difference
The “--semantic” flag ignores ”boring” differences (timestamps, file ordering, etc.)
20. 20
Docker Hub images are actually reproducible?
$ diffoci --semantic diff docker://golang:1.21.1-alpine docker://my-golang
TYPE NAME INPUT-0 INPUT-1
Layer ctx:/layers-1/layer length mismatch (457 vs 454)
Layer ctx:/layers-1/layer name "usr/local/share/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "etc/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "usr/share/ca-certificates/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar eef110e... e9bfe18...
Layer ctx:/layers-2/layer length mismatch (13939 vs 13938)
Layer ctx:/layers-2/layer name "usr/local/go/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar 60e22bb... 67f2648...
Layer ctx:/layers-3/layer length mismatch (4 vs 3)
Layer ctx:/layers-3/layer name "go/.wh..wh..opq" only appears in input 0
“.wh..wh..opq” (AUFS whiteouts) are missing due to the filesystem difference
lib/apk/db/scripts.tar differ due to the timestamp information inside scripts.tar
(the “--semantic” flag isn’t still clever enough to ignore this “boring” difference”)
The “--semantic” flag ignores ”boring” differences (timestamps, file ordering, etc.)
21. 21
Docker Hub images are actually reproducible?
$ diffoci --semantic diff docker://golang:1.21.1-alpine docker://my-golang
TYPE NAME INPUT-0 INPUT-1
Layer ctx:/layers-1/layer length mismatch (457 vs 454)
Layer ctx:/layers-1/layer name "usr/local/share/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "etc/ca-certificates/.wh..wh..opq" only appears in input 0
Layer ctx:/layers-1/layer name "usr/share/ca-certificates/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar eef110e... e9bfe18...
Layer ctx:/layers-2/layer length mismatch (13939 vs 13938)
Layer ctx:/layers-2/layer name "usr/local/go/.wh..wh..opq" only appears in input 0
File lib/apk/db/scripts.tar 60e22bb... 67f2648...
Layer ctx:/layers-3/layer length mismatch (4 vs 3)
Layer ctx:/layers-3/layer name "go/.wh..wh..opq" only appears in input 0
“.wh..wh..opq” (AUFS whiteouts) are missing due to the filesystem difference
lib/apk/db/scripts.tar differ due to the timestamp information inside scripts.tar
(the “--semantic” flag isn’t still clever enough to ignore this “boring” difference”)
This image is not fully reproducible, but its non-reproducibility is explainable
(So, this image appears to be actually buildable from the public Dockerfile)
The “--semantic” flag ignores ”boring” differences (timestamps, file ordering, etc.)
22. 22
Why are images not reproducible?
• Timestamps
• Version of the base image (“FROM” images in Dockerfiles)
• Versions of the packages (apt-get, pip, etc.)
• Others:
- Filesystem characteristics (e.g., OverlayFS)
- Ordering of files
- Randomized mktemp, etc.
23. 23
Timestamps
• The images have timestamps in:
- the “created” property in the OCI Image Config
- the “history” property in the OCI Image Config
- the “org.opencontainers.image.created” annotation in the OCI Index
- the timestamps of the files in the image layers
OCI = Open Container Initiative
24. 24
Timestamps
• The images have timestamps in:
- the “created” property in the OCI Image Config
- the “history” property in the OCI Image Config
- the “org.opencontainers.image.created” annotation in the OCI Index
- the timestamps of the files in the image layers
OCI = Open Container Initiative
25. 25
Timestamps
• BuildKit (since v0.11) supports rewriting the timestamps for OCI Image Config
and OCI Index
• Support was incomplete in v0.11 and v0.12;
using v0.13 [beta] is recommended (see the next couple of slides)
OCI = Open Container Initiative
buildctl build --opt build-arg:SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
docker buildx build --build-arg SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
Unix epoch (int64, seconds from 1970-01-01 00:00:00 UTC)
26. 26
Timestamps
• The images have timestamps in:
- the “created” property in the OCI Image Config
- the “history” property in the OCI Image Config
- the “org.opencontainers.image.created” annotation in the OCI Index
- the timestamps of the files in the image layers
OCI = Open Container Initiative
27. 27
Timestamps
• BuildKit v0.13 [beta] supports rewriting the timestamps in the OCI image
layers too
• Docs: https://github.com/moby/buildkit/blob/master/docs/build-repro.md
OCI = Open Container Initiative
buildctl build --opt build-arg:SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
--output type=image,name=example.com/image,push=true,rewrite-timestamp=true
docker buildx build --build-arg SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
--output type=image,name=example.com/image,push=true,rewrite-timestamp=true
BuildKit v0.13 is still beta and its CLI is still subject to change until its GA
28. 28
Timestamps
• The SOURCE_DATE_EPOCH arg is also propagated to ”RUN” containers as an
environment variable
• The SOURCE_DATE_EPOCH env var is recognized by gcc, clang, cmake, and a
bunch of other tools to make application binaries reproducible:
https://reproducible-builds.org/docs/source-date-epoch/
31. 31
Pinning the base image
FROM debian:bookworm-20230904
FROM debian:bookworm
FROM debian
32. 32
Pinning the base image
FROM debian:bookworm-20230904@sha256:b4042f895d5d1f8df415caebe7c416f9dbcf0dc8867abb225955006de50b21f3
FROM debian:bookworm-20230904
FROM debian:bookworm
FROM debian
33. 33
Pinning the base image
FROM debian:bookworm-20230904@sha256:b4042f895d5d1f8df415caebe7c416f9dbcf0dc8867abb225955006de50b21f3
FROM debian:bookworm-20230904
FROM debian:bookworm
FROM debian
apt-get on bookworm-20230904 still installs
the latest packages, not the past packages
(So, not reproducible)
35. 35
Pinning packages: Debian and Ubuntu
FROM debian:bookworm-20230904-slim
ADD --chmod=0755
https://raw.githubusercontent.com/reproducible-containers/repro-sources-list.sh/v0.1.0/repro-sources-list.sh
/usr/local/bin/repro-sources-list.sh
RUN --mount=type=cache,target=/var/cache/apt
repro-sources-list.sh &&
apt-get update &&
apt-get install -y gcc
More examples at: https://github.com/reproducible-containers/repro-sources-list.sh
repro-sources-list.sh simplifies the Dockerfile, and enables caching dpkg files
Caching is practically necessary,
as snapshot servers are slow
36. 36
Pinning packages: Debian and Ubuntu
• RUN --mount=type=cache,target=/var/cache/apt can be saved on GitHub Actions using:
https://github.com/reproducible-containers/buildkit-cache-dance
steps:
- uses: actions/cache@v3
with:
path: var-cache-apt
key: var-cache-apt-${{ hashFiles('Dockerfile') }}
- uses: reproducible-containers/buildkit-cache-dance@v2.1.2
with:
cache-source: var-cache-apt
cache-target: /var/cache/apt
37. 37
Pinning packages: Debian and Ubuntu
• The (checksums of the) packages on snapshot.debian.org are signed by Debian, just like
regular apt-get repositories
• The signatures are fetched and verified against the package metadata checksums on
running apt-get update (Not on apt-get install)
• If /var/lib/apt (metadata) is compromised, apt-get update will fail
• If /var/cache/apt (dpkg files) is compromised, apt-get install will fail
• The situation is same for snapshot.ubuntu.com (signed by Canonical)
38. 38
Pinning packages: Debian and Ubuntu
• If you don’t trust the latest package signatures, you can reproduce the most
of the packages by yourself:
https://wiki.debian.org/ReproducibleBuilds/Howto
40. 40
Pinning packages: NixOS
• Repro build is much easier with NixOS
(although NixOS per se is often considered to be hard to learn)
• The flake.lock file contains the checksums of the sources
• If the binary is present on cache.nixos.org, the cached binary is used;
otherwise the package is built from the source, with very good reproducibility
(99.77% for nixos.iso_minimal.x86_64-linux installation, according to https://r13y.com/)
41. 41
Pinning packages: Alpine, Rocky, Alma, etc.
• These distros do not provide snapshot servers like snapshot.debian.org
• You have to preserve /etc/apk/cache , /var/cache/dnf, etc. by yourself
• Examples can be found at:
https://github.com/reproducible-containers/repro-pkg-cache
• In the long term, BuildKit frontends may have features to help pinning
packages: https://github.com/moby/buildkit/issues/4259
42. 42
Future work (Help wanted)
• Proposal to make well-known images reproducible
(at least for Debian-based ones)
• ”Single-click” platform for attesting reproducibility and sharing the result
43. 43
Recap
• Repro builds prove that an image is actually buildable from its source
• Whether the source is harmless or not is another topic
• Ideally, every image should be bit-for-bit reproducible
• Practically, subtle differences can be allowed, when they are explainable
(e.g., timestamps)
44. 44
Recap
Tools and examples: https://github.com/reproducible-containers
• diffoci: diff for OCI images, to analyze non-reproducible builds
• repro-sources-list.sh: reproducibility helper for Debian, Ubuntu, etc.
• repro-pkg-cache: reproducibility helper for Alpine, Alma, Rocky, etc.
• buildkit-cache-dance: apt-get cache for GitHub Actions
BuildKit docs: https://github.com/moby/buildkit/blob/master/docs/build-repro.md
OCI = Open Container Initiative
45. 45
Recap
• Slides will be uploaded to https://github.com/AkihiroSuda
(README → “Presentation slides”)