Virtual Machine vs. Container
Major Container Types
Docker: The most popular container platform - a lightweight, portable, self-sufficient container format (originally built on LXC) that can run virtually anywhere. It offers layered container
images, a global container registry, cluster management, and CLI/REST API access.
Rocket (rkt): From CoreOS/Red Hat. rkt emphasizes more rigorous security and follows the App Container (appc) specification. The core execution unit of rkt is the pod, a collection of
one or more applications executing in a shared context, like Kubernetes (k8s) pods.
Photon: Photon OS is a minimal Linux container host optimized to run on VMware platforms. Compatible with Docker and Kubernetes.
Garden: From Pivotal Cloud Foundry. Garden (Warden) is a platform-agnostic Go API for container creation and management, with pluggable back ends
for different platforms and runtimes.
Mesos Containers: MesosContainerizer provides lightweight containerization and resource isolation of executors using Linux-specific functionality such
as control groups (cgroups) and namespaces.
Windows Containers: Two different runtimes - Windows Server Containers, which share the host kernel, and Hyper-V Isolation Containers, which run
each container in an optimized virtual machine (e.g., Windows 10 containers).
IBM Nabla Containers: Cut OS system calls down to a bare minimum with as little code as possible, which is expected to decrease the attack surface.
They use library OS (unikernel) techniques and only 9 system calls; the rest are blocked through a Linux seccomp policy.
Google gVisor: User-space kernel, written in Go, that implements a substantial portion of the Linux system surface. It includes an OCI runtime
called runsc that provides an isolation boundary between the application and the host kernel.
Kata Containers: A standard implementation of lightweight virtual machines (VMs) that feel and perform like containers but provide the workload
isolation and security advantages of VMs. Formed by merging Intel Clear Containers and Hyper runV; compatible with the OCI specification.
Containerd: A core container runtime from Docker, now part of the CNCF, available as a daemon for Linux and Windows that can manage the complete
container lifecycle of its host system. It uses runC to run containers according to the OCI specification.
OCI: The Open Container Initiative currently contains two specifications: the Runtime Specification (runtime-spec) and the Image Specification (image-spec).
The Runtime Specification outlines how to run a “filesystem bundle” that is unpacked on disk. At a high level, an OCI implementation would download an
OCI Image, then unpack that image into an OCI Runtime filesystem bundle.
Docker offers a high-level tool with several powerful functionalities on top of LXC: portable deployment across
machines; application-centric optimization; automatic build from source; versioning; component reuse (parent
images); sharing in a public registry; and a tool ecosystem for automation: PaaS-like deployment (Dokku, Deis, Flynn), multi-node
orchestration (Maestro, Salt, Mesos, OpenStack Nova), management dashboards (docker-ui, OpenStack Horizon, Shipyard),
configuration management (Chef, Puppet), continuous integration (Jenkins, Strider, Travis), etc.
Docker is a platform and tool for building, distributing, and running containers. Kubernetes is a container
orchestration system for Docker or other containers.
Miniservices vs. Microservices
• In a microservices architecture, each service should have
zero awareness of the services around it.
• A miniservice is like a group of microservices that come
together for a specific purpose.
• A miniservice seems to be a simpler, pragmatic way
of doing microservices or something closely related.
For example, each microservice must handle its own
data, while miniservices may share data.
• The cost efficiency is lost in microservices, but the
benefit is more functionality in a single project.
• Miniservices are solid solutions when you have
higher complexity, like image processing.
• It is better to build a proof of concept on a monolith
because it’s easier to manage.
• More here: https://thenewstack.io/miniservices-a-realistic-
• Stateless workloads – Disposable; create a new one if found unhealthy - web
applications, mobile backends, API servers, nginx
• Stateful workloads – Each container has a sticky identity - ZooKeeper, MySQL,
other database hosting, JBoss
• Daemon Pattern – Every node runs a copy of the container – Linkerd, fluentd
• Batch Pattern – Independent but related work run in parallel – Bunch of
• Persistent storage type – Separation of data and reuse
• Big data Workloads – external volumes, local volumes
• Long Running Workloads - Services long duration
• Large scale workload – ERP, CRM, Proprietary SW
• Specific need based – advanced routing, protocols other than HTTP(S)/WebSockets,
network scanners, etc.
What Do Container Surveys Show?
• State of the Application Container market (??
security concern) -
• ESG Data Survey (35% security concern) -
• RSAConf2018 (25% security concern) -
Major Concerns in Container Adoption
Kernel exploits:
the kernel is shared among all containers and the
a user may gain elevated privileges
attacker can use tampered images
secret info for accessing services can be compromised
one container can monopolize access to certain
Runtime unknowns:
Linux kernel Architecture & Container
LXC (Linux Containers) is an operating-system-level virtualization method for running multiple isolated Linux
systems (containers) on a control host using a single Linux kernel. “LXC” refers to capabilities of the Linux kernel
(specifically namespaces and control groups) which allow sandboxing processes from one another, and controlling their
Kernel features and hardening: isolated namespaces (audit, system, device, time, task count); watch mount points and root-level
permissions; mount the file system read-only with copy-on-write; use the resource limitation features; avoid running containers
with user ID 0; harden the OS.
Container Isolation Mechanisms
Machine Level Virtualization – Expose virtualized
hardware to the guest kernel. E.g: KVM & Xen.
Rule Based Execution – applying fine grained security
policy to enforce kernel hooks/rules. Can use AppArmor –
bind access control to programs, Seccomp – restrict
system calls, SELinux – access control security policy.
User-Space Kernel – Intercepts application system calls and acts as the guest
kernel, with no virtualized hardware needed. This architecture
allows it to provide a flexible resource footprint (i.e., one
based on threads and memory mappings, not fixed guest
physical resources). E.g., gVisor.
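As a rough illustration of the rule-based approach, the sketch below builds a deny-by-default seccomp profile in the JSON layout that Docker accepts. The allow-list and the helper name `build_seccomp_profile` are hypothetical; a real workload would need a much longer list of system calls.

```python
import json


def build_seccomp_profile(allowed_syscalls):
    """Return a Docker-style seccomp profile that denies every system call
    except the ones explicitly listed."""
    return {
        # Any syscall not matched by a rule below fails with an error.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


# Hypothetical allow-list, for illustration only.
EXAMPLE_ALLOWED = ["read", "write", "exit_group", "clock_gettime"]

profile = build_seccomp_profile(EXAMPLE_ALLOWED)
profile_json = json.dumps(profile, indent=2)
```

The resulting JSON could be saved to a file and passed to a container with `docker run --security-opt seccomp=<file>`, assuming the allow-list has been extended to cover what the application actually calls.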
Container image Security
Anyone can write code, build an image, and upload it to a registry!
Verify the source of the image (download from Docker Hub, Quay, or GitLab).
Enable Docker Content Trust, which verifies images with digital signatures. Docker image integrity can
be implemented using default Docker options by signing the tagged versions.
Add an image scanning service from a third-party vendor.
Use a strong access control mechanism for container images – by default, containers run as root;
change this to a non-root user.
Apply task centric access control or RBAC so that one user compromise still the damage can be
Keep containers lightweight – too many packages means too many problems together and also
Monitor image health during run time with threat detection tools
Update images with the latest version and security patches
Don’t store confidential tokens, etc. in the Dockerfile/Containerfile.
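Two of the points above (non-root images, no secrets in the Dockerfile) lend themselves to an automated check. The following is a minimal sketch of such a check, not a replacement for a real image scanner; the function name `lint_dockerfile` and the keyword pattern are assumptions made for illustration, and the heuristic will produce both false positives and false negatives.

```python
import re

# Hypothetical keyword heuristic for values that look like credentials.
SECRET_HINT = re.compile(r"(password|passwd|secret|token|api_?key)", re.IGNORECASE)


def lint_dockerfile(text):
    """Return a list of findings for a Dockerfile given as a string."""
    findings = []
    last_user = None
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        instruction = parts[0].upper()
        argument = parts[1] if len(parts) > 1 else ""
        if instruction == "FROM":
            # Each build stage starts over; only the last stage's USER counts here.
            last_user = None
        elif instruction in ("ENV", "ARG") and SECRET_HINT.search(argument):
            findings.append(f"line {lineno}: possible secret in {instruction}")
        elif instruction == "USER":
            last_user = argument.strip()
    if last_user is None or last_user.split(":")[0] in ("root", "0"):
        findings.append("image runs as root: add a non-root USER instruction")
    return findings
```

For example, `lint_dockerfile('FROM alpine\nENV API_KEY=abc123\n')` reports both a possible secret on line 2 and the missing non-root `USER`.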
Container Secrets Management
Don’t use environment variables for secrets
No secret in container image
All secret operations are logged
Secrets must be transferred over a secure channel
Secrets themselves must be encrypted; they can be decrypted by containers
Use credential management tool or create secret storage with third
party vendor products
The application is responsible for authenticating and authorizing
Rotate secrets on a regular basis
Revoke a secret in case of exposure
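One common way to keep secrets out of environment variables and images is to have the platform mount them as files at run time and let the application read them on demand. The sketch below assumes the `/run/secrets` directory that Docker uses for swarm secrets; the helper name `read_secret` and the secret name are hypothetical.

```python
from pathlib import Path

# Docker mounts swarm secrets here; other platforms use other paths.
DEFAULT_SECRETS_DIR = Path("/run/secrets")


def read_secret(name, secrets_dir=DEFAULT_SECRETS_DIR):
    """Read a secret from a mounted file instead of an environment variable."""
    # Reject anything that is not a plain file name (e.g. "../other").
    if Path(name).name != name:
        raise ValueError(f"invalid secret name: {name!r}")
    path = Path(secrets_dir) / name
    # Strip the trailing newline that most tools add when writing the file.
    return path.read_text(encoding="utf-8").rstrip("\n")
```

An application would call something like `read_secret("db_password")` at the point of use and avoid caching or logging the value.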
Container Run time handling
Run as few processes as possible
Be task specific – path, port, daemons config, mount points, etc.
Security Monitoring – In-house application may have issues
Have generous logs - store in searchable format for future analysis
Policy creation based on container behavior
Manage security at application level
Mount as read only – copy on write prevention
Container Boot chain trust – Create a trust chain (Intel TXT, Bootloader, Initrd, etc.)
Partitioning & Device mapper – Storage separate partitions
Vertical (code which is prone to bugs)
Horizontal (security holes including in host) – Associate security profiles with each program - e.g:
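Several of the runtime practices in this section map directly onto `docker run` options. The sketch below assembles such a command line; the default values (user ID, memory, CPU, and process limits) and the helper name `hardened_run_args` are illustrative assumptions, not recommended settings.

```python
import shlex


def hardened_run_args(image, uid=1000, memory="256m", cpus="0.5", pids_limit=100):
    """Build a `docker run` command line with common hardening options."""
    return [
        "docker", "run", "--rm",
        "--read-only",                           # mount the root filesystem read-only
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation via setuid
        "--user", str(uid),                      # avoid running as user ID 0
        "--memory", memory,                      # limit memory (noisy-neighbor control)
        "--cpus", cpus,                          # limit CPU
        "--pids-limit", str(pids_limit),         # cap the number of processes
        image,
    ]


# Hypothetical image name, for illustration only.
command = shlex.join(hardened_run_args("example/app:1.0"))
```

Building the arguments as a list keeps them safe to pass to `subprocess.run` without shell quoting issues; `shlex.join` is used here only to produce a printable command.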
Container Security Best Practices….
1) Hardening – Configuration against benchmarks, Host/Daemon/Kernel feature control,
avoid privileged mode execution, avoid noisy neighbors, limit CPU/memory resources,
network traffic on default bridge,
2) Minimal OS – minimum attack surface (e.g., Canonical Light OS, Red Hat Atomic)
3) Image Security – Multi-stage image builds, image scanning (from registry), code-level
security, authenticity, enable Docker Content Trust,
4) Secret management – Separate until runtime, Encrypt at Rest,
5) Health check – Life cycle management, Delete drifted containers, container sprawl loose
success control, Continuous monitoring – container traffic, Service Log Management
6) Process, File & Device restrictions – Roles, RBAC, Authentication/Authorization, avoid the
AUFS driver, enable user namespaces,
7) Trusted Containers based on HW – Trust Chain
8) Threat control by using third party tools; Configuration best practice like Docker bench
9) Compliance – situation specific Laws, Frameworks, Benchmarks (FISMA, NIST, etc.)
10) Take holistic approach & do Security Benchmarking
• CIS (Center for Internet Security) Benchmarks - (100+ configuration
guidelines for various technology groups to safeguard systems against
today’s evolving cyber threats.) - https://www.cisecurity.org/cis-benchmarks/
& https://docs.docker.com/compliance/
• Intel & Portworx: Achieving Breakthrough Performance for Containerized
Workloads - https://portworx.com/intel-container-performance/
• Distributed Load Testing Using Kubernetes – LOCUST
Why Is Workload Security Important?
Secure all inter-service communication in the production environment
Avoid unauthorized internal access to sensitive service data (e.g., a stolen authentication
token that can be replayed from another client)
Protect all microservice endpoints
Automatic encryption of data in transit
Identify the workload authenticity – limit the access
Quarantine the workload or apply micro-segmentation
Large-scale management of certificates, keys, tokens, etc.
E.g., Google implements ALTS (Application Layer Transport Security), which helps in defining fine-grained authorization
policies, workload peer-to-peer authentication, and fine-grained security auditing at the workload level.
SPIFFE (CNCF project)
SPIFFE (Secure Production Identity Framework For Everyone) provides a secure identity, in the form of a specially
crafted X.509 certificate, to every workload in a modern production environment. SPIFFE removes the need for
application-level authentication and authorization and complex network-level ACL configuration.
At its heart, SPIFFE is:
A standard defining how services identify themselves to each other. These are called SPIFFE IDs and are
implemented as Uniform Resource Identifiers (URIs).
A standard for encoding SPIFFE IDs in a cryptographically verifiable document called a SPIFFE Verifiable Identity
Document (SVID).
An API specification for issuing and/or retrieving SVIDs. This is the Workload API.
SPIFFE also avoids vendor lock-in for authentication and authorization based on IAM modules from vendors.
The SPIFFE specification supports X.509-based SVIDs and workload attestation (which defines the policy that allows
communication). SPIFFE also supports JWT-based SVIDs, but these are inherently susceptible to replay attacks, as with
any token-based design (the specification is under development).
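Because a SPIFFE ID is a URI of the form `spiffe://<trust domain>/<workload path>`, it can be split with standard URI tooling. The sketch below is a partial check written for illustration, not a complete implementation of the SPIFFE ID rules; the function name `parse_spiffe_id` and the example ID are hypothetical.

```python
from urllib.parse import urlsplit


def parse_spiffe_id(spiffe_id):
    """Split a SPIFFE ID into (trust_domain, workload_path).

    Only a few basic rules are checked here; the SPIFFE specification
    defines additional constraints on allowed characters.
    """
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("SPIFFE IDs must use the spiffe:// scheme")
    if not parts.netloc:
        raise ValueError("SPIFFE IDs must name a trust domain")
    if "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("trust domain must not contain user info or a port")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs must not carry a query or fragment")
    return parts.netloc, parts.path


trust_domain, workload_path = parse_spiffe_id("spiffe://example.org/billing/payments")
```

A service receiving a peer's SVID could use the trust domain and path returned here as the input to its authorization decision.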
- Istio Auth https://www.cncf.io/projects/
SPIRE (CNCF project)
SPIRE (the SPIFFE Runtime Environment) is a software system that exposes an API (the SPIFFE Workload API) to other
running software systems (workloads) so they can retrieve their identity, as well as documents (SVIDs) to prove it to
other workloads, at run-time. This proof of identity can then be used as the primary authentication mechanism for a
workload when accessing other systems.
A set of certificates that can be used by the workload to verify the identity of other workloads (a trust bundle.)
SPIFFE/SPIRE is a tool-chain that automatically issues and automatically rotates authorized credentials.
Relies on each workload (rather than host) having an identity, which is expressed as a set of credentials.
Identities are bound to entities instead of to a specific server name or host.
SPIFFE strictly addresses service-to-service interactions – human identity is out of scope
SPIFFE is a set of conventions for obtaining and using X.509 certificates and managing their life cycle
SPIRE Deployment – two workloads communicating over Ghostunnel using SVID
Istio Auth (not CNCF project)
The version 0.1 release of Istio Auth runs on
Kubernetes and provides the following features:
• Strong identity assertion between services
• Access control to limit the identities that can
access a service (and its data)
• Automatic encryption of data in transit
• Management of keys and certificates at scale
Istio Auth is based on industry standards like mutual
TLS and X.509. Google is actively contributing to an
open, community-driven service security
framework SPIFFE. As the SPIFFE specifications
mature, Istio Auth is intended to become a reference
implementation of the same.
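To make the mutual TLS idea concrete, the sketch below configures a server-side TLS context that refuses clients without a valid X.509 certificate. It uses Python's standard `ssl` module and is not how Istio Auth itself is implemented; the function name and the file parameters are assumptions for illustration.

```python
import ssl


def mutual_tls_server_context(cert_file=None, key_file=None, ca_file=None):
    """Build a server-side TLS context that requires a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED on a server context is what makes the TLS "mutual":
    # the handshake fails unless the client presents a trusted certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        # This service's own identity (certificate chain and private key).
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        # The CA bundle used to verify peer certificates.
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


context = mutual_tls_server_context()
```

In practice the certificate, key, and CA bundle would be issued and rotated automatically by the platform rather than loaded from static files.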
Harbor (CNCF project)
Project Harbor is an open source trusted cloud native registry project
that stores, signs, and scans content. Harbor extends the open source
Docker Distribution by adding the functionalities usually required by users
such as security, identity and management. Harbor supports advanced
features such as user management, access control, activity monitoring,
and replication between instances. Having a registry closer to the build
and run environment can also improve image transfer efficiency.
•Multi-tenant content signing and validation
•Security and vulnerability analysis
•Identity integration and role-based access control
•Image replication between instances
•Extensible API and graphical UI
•Internationalization (currently English and Chinese)
Notary (CNCF project)
The Notary project comprises a server and a client for running and interacting
with trusted collections. Notary aims to make the internet more secure by making
it easy for people to publish and verify content. We often rely on TLS to secure
our communications with a web server, which is inherently flawed, as any
compromise of the server enables malicious content to be substituted for the
With Notary, publishers can sign their content offline using keys kept highly
secure. Once the publisher is ready to make the content available, they can push
their signed trusted collection to a Notary Server.
Consumers, having acquired the publisher's public key through a secure channel,
can then communicate with any Notary server or (insecure) mirror, relying only
on the publisher's key to determine the validity and integrity of the received
OPA (CNCF project)
• Decouple policy decisions from your services to achieve
unified control across the entire stack with any language
• Express policies in a high-level declarative
language that promotes safe, fine-grained logic
and enables powerful features such as impact
analysis, hot reloading, query optimization, and
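Decoupling policy from the service usually means the service asks OPA for a decision over its REST Data API and only enforces the answer. The sketch below builds such a query without sending it; the address `http://localhost:8181`, the policy path `httpapi/authz/allow`, and the input fields are assumptions for illustration.

```python
import json
from urllib.request import Request


def build_opa_query(opa_url, policy_path, input_doc):
    """Build a POST request for OPA's Data API (/v1/data/<policy path>)."""
    url = f"{opa_url.rstrip('/')}/v1/data/{policy_path.strip('/')}"
    # OPA expects the facts for the decision under the "input" key.
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical policy path and input document.
request = build_opa_query(
    "http://localhost:8181",
    "httpapi/authz/allow",
    {"user": "alice", "method": "GET", "path": ["finance", "salary", "alice"]},
)
```

Sending the request with `urllib.request.urlopen(request)` against a running OPA instance would return a JSON document whose `result` field holds the decision, which the service then enforces.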
Some container Security Products…(not part of CNCF)
1) Docker Bench security – Image scanning & Application Analysis; compliance/audit; best practices tool
2) CoreOS Clair – Vulnerability static Analysis in appc & docker containers
3) Twistlock – Full life cycle; Compliance explorer, Run time radar, cloud native firewalls,
4) Aqua Security – Full life cycle; container runtime traffic, image scanner & Network nano segmentation
5) Anchore – Open source project- Image analysis/certify & Security Policy for CI/CD
6) NeuVector – A multi-Vector container security; application aware network security for run time threats, Layer 7 firewall
7) CloudPassage Halo – Full life cycle security & compliance
8) Aporeto – Zero trust security concept; Open source application segmentation – Project Trireme
9) Tenable FlawCheck - continuously monitor container images for malware & vulnerabilities.
10) Black Duck – Scanning – vulnerable and outdated software detection – track all open source in use.
11) Capsule8 – Real time threat protection & automated destruction; 3rd party integration
12) StackRox – Container life cycle; adhere to security policy, runtime detection
13) Sysdig Falco – Open source container runtime security. Enforces policy, deep container visibility
14) Hashicorp Vault – Tool for managing secrets
15) Google/IBM/RedHat/Twistlock – Grafeas(secure microservices audit API for supply chain) & Kritis(Policy Engine real time for k8s)
& more: 451 Research estimates that there are currently 125 application container vendors, and the firm expects that number to continue to
Some images and content in this presentation are taken from the web and shared here for educational purposes only.
Thanks to everyone who shared these materials online.
Q & A
z. https://blog.docker.com/2015/05/understanding-docker-security-and-best-practices/