Dan Finneran (@thebsdbox)
EMEA Solutions Architect,
Docker
Docker Storage:
Designing a Platform for
Persistent Data
Why Storage?
Agenda
What does immutability mean for your data?
Applications with Persistent Data requirements
Persistent Data with Docker
Docker volume plugins
Orchestrating Storage (Swarm / Kubernetes)
Key Takeaways / Conclusion
Questions
Immutability

Adjective: "unchanging over time or unable to be changed: an immutable fact"
Container
• Application + required libs/assets only
• Designed to be automated
• Re-built

Operating System
• Kernel/libs/userland tools for all uses
• Automation requires scripts / 3rd-party tools
• Patched
Docker Image
FROM alpine (Base image)
COMMIT FE234B (Install binary packages)
COMMIT 234CED (Copy assets or additional code)
Dan's new
Docker Image
V1.0
Docker Container
FROM dan/container:1.0
CoW layer (Copy on Write)

$ docker run --rm dan/container:1.0

[Diagram: /test_file is written into the container's CoW layer]

Docker Container
FROM dan/container:1.0
CoW layer (Copy on Write)

$ docker run --rm dan/container:1.0

[Diagram: when the container exits, the CoW layer is discarded and /test_file is lost]
Applications with Persistent Data Requirements
• Regardless of the lifespan of the container, the data should always persist.
• The container could be scheduled to run on any node in the cluster, meaning persistent data may need to be accessed from any node.
Persistent Data

Accessing Data
• Block: iSCSI / Fibre Channel
• File: NFS
• API: REST
Is it wrong to run a database in a container?

Databases (+ additional requirements)
• Latency
• IOPs
• Bandwidth / Throughput
• Security (external requirements)
Batch processing

Image Processing:
• Watermark
• Resizing
• Formatting

Format Conversion:
• Images
• Documents
• Custom data

Transcoding:
• Device handling
• Bandwidth streaming

Batch Processing:
• Tasks on multiple files/sources
Applications that Require Persistent Data Between Restarts

A large number of applications will typically "park" cold data to disk under the following circumstances:
• Waiting for a back-end system to respond
• Out-of-order data being processed
• Data sets that are too large to be mapped into memory
Common bad practice patterns
• Writing and storing logs inside a running container → instead, push logs to an external / centralized platform (a sketch follows below).
• Packaging large datasets inside images → instead, have containers access datasets through shared storage.
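As a sketch of the first fix, a Compose service can ship its logs to a central syslog endpoint via Docker's syslog logging driver; the service name, image, and endpoint logs.example.com are hypothetical:

version: "3.3"
services:
  web:
    image: nginx:alpine       # hypothetical service
    logging:
      driver: syslog          # Docker's built-in syslog logging driver
      options:
        syslog-address: "tcp://logs.example.com:514"   # hypothetical collector

With this in place, the container's stdout/stderr are forwarded to the collector rather than accumulating inside the container's filesystem.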
Persistent Data with Docker

Persistence Implementations
• Per container storage
• Shared storage (same host)
• Multi-host shared storage

Running your first Container (*)
*This at least happened to me.
Per container storage

$ docker run -it --rm dan/container:1.0 sh

$ # Inside container fc4b398edb01
$ mkdir /storage
$ touch /storage/dockercon

[Diagram: host01's /var/lib/docker/volumes tree alongside container fc4b398edb01, whose /storage/dockercon file exists only inside the container's filesystem.]
Shared Storage (same host)

$ docker volume create dockercon

$ docker run -it --rm -v dockercon:/mnt busybox sh

$ docker run -it --rm -v dockercon:/mnt busybox sh

[Diagram: both containers on host01 mount the named volume "dockercon", stored under /var/lib/docker/volumes, at /mnt.]
Host Persistence

host01 $ docker volume create \
    --opt type=nfs \
    --opt o=addr=192.168.0.2,rw \
    --opt device=:/volume1/docker \
    nfs

[Diagram: a volume named "nfs" defined on host01 and host02, both backed by the NFS server at 192.168.0.2.]
Host Persistence

host02 $ docker run -it --rm \
    -v nfs:/mnt \
    busybox sh

[Diagram: containers on host01 and host02 mount the same NFS-backed volume at /mnt.]
Host Persistence (cleanup)

[Diagram: once the containers are removed, the data remains on the NFS server at 192.168.0.2, ready to be re-mounted from either host.]
Docker Volume Plugins
• Extend the functionality of the Docker Engine
• Use the extensible Docker plugin API
• Allow an end-user to consume existing storage and its functionality
• Create Docker storage volumes that are linked to a container's lifecycle (the data can be persisted afterwards if needed)
Docker Volume Plugins

Current Plugins
[Slide: logos of currently available volume plugins]
Volume plugin workflow

[dan@dockercon ~]$ docker plugin install store/storagedriver/array
[dan@dockercon ~]$ docker volume create -d array -o ssd -o 32Gb fast_volume
fast_volume
[dan@dockercon ~]$ docker volume ls
DRIVER              VOLUME NAME
array               fast_volume
Volume plugin workflow

$ docker run -it --rm \
    -v dockercon:/mnt \
    busybox sh

[Diagram: on host01 the Docker Engine makes an API call to the volume plugin, which mounts the volume at /mnt inside the container.]
Plugin benefits / use-cases

Data-intensive applications:
Volume plugins expose specialized functionality in storage providers that can be utilised for data-intensive workloads.

Database migration:
Volume plugins make it easy to move data across hosts in the form of snapshots, which enables the migration of production databases from one host to another with minimal downtime.
Plugin benefits / use-cases

Stateful application failover:
Volumes can easily be moved and re-attached, allowing straightforward failover to new machines/instances and re-attachment of data volumes.

Reduced Mean Time To Recovery (MTTR):
With volume plugins connected via a shared storage backend, operations teams can speed up cluster time-to-recovery by attaching a new database container to an existing data volume. This results in faster recovery of failed systems.
Orchestrating Storage with Docker EE 2.0
Storage with Swarm

Swarm Volume
As part of the compose file, specify a named volume ("nfs" below) and pass in the required settings:

volumes:
  nfs:
    driver_opts:
      type: "nfs"
      o: "addr=docker01,nolock,soft,rw"
      device: ":/nfs"
Storage with Swarm

Service Volumes
In the service definition, reference the volume ("source" is the volume name) along with additional configuration; "target" is the mount point inside the container:

dockercon:
  image: dockercon:18
  volumes:
    - type: volume
      source: nfs
      target: /nfs
      volume:
        nocopy: true
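Putting the two fragments together, a minimal sketch of a complete stack file (the version line and surrounding layout are assumptions; deploy with docker stack deploy -c stack.yml <stack-name>):

version: "3.3"
services:
  dockercon:
    image: dockercon:18
    volumes:
      - type: volume
        source: nfs           # the named volume defined below
        target: /nfs          # mount point inside the container
        volume:
          nocopy: true
volumes:
  nfs:
    driver_opts:
      type: "nfs"
      o: "addr=docker01,nolock,soft,rw"
      device: ":/nfs"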
Storage with Kubernetes

Persistent Volume
A persistent volume, when applied, becomes a storage resource that is available to the cluster.
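A minimal sketch of such a manifest, assuming the NFS server from the earlier examples (the volume name, capacity, and export path are illustrative):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv                # illustrative name
spec:
  capacity:
    storage: 10Gi             # assumed size
  accessModes:
    - ReadWriteMany           # NFS can be mounted read-write by many nodes
  nfs:
    server: 192.168.0.2       # the NFS server from the earlier slides
    path: /volume1/docker     # the export used in the docker volume example

Applied with kubectl apply -f, this registers the volume as a cluster resource.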
Storage with Kubernetes

Persistent Volume Claim
A claim is a request from a user for a persistent volume; the request can include specifics such as:
• Volume size
• Volume capabilities
• Access methods
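A matching claim, again sketched with an illustrative name and size; Kubernetes binds the claim to a persistent volume that can satisfy the requested size and access mode:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc               # illustrative name
spec:
  accessModes:
    - ReadWriteMany           # must match what the volume offers
  resources:
    requests:
      storage: 10Gi           # requested capacity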
Storage with Kubernetes

Pod
Defines a container or collection of containers that share networking and volumes.
• Specify a PVC (persistent volume claim)
• Map the PVC to a path inside the Pod (e.g. /http)
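A sketch of a Pod that mounts the claim above at /http (the Pod name, container name, and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: web                   # illustrative name
spec:
  containers:
    - name: web
      image: nginx:alpine     # illustrative image
      volumeMounts:
        - name: data
          mountPath: /http    # the path inside the Pod, as on the slide
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: nfs-pvc    # the claim from the previous sketch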
Key Takeaways

Shared storage:
• Smaller images
• Efficient usage of repetitive data
• Decouples application and data
Key Takeaways
• Running a database in a container is fine, as long as its requirements are met.
• Don't write logs inside a container.
Questions
