Presented at FOSDEM 2019
3. • Screwed up the recording; hope it's all good for this year
• Touched briefly on storage history, how SAN and NAS came to be
• Mostly to set the context here
• Introduced the concept of Container Attached Storage (CAS)
Today
• Talk about the progress we made, our maiden voyage with Rust, and go over
some of the concepts that we are working on
• What you see here today is worked on by only two people
• Hopefully a quick demo
• If what you hear today somewhat excites you: we are hiring (remote)
OpenEBS last year (2018)
4. • Open-source project started roughly two years ago
• Sponsored by my employer, MayaData
• Provides a cloud native storage abstraction: a data plane as well as a
control plane, operated by means of declarative intent, so that it
provides a platform for persistent cloud native workloads
• Built on top of Kubernetes, which has demonstrated that abstraction and
intent with reconciliation allow developers to focus on deploying the
app rather than on the underlying infrastructure
• What k8s does for apps, we aspire to do for data
About openEBS
5. How does that look?
[Architecture diagram: MayaOnline (analytics, alerting, compliance, policies, advisory chatbot) driving a declarative data plane through an API, across on premises, Google and packet.net]
6. Motivation
• Applications have changed and someone forgot to tell storage
• The way modern-day software is developed and deployed has changed
a lot due to the introduction of Docker (a tarball on steroids)
• Scalability and availability “batteries” are included
• Small teams of people need to deliver “fast and frequently”, and
innovation tends to happen in so-called shadow IT (skunkworks)
• Born in the cloud: adopts cloud native patterns
• Hardware trends force a change in the way we do things
• These changes propagate into our software and the languages we use
• K8s as a universal control plane to deploy containerised applications
• Public cloud is moving on premises (GKE On-Prem, AWS Outposts)
• K8s is capable of doing more than containers thanks to controllers (VMs)
7. • Register a set of “mountable” things with the k8s cluster (PV)
• Take ownership of such a mountable thing by claiming it (PVC)
• Refer to the PVC in the application (see the sketch below)
• To avoid having to fill up a pool of PVs by hand, a dynamic provisioner
can be used that does so automatically
• Potential implications may vary per storage solution (max LUs)
• Storage is typically the mother of all snowflakes
• To avoid a wildfire of plugins, a Container Storage Interface (CSI) has
been developed by community members
• Vendor-specific implementation (or black magic) is hidden from the user
• Make it a pure consumption model
PVs and PVCs in a nutshell
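A minimal sketch of that flow, assuming a dynamic provisioner registered under a hypothetical storage class name (openebs-standard): the PVC claims storage, and the pod refers to the claim.

```yaml
# PVC: claim 5Gi from a (hypothetical) dynamically provisioned class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: openebs-standard   # assumed class name
  resources:
    requests:
      storage: 5Gi
---
# Pod: refer to the PVC by name
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: demo-claim
```

With a dynamic provisioner in place, creating the PVC is enough; the matching PV is provisioned and bound automatically.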
11. • How does a developer compose their volume in terms of storage-specific
features for that particular workload?
• snapshots, clones, compression, encryption; the persona in control
• How do we unify storage differences between different cloud providers
and/or storage vendors?
• They are as incompatible as they can be, by design
• How do we provide the cloud native “EBS volume” look and feel on premises
using your existing storage infra?
• Don't throw away existing storage solutions and/or vendors
• Make storage as agile as the applications that it serves
Problem solved?
12. • As data grows, it has the tendency to pull applications towards it
• Everything revolves around the storage system
• Latency, throughput: the IO blender
• If the sun goes supernova, all the apps around it will be gone instantly,
i.e. a huge blast radius
• Typically you have far more PVs/PVCs than you have LUs in a virtual
environment (1000?)
• Typical solution: let us replicate the sun!
• Exacerbates the problem instead of solving it?
Data gravity
13.
14. • Data placement is expressed in YAML as part of the application
• Replication factors can be dynamically changed (patch); see the sketch
below
• Provide a set of composable transformation layers that can be enabled
based on application-specific needs
• As monolithic apps are decomposed, so are their storage needs
• Volumes are typically small, which allows for data agility
• Allows us to reimagine how we manage the data
• Runs in containers, for containers; avoids feature mismatches between
different kernel flavours across distributions and “cloud” images
• Decompose the data into a collection of small stars
• Monolith vs Micro
OpenEBS approach
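A sketch of what declarative placement and replication intent can look like; the annotation keys below follow the OpenEBS (Jiva) docs of that era and should be treated as illustrative rather than authoritative.

```yaml
# StorageClass carrying storage policy as declarative intent
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-repl3
  annotations:
    openebs.io/cas-type: jiva            # CAS engine (illustrative)
    cas.openebs.io/config: |
      - name: ReplicaCount
        value: "3"                       # keep the data on three replicas
provisioner: openebs.io/provisioner-iscsi
```

Because the intent lives in the cluster as an object, changing the replication factor becomes a `kubectl patch` against the owning resource rather than a ticket to a storage admin.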
15.
16. • The user is faced with writing an application that might run in DC1 or
DC2, as the k8s cluster spans both
• DC1 happens to have vendor A and DC2 has vendor B
• typically, vendor A does not work with vendor B, at least not efficiently
• OpenEBS can be used to abstract away the differences between the two
storage systems and make the volume available in both DCs
• Almost like a ‘real’ EBS volume, except we have more control
Data availability example
17. Simple replication of small datasets
[Diagram: a PV served by a CAS volume that replicates across TheBox 1, TheBox 2 and TheBox 3]
• Data routing: you specify where you want your data to go
• It is openEBS that connects to TheBox, not the OS
• The openEBS operator, not shown, instantiates the needed virtual devices
on the fly
18. • Facing different types of storage protocols and performance tiers
• OpenEBS can't fill the performance gap; it is storage, not magic
• As time moves on, we want to get “rid” of the slow tier as a faster tier
becomes available
• PVs come and go all the time; likewise, the slow tier will be repurposed
• The alternative is to “not deploy” and wait for storage
• How to move the data non-disruptively?
• Hydrate and then vacate, formerly known as migration, aka copy =)
Data Mobility use case
19. Data hydration and mobility
[Diagram: a PV whose CAS volume fronts iSCSI and NBD backends, with a hydrate/mirror path between them]
• Asymmetrical backends; performance depends on the replication mode and
the interconnect
• async, semi-sync and sync
• Data migration and hydration: small is the new big, we are copying GBs,
not PBs!
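A minimal sketch of the hydrate/mirror idea in Rust, assuming single-threaded I/O that never crosses a segment boundary; the `Backend` trait and the 16 MB segment size are illustrative, not the actual OpenEBS data plane.

```rust
/// Hydrate/mirror sketch: writes are mirrored to both backends, reads are
/// served from the new (fast) tier once a segment has been copied over.
const SEGMENT_SIZE: u64 = 16 * 1024 * 1024; // 16 MB segments (illustrative)

trait Backend {
    fn read(&self, offset: u64, buf: &mut [u8]);
    fn write(&mut self, offset: u64, buf: &[u8]);
}

struct HydratingVolume<B: Backend> {
    old: B,              // slow tier being vacated
    new: B,              // fast tier being hydrated
    hydrated: Vec<bool>, // one flag per segment
}

impl<B: Backend> HydratingVolume<B> {
    fn segment(offset: u64) -> usize {
        (offset / SEGMENT_SIZE) as usize
    }

    /// Mirror writes so the new tier never goes stale.
    fn write(&mut self, offset: u64, buf: &[u8]) {
        self.old.write(offset, buf);
        self.new.write(offset, buf);
    }

    /// Route reads: fast tier if the segment was copied, slow tier otherwise.
    fn read(&self, offset: u64, buf: &mut [u8]) {
        if self.hydrated[Self::segment(offset)] {
            self.new.read(offset, buf);
        } else {
            self.old.read(offset, buf);
        }
    }

    /// Background hydration: copy one segment, then flip its flag.
    /// Once every flag is set, the old backend can be vacated.
    fn hydrate_segment(&mut self, seg: usize) {
        let mut buf = vec![0u8; SEGMENT_SIZE as usize];
        self.old.read(seg as u64 * SEGMENT_SIZE, &mut buf);
        self.new.write(seg as u64 * SEGMENT_SIZE, &buf);
        self.hydrated[seg] = true;
    }
}
```

Whether the mirroring is async, semi-sync or sync only changes when the write to the new tier is acknowledged, not the routing logic.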
20. • Volumes are small, so rebuilds are generally quick; but how do we know
what to rebuild?
• Although small, you really don't want to rebuild unused blocks
• General approach is to segment the drive(s) into fixed-size blocks
(e.g. 16 MB)
• Keep a bitmap of dirty segments as writes come in
• Where to store the bitmap?
• Remember: small (Bonwick on space maps)
• As a new drive/LU is added, write out the marked segments to the other
drive(s)
• But what about thin provisioning, clones, snapshots?
• We have something that does that, but.. maybe next year
• Most of this is not new; we are standing on the shoulders of giants
• “The Design and Implementation of a Log-Structured File System”
Rebuilding
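A sketch of the dirty-segment bitmap described above, assuming 16 MB segments and an in-memory map; a real implementation would also persist the map (hence the space-map reference).

```rust
/// Dirty-segment tracking for rebuilds: mark segments on the write path,
/// then replay only the marked ones to a newly added drive/LU.
const SEGMENT_SIZE: u64 = 16 * 1024 * 1024; // 16 MB, as on the slide

struct DirtyMap {
    bits: Vec<u64>, // packed bitmap, one bit per segment
}

impl DirtyMap {
    fn new(volume_size: u64) -> Self {
        let segments = (volume_size + SEGMENT_SIZE - 1) / SEGMENT_SIZE;
        DirtyMap { bits: vec![0; ((segments + 63) / 64) as usize] }
    }

    /// Write path: mark every segment the write touches (assumes len > 0).
    fn mark(&mut self, offset: u64, len: u64) {
        let first = offset / SEGMENT_SIZE;
        let last = (offset + len - 1) / SEGMENT_SIZE;
        for seg in first..=last {
            self.bits[(seg / 64) as usize] |= 1u64 << (seg % 64);
        }
    }

    /// Rebuild path: yield only the segments that ever saw a write.
    fn dirty_segments(&self) -> impl Iterator<Item = u64> + '_ {
        (0..self.bits.len() as u64 * 64)
            .filter(move |&seg| self.bits[(seg / 64) as usize] & (1u64 << (seg % 64)) != 0)
    }
}
```

One bit per 16 MB keeps the map tiny: a 1 TB volume needs only 8 KB, which is why persisting it next to the data is cheap.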