OpenEBS is an open source container attached storage solution for Kubernetes that simplifies running stateful workloads. It provides containerized storage that is native to Kubernetes using features like CSI, dynamic provisioning of volumes, and integration with common DevOps tools. OpenEBS offers both local and replicated volume types to meet different use cases for availability, performance, and scalability. Developers can use OpenEBS volumes like any other Kubernetes storage by creating persistent volume claims in their applications.
2. OpenEBS
• Leading Open Source Container Attached Storage solution for simplifying the running of stateful workloads in Kubernetes.
• GitHub: https://github.com/openebs/openebs
• Website: https://openebs.io/
• Slack: https://slack.k8s.io, #openebs
• Twitter: https://twitter.com/openebs
• 121+ companies contributing since joining CNCF (https://openebs.devstats.cncf.io/)
• 187+ new contributors since May 2019
• 40+ public references since May 2019
• Incubation PR: https://github.com/cncf/toc/pull/506
5. Challenges with existing Storage
• Agility and Productivity: Monolithic data platform software is being redesigned with microservices. This needs a large number of smaller volumes, dynamically provisioned and dynamically moved with pods to different nodes. Existing storage has connectivity and mounting issues, needs prior design and planning, and is bottlenecked by siloed teams and siloed storage.
• Cost and Performance with Hardware Advancements: Improving performance means using servers with 96 cores, 1 TB flash, 16 TB drives, NVMe devices/fabrics, IPU/DPU/SmartNICs, and ARM. Existing storage needs a hardware and software refresh to take advantage (often better to replace and migrate). Clouds are moving fast, but cause data gravity and lock-in.
• Life-cycle management with Higher Availability and Resiliency: Existing storage is harder to set up and maintain. Upgrades have to be scheduled and coordinated, with a higher blast radius. It has software layers that are redundant to refactored (cloud native) data platforms. Legacy stacks.
6. Paradigm Shift. Change is inevitable.
• Development and people processes have changed: Loosely coupled applications and loosely coupled teams; Conway's Law applied at all layers; Data Mesh and Data as a Product. Examples: CNCF end users like Bloomberg adopting cloud native for agility and open source, improving developer and application team productivity. Platform teams are standardizing towards the Kubernetes API.
• Hardware advancements promise improved performance at lower cost: 96 cores, 1 TB flash, 16 TB drives, NVMe. This calls for a rewrite of system software to fully utilize the capabilities of the hardware: poll-mode drivers, lockless queues, kernel bypass.
• OS and software advancements for building better-performing software: DPDK, SPDK, io_uring, meta languages, user-space performance, huge pages. Systems are built with the expectation that components will fail; Rust and Go are used to write system software and control-plane software. Cloud native and container native.
• Nimble and fungible data platforms are needed to meet demands from users and governments as law around data privacy and compliance evolves: HIPAA, GDPR, CCPA and many more, with stricter guidelines on data retention and conformance. Data gravity should be avoided to prevent lock-in; hybrid clouds mitigate the issue. This needs transparency in data storage, allowing application and platform SREs to quickly comply and provide proof of implementation, with the ability to switch in phases.
8. Why Data on Kubernetes?
• Hybrid Cloud readiness
• Declarative installation of stateful stacks for developer environments
• Increased developer productivity
• Improved availability
• Improved resilience with compute-storage separation
DoK Day: Neeraj Bisht & Praveen Kumar GT, "eCommerce giant Flipkart on data on Kubernetes at scale" (https://youtu.be/D77FLwUN9Oo)
OpenEBS Adopters (https://github.com/openebs/openebs/blob/main/ADOPTERS.md)
9. Where does OpenEBS fit?
https://www.cncf.io/blog/2020/07/06/announcing-the-updated-cncf-storage-landscape-whitepaper/
(Diagram from the CNCF storage landscape whitepaper: Workloads (e.g. databases, key-value/object stores, MQ, AI/ML, CI/CD) run on Container Orchestrators, which talk through a Control-Plane Interface (e.g. CSI, others) to the Control Plane, Data Engines, and Storage Systems, alongside Framework and Tools. OpenEBS provides the local and distributed block storage data engines and the control plane. Attributes assessed: Availability, Consistency, Durability, Performance, Scalability, Security, Ease of Use.)
10. Changing Storage Needs

Availability (access to the data continues during a failure condition):
- Standalone (MinIO or MySQL): dependent on storage
- Standalone (Prometheus or Jenkins): dependent on storage
- Distributed (TiDB, Kafka): built-in

Consistency (strong or weak):
- All three workload types need strong consistency

Durability (bit-rot, endurance, fat-fingers):
- Standalone (MinIO or MySQL): needs protection for the long term
- Standalone (Prometheus or Jenkins): not required; easy to recreate
- Distributed (TiDB, Kafka): tolerant to partial failures

Scalability (clients, capacity, throughput):
- Standalone (MinIO or MySQL): capacity and vertical scaling
- Standalone (Prometheus or Jenkins): capacity
- Distributed (TiDB, Kafka): scale out by adding more capacity

Performance (latency and throughput; avoid noisy-neighbour effects):
- Standalone (MinIO or MySQL): storage should serve the throughput/IO coming from a single node within acceptable (SSD) latency limits, < 2 ms
- Standalone (Prometheus or Jenkins): Hostpath, HDD, SSD; decent latency/throughput (HDD latency of 2-4 ms is acceptable)
- Distributed (TiDB, Kafka): Hostpath, HDD, SSD; low I/O latency and high throughput with (NVMe) SSD or memory
11. Kubernetes as universal control plane

Resource Management and Scheduling
- Needed: discover the storage nodes and storage devices; aggregate and schedule volumes. Scheduling includes providing locality, fault tolerance, and application awareness.
- How Kubernetes (and containers) help: volumes run as services (Pods), leveraging the capabilities of Kubernetes for scheduling.

Configuration Management
- Needed: configuration store, RBAC, disaster recovery.
- How Kubernetes helps: the Kubernetes configuration store, with Kubernetes Operators implementing the workflows.

Usability
- Needed: Web UI, API.
- How Kubernetes helps: declarative management, Kubernetes API extensions, kubectl plugin.

High Availability and Scalability
- Needed: scale up/down of storage nodes and devices; movement of volume services to the right nodes for high availability; highly available provisioning services.
- How Kubernetes helps: horizontal scaling with Kubernetes; scale up/down of the provisioning deployments; volume high availability via extensions to Kubernetes scheduling and Operators.

Maintenance / Day 2 Operations
- Needed: user interface / CLI, software upgrades, telemetry and alert tooling, correlation between application and storage during incidents.
- How Kubernetes helps: declarative upgrades; standardized monitoring, telemetry, and logging.
13. K8s Stateful Stack with OpenEBS
- Stateful Workloads (MySQL, PostgreSQL, Kafka, Prometheus, MinIO, MongoDB, Cassandra, …)
- Kubernetes Storage Control Plane (SC, PVC, PV, CSI)
- OpenEBS Control Plane: CSI drivers; storage operators and data engine operators; Prometheus exporters, Velero plugin, ...
- OpenEBS Data Engines: Replicated Volumes (Mayastor, cStor, Jiva); Local Volumes (LVM, ZFS, hostpath, device)
- Enterprise Framework / Tools (Velero, Prometheus, Grafana, EFK/ELK, …)
- Any Platform, Any Storage (on premise/cloud, core/edge, bare metal/virtual, NVMe/SCSI, SSD/HDD)
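As an illustrative sketch of how a replicated data engine is selected through the Kubernetes storage control plane, a StorageClass for Mayastor-backed volumes might look like the following. The provisioner name and the repl/protocol parameters follow the Mayastor documentation of this era and should be treated as assumptions to verify against your installed OpenEBS version:

```yaml
# Sketch: StorageClass for an OpenEBS replicated (Mayastor) volume.
# Provisioner name and parameters are assumptions based on Mayastor docs;
# verify against the OpenEBS release you are running.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-3-replica
provisioner: io.openebs.csi-mayastor
parameters:
  repl: "3"          # number of synchronous replicas
  protocol: "nvmf"   # expose the volume over NVMe-oF TCP
volumeBindingMode: WaitForFirstConsumer
```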
14. OpenEBS Persistent Volumes
- Stateful Workload: mounts a Persistent Volume using Ext4, XFS, Btrfs, NFS, or raw block.
- Volume Access Layer: Local Volumes are accessed directly; Replicated Volumes are accessed over iSCSI or NVMe-oF TCP.
- Volume Services Layer: the Volume Target (Jiva, cStor, Mayastor) performs synchronous replication to volume replicas on other nodes over iSCSI or NVMe-oF TCP.
- Volume Data Layer: Volume Replicas (Jiva, cStor, Mayastor).
- Storage Layer: block devices, or device aggregation/pooling using LVM or ZFS, on storage devices (NVMe/SCSI, SSD/HDD, Cloud/SAN).
15. CAS
OpenEBS Persistent Volumes follow the CAS model: hyperconverged, Kubernetes native, run anywhere, and easy to install and manage.
- Local Volumes: access from a single node; low overhead on capacity and performance. Good fit for cloud native and distributed workloads such as TiDB, etcd, Kafka, and ML jobs.
- Replicated Volumes: access from multiple nodes; durability with synchronous replication; data services; has overhead on capacity and performance. Good fit for MySQL, MinIO, GitLab, Postgres, and cloud native distributed workloads such as Cassandra.
16. OpenEBS Persistent Volumes

Engine types:
- Local Volumes: Device, Hostpath, LVM, Rawfile, ZFS
- Replicated Volumes: cStor, Jiva, Mayastor

Availability (access to the data continues during a failure condition):
- Local: available from a single node in the cluster
- Replicated: available from multiple nodes, with synchronous replicas

Scalability (clients, capacity, throughput):
- Local: scale-up on the node; horizontal scaling with the K8s cluster
- Replicated: scale-up on the node; horizontal scaling with the K8s cluster

Consistency (strong or weak):
- Local: delegated to filesystems, e.g. LVM, ZFS
- Replicated: strong consistency at the block level

Durability (bit-rot, endurance, fat-fingers):
- Local: delegated to the choice of filesystem (LVM, ZFS) or none
- Replicated: provided via replicas

Performance (latency and throughput):
- Local: depends on storage type and the filesystem used; low overhead (except in the case of ZFS)
- Replicated: depends on storage type and compute (CPU/RAM); low latency with Mayastor
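To make the Local Volume engine choice concrete, here is a sketch of a StorageClass for the LVM flavor. The provisioner and parameter names follow the lvm-localpv project documentation; the volume group name "lvmvg" is an assumption, and a volume group by that name would need to exist on the nodes:

```yaml
# Sketch: StorageClass for an OpenEBS LVM Local Volume.
# "lvmvg" is a placeholder - substitute a volume group present on your nodes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvm
provisioner: local.csi.openebs.io
parameters:
  storage: "lvm"
  volgroup: "lvmvg"
volumeBindingMode: WaitForFirstConsumer
```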
17. How does OpenEBS work?
1. Platform SREs set up the Kubernetes nodes with the required storage.
2. Platform SREs / K8s administrators (using the K8s API) set up OpenEBS and create Storage Classes.
3. Application developers create stateful workloads with Persistent Volume Claims (PVCs).
4. OpenEBS, using its data engines, CSI, and K8s extensions, creates the required Persistent Volumes (PVs): through the CAS storage control plane, a Volume Target and Volume Replicas (Jiva, cStor, Mayastor) are placed on the storage devices (NVMe/SCSI, SSD/HDD, Cloud/SAN).
5. Platform and operations teams observe and maintain the system using cloud native tooling.
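Step 3 above can be sketched as a PVC against an OpenEBS StorageClass. The class name "openebs-hostpath" is, to our knowledge, the default local hostpath class shipped with an OpenEBS install; treat it as an assumption and substitute the class your platform team created in step 2:

```yaml
# Sketch: a developer's PVC requesting storage from an OpenEBS StorageClass.
# The claim name and size are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
spec:
  storageClassName: openebs-hostpath
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
```

Referencing this claim from a workload's volume spec is all that is needed; OpenEBS handles PV creation (step 4) behind the CSI interface.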
18. OpenEBS - User Journey
- Developer: run stateful workloads with local storage (an ML job, a simple app, local S3) → OpenEBS Advocate or Contributor.
- SRE / Platform Engineer: run stateful workloads with "enterprise" storage (DBaaS, CI/CD, object storage, AI/ML pipelines) → OpenEBS Advocate or Contributor.
- OpenEBS Adopter (Database Administrators, Platform Providers): Phase 1 - non-critical workloads (CI/CD) or resilient workloads; Phase 2 - DBaaS; Phase 3 - volumes as a service to other data platforms.
19. OpenEBS Benefits and Limitations

Benefits
• Kubernetes native: ease of use and operations; integrates into the standard cloud native tooling
• Lower footprint; flexible deployment options
• Highly composable: choice of data engines matching the node capabilities and storage requirements
• Controlled and predictable blast radius: easy to visualize the location of the data of an application or volume
• Horizontally scalable; scale up/down
• Avoids vendor lock-in with fully functional Open Source software
• Optimized to reduce operational costs on cloud or on-prem

Limitations
• Scale-out volumes are not supported; only volumes whose capacity can be served from within a given node are supported. OpenEBS believes the need for large volumes will reduce as more and more workloads move into Kubernetes.
• Read-write-many is supported via NFS on top of block storage volumes. OpenEBS believes that read/write-many use cases are served better via object, key/value, or API-based interfaces that offer more control and efficiency.
25. OpenEBS Local PV - FAQ

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-hostpath
  annotations:
    openebs.io/cas-type: local
    cas.openebs.io/config: |
      - name: StorageType
        value: hostpath
      - name: BasePath
        value: /var/local-hostpath
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: kubernetes.io/hostname
    values:
    - node1
    - node2
    - node3

How are the sub-directories managed?
Can I create Local PVs with mounted storage like VMware or GPDs?
Can I resize a Local PV?
How do I monitor a Local PV?
How do I back up a Local PV?
Why is my PVC unable to bind to a PV?
How do I tell Kubernetes to schedule pods to nodes where local storage is available?
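As a usage sketch against the local-hostpath StorageClass on this slide (claim, pod, and image names are hypothetical), the following pair shows the binding behavior behind the "why is my PVC unable to bind" question: with volumeBindingMode WaitForFirstConsumer, the PVC stays Pending until a consuming Pod is scheduled, at which point the PV is provisioned on that Pod's node:

```yaml
# Sketch: PVC plus consuming Pod for the local-hostpath class above.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-claim
spec:
  storageClassName: local-hostpath
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: local-consumer
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "echo hello > /data/hello && sleep 3600"]
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: local-claim
```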
29. Where does OpenEBS fit?
https://www.cncf.io/blog/2020/07/06/announcing-the-updated-cncf-storage-landscape-whitepaper/
(Same layered diagram as slide 9: Workloads (e.g. databases, key-value/object stores, MQ, AI/ML, CI/CD) on Container Orchestrators, a Control-Plane Interface (e.g. CSI, others), the Control Plane, Data Engines, Storage Systems, and Framework and Tools. Here OpenEBS is positioned as local and (distributed) replicated block storage plus the control plane. Attributes assessed: Availability, Consistency, Durability, Performance, Scalability, Security, Ease of Use.)
30. OpenEBS Data Engine Evolution (CAS)
- OpenEBS 1.0 - Local Volumes: Hostpath, Device. Replicated Volumes: Jiva, cStor.
- OpenEBS 2.0 - Local Volumes: Hostpath, Device, ZFS. Replicated Volumes: Jiva, cStor, Mayastor (alpha).
- OpenEBS 3.0 - Local Volumes: Hostpath, Device, ZFS, LVM, Rawfile, Partition. Replicated Volumes: Jiva (CSI), cStor (CSI), Mayastor (beta).
31. OpenEBS 3.0
● GA:
a. cStor CSI
b. Local PV ZFS
c. Local PV LVM
d. Local PV Hostpath
● Beta:
a. Dynamic NFS
b. Mayastor
c. Jiva CSI
d. Local PV Rawfile
● Alpha:
a. Device (Partition)
● New management components:
a. Upgrade and Migration Operators
b. OpenEBS CLI
c. Monitoring Mixins
d. Kyverno Policy Add-on
● Deprecates the cStor and Jiva external provisioners
32. OpenEBS Data Engine comparison
Feature                      Hostpath  Device  Rawfile  LVM  ZFS  Jiva  cStor  Mayastor
Dynamic Provisioned Volumes  Yes       Yes     Yes      Yes  Yes  Yes   Yes    Yes
Capacity Management          No        No      Yes      Yes  Yes  Yes   Yes    Yes
Snapshots                    No        No      No       Yes  Yes  No    Yes    Yes*
Incremental Backup           No        No      No       No   Yes  No    Yes    Yes*
Clones                       No        No      No       No   Yes  No    Yes    Yes*
Performance                  Yes       Yes     Yes      Yes  No   No    No     Yes
Node Failure (HA)            No        No      No       No   No   Yes   Yes    Yes
34. OpenEBS Local PV - Use cases
Local PVs are great for cloud native workloads (or distributed systems) that have:
● Built-in proxies to distribute the data
● Built-in backup and migration solutions
● A need for low-latency access
They also suit short-lived stateful workloads that need to save state and resume after a reboot (e.g. ML jobs), and edge nodes running single-node Kubernetes clusters.
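A hedged sketch of such a workload: a StatefulSet whose volumeClaimTemplates request Local PVs, assuming the distributed datastore handles its own replication (the names, image, and sizes below are all illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: demo-db            # illustrative name
spec:
  serviceName: demo-db
  replicas: 3              # one replica per node; the app distributes the data itself
  selector:
    matchLabels:
      app: demo-db
  template:
    metadata:
      labels:
        app: demo-db
    spec:
      containers:
      - name: db
        image: mongo:4.4   # stand-in for any datastore with built-in replication
        volumeMounts:
        - name: data
          mountPath: /data/db
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      storageClassName: openebs-hostpath   # OpenEBS Local PV hostpath class
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 5Gi
```

Each pod gets its own node-local volume, so a node failure takes out only one replica while the application-level replication keeps the data available.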
35. OpenEBS LocalPV options
[Diagram: a stateful application running inside a Pod in Kubernetes consumes a Persistent Volume backed by one of three LocalPV flavors: LocalPV Hostpath (Node 3), LocalPV Device on XFS or EXT4 (Node 1, where NDM knows if a disk is in use), or ZFS/LVM LocalPV carved from a user-defined pool on physical hard disks (Node 2). Steps shown: (1) create a LocalPV StorageClass, (2) the application claims a Persistent Volume, (3) the volume is created on the node's local storage.]
36. OpenEBS 3.0 (Local PV)
OpenEBS Local Storage Operators make it easy to provision Local Volumes with different flavors of
local storage available on nodes.
● OpenEBS Hostpath LocalPV (stable), the first and most widely used LocalPV, now supports enforcing XFS quotas and using a custom node label for node affinity (instead of the default kubernetes.io/hostname).
● OpenEBS ZFS LocalPV (stable), used widely for production workloads that need direct and resilient storage, has added new capabilities like:
○ A Velero plugin to perform incremental backups that make use of copy-on-write ZFS snapshots.
○ CSI capacity-based scheduling, used with WaitForFirstConsumer-bound Persistent Volumes.
○ Improvements to the inbuilt volume scheduler (used with Immediate-bound Persistent Volumes) that can now take into account the capacity and the count of volumes provisioned per node.
● OpenEBS LVM LocalPV (stable) can be used to provision volumes on top of LVM Volume Groups and supports the following features:
○ Thick (default) or thin provisioned volumes
○ CSI capacity-based scheduling, used with WaitForFirstConsumer-bound Persistent Volumes.
○ Snapshots that translate into LVM snapshots
○ Ability to set QoS on containers using LVM volumes.
○ Other CSI capabilities like volume expansion, raw or filesystem mode, and metrics.
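Putting the LVM LocalPV features together, a StorageClass sketch might look like the following (the class name and volume group name are assumptions; the volume group must already exist on the nodes):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvm        # illustrative name
provisioner: local.csi.openebs.io
allowVolumeExpansion: true        # volume expansion is one of the listed CSI capabilities
parameters:
  storage: "lvm"
  volgroup: "lvmvg"        # assumed pre-created LVM Volume Group on each node
  thinProvision: "yes"     # omit for the default thick provisioning
volumeBindingMode: WaitForFirstConsumer   # enables CSI capacity-based scheduling
```

With WaitForFirstConsumer, the CSI driver can report per-node free capacity in the volume group so the scheduler places the pod on a node that can actually fit the volume.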
38. OpenEBS Replicated PV - Use cases
Replicated PVs are great for cloud native workloads (or distributed systems) that need:
● Performance
● Resiliency against single-node and/or single-device failure
● Low-latency access
Replicated PVs are also great if you would like to:
● Lower your blast radius while still using bin-packing to use your hardware resources efficiently
● Efficiently use the capacity and performance of NVMe devices (with Mayastor)
40. OpenEBS 3.0 (Replicated PV)
OpenEBS Replicated Volumes enable users to make use of the local storage available on Kubernetes nodes to provide durable persistent volumes that are resilient to node failures. The name "replicated" stems from the fact that OpenEBS uses synchronous replication of volumes instead of sharding blocks across different nodes.
● OpenEBS Jiva (stable) has added support for a CSI Driver and a Jiva operator that include features like:
○ Enhanced management of the replicas
○ Ability to auto-remount volumes that were marked read-only due to iSCSI timeouts back to read-write
○ Faster detection of node failure, helping Kubernetes move the application from the failed node to a new node
● OpenEBS CStor (stable) has added support for a CSI Driver and improved custom resources and operators for managing the lifecycle of CStor Pools. The 3.0 version of CStor includes:
○ An improved schema that allows users to declaratively run operations like replacing disks in mirrored CStor pools, adding new disks, scaling up replicas, or moving CStor Pools to a new node. The new custom resource for configuring CStor is called CStorPoolCluster (CSPC), replacing the older StoragePoolCluster (SPC).
○ Ability to auto-remount volumes that were marked read-only due to iSCSI timeouts back to read-write
○ Faster detection of node failure, helping Kubernetes move the application from the failed node to a new node
● 3.0 also deprecates the older CStor and Jiva volume provisioners that were based on the Kubernetes external storage provisioner. No new features will be added to the older provisioners, and users are requested to migrate their Pools and Volumes to the CSI Drivers as soon as possible.
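As a sketch of the declarative CSPC schema described above, reusing the pool and class names that appear in the CLI output later in this deck (the node and block device names are taken from the deck's example cluster and are illustrative):

```yaml
# CStorPoolCluster (CSPC): declaratively describes cStor pools across nodes.
apiVersion: cstor.openebs.io/v1
kind: CStorPoolCluster
metadata:
  name: cstor-disk-pool
  namespace: openebs
spec:
  pools:
  - nodeSelector:
      kubernetes.io/hostname: gke-kmova-helm-default-pool-595accd4-pgtf
    dataRaidGroups:
    - blockDevices:
      - blockDeviceName: blockdevice-2eff94561dab533cabfeb6b4ddbbe851
    poolConfig:
      dataRaidGroupType: stripe
---
# StorageClass backed by the pool cluster above, served by the cStor CSI driver.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cstor-csi-disk
provisioner: cstor.csi.openebs.io
allowVolumeExpansion: true
parameters:
  cas-type: cstor
  cstorPoolCluster: cstor-disk-pool
  replicaCount: "3"        # three synchronous replicas across pool nodes
```

Disk replacement, adding disks, or moving a pool are then edits to the CSPC spec that the operators reconcile, rather than imperative commands.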
41. OpenEBS Mayastor
[Diagram: a stateful application (STS or Deployment) running inside a Pod in Kubernetes consumes a Persistent Volume served by Mayastor instances on Node 1, Node 2, and Node 3. Steps shown: create Mayastor pools on all storage nodes, create a Mayastor StorageClass, and provision a Persistent Volume for the application.]
42. OpenEBS Mayastor (Beta In Progress)
Mayastor delivers high-performance access to persistent data and services, using the industry-leading Storage Performance Development Kit (SPDK).
● Uses SPDK for NVMe features
○ Poll-mode and event-loop design for
maximum performance
○ Memory utilization tuned for environments
with limited huge pages
○ Scales within the node and across nodes
● Implemented in Rust for memory safety
guarantees
● Configuration management using secure gRPC
API
● Volume Services
○ Resilient against node failures via
synchronous replication
Control Plane Improvements
● Control plane implements
application aware data placement
● Fine grained control over errors,
restarts and timeouts for
Kubernetes
● Prometheus Metrics exporter
● Integrate Mayastor into OpenEBS
tools - installer, CLI, monitoring
Core Enhancements
● Reduce fail-over time in loss of K8s
node situation
● Support for LVM as backing store
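A minimal sketch of wiring Mayastor up, assuming the beta-era CRDs (the pool, node, and device names are illustrative, and the CRD API group and version have shifted across Mayastor releases):

```yaml
# A Mayastor pool carves volumes out of a raw device on one node.
apiVersion: openebs.io/v1alpha1
kind: MayastorPool
metadata:
  name: pool-on-node1      # illustrative name
  namespace: mayastor
spec:
  node: node1              # illustrative node name
  disks: ["/dev/sdb"]      # illustrative device
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-3
provisioner: io.openebs.csi-mayastor
parameters:
  repl: "3"                # three synchronous replicas for node-failure resiliency
  protocol: "nvmf"         # expose the volume over NVMe-oF
```

With repl: "3", each volume is synchronously replicated across three pools, which is what provides the node-failure resiliency listed under Volume Services.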
43. OpenEBS Mayastor (Beta In Progress)
[Diagram: OpenEBS 3.1 Mayastor with ANA (Asymmetric Namespace Access) volumes, enabling faster HA across nodes.]
44. OpenEBS 3.0 (Other Features)
Beyond the improvements to the data engines and their corresponding control plane, there are several new enhancements that will help
with ease of use of OpenEBS engines:
● Several fixes and enhancements to the Node Disk Manager like automatically adding a reservation tag to devices, detecting
filesystem changes and updating the block device CR (without the need for a reboot), metrics exporter and an API service that can
be extended in the future to implement storage pooling or cleanup hooks.
● Dynamic NFS Provisioner that allows users to launch a new NFS server on any RWO volume (called backend volume) and expose an
RWX volume that saves the data to the backend volume.
● Kubernetes Operator for automatically upgrading Jiva and CStor volumes, driven by a Kubernetes Job
● Kubernetes Operator for automatically migrating CStor Pools and Volumes from older pool schema and legacy (external storage
based) provisioners to the new Pool Schema and CSI volumes respectively.
● OpenEBS CLI (a kubectl plugin) for easily checking the status of the block devices, pools (storage) and volumes (PVs).
● OpenEBS Dashboard (a Prometheus and Grafana mixin) that can be installed via jsonnet or helm chart with a set of default Grafana dashboards and AlertManager rules for OpenEBS storage engines.
● Enhanced OpenEBS helm chart that can easily enable or disable a data engine of choice. The 3.0 helm chart stops installing the legacy CStor and Jiva provisioners; if you would like to continue using them, you have to set the flag "legacy.enabled=true".
● OpenEBS helm chart includes sample kyverno policies that can be used as an option for PodSecurityPolicies(PSP) replacement.
● OpenEBS images are delivered as multi-arch images with support for AMD64 and ARM64 and hosted on DockerHub, Quay and
GHCR.
● Support for installation in air gapped environments.
● Enhanced Documentation and Troubleshooting guides for each of the engines located in the respective engine repositories.
● A new and improved design for the OpenEBS website.
45. OpenEBS NFS (RWX Volumes)

kubectl apply -f https://openebs.github.io/charts/nfs-operator.yaml

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-rwx
  annotations:
    openebs.io/cas-type: nfsrwx
    cas.openebs.io/config: |
      - name: NFSServerType
        value: "kernel"
      - name: BackendStorageClass
        value: "openebs-hostpath"
provisioner: openebs.io/nfsrwx
reclaimPolicy: Delete

NAME                                      READY   STATUS    RESTARTS   AGE
openebs-nfs-provisioner-79b6ccd59-626pd   1/1     Running   0          62s

[Diagram: the provisioner creates an NFS server on top of a backend PV, then creates an NFS PV pointing to the OpenEBS NFS server.]
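To consume the class above, an application requests a ReadWriteMany claim; a minimal sketch (the claim name and size are illustrative):

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: shared-data        # illustrative name
spec:
  storageClassName: openebs-rwx
  accessModes:
  - ReadWriteMany          # pods on many nodes mount the same NFS-backed volume
  resources:
    requests:
      storage: 5Gi
```

Behind the scenes, the provisioner carves an RWO volume from the BackendStorageClass (openebs-hostpath here), runs an NFS server on it, and hands the application an RWX volume pointing at that server.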
46. OpenEBS CLI

kubectl krew install openebs
$ kubectl openebs version
COMPONENT VERSION
Client v0.4.0
OpenEBS CStor 3.0.0
OpenEBS Jiva Not Installed
OpenEBS LVM LocalPV Not Installed
OpenEBS ZFS LocalPV Not Installed
$ kubectl openebs get bd
NAME PATH SIZE CLAIMSTATE STATUS FSTYPE MOUNTPOINT
gke-kmova-helm-default-pool-595accd4-pgtf
├─blockdevice-2eff94561dab533cabfeb6b4ddbbe851 /dev/sdb 375GiB Unclaimed Active ext4 /mnt/disks/ssd0
├─blockdevice-a2247055ab6c06d27db1de47e61c3ac9 /dev/sdc1 375GiB Unclaimed Active
└─blockdevice-b90456e7143408f1c29738c4d4deafec /dev/sdd 375GiB Unclaimed Active ext4 /mnt/disks/ssd2
gke-kmova-helm-default-pool-595accd4-bwcd
├─blockdevice-3c679953243dfc1344d2a4ac352f4c6e /dev/sdd 375GiB Unclaimed Active ext4 /mnt/disks/ssd2
├─blockdevice-a5158511cf50b507e96fd628dca05af0 /dev/sdc1 375GiB Unclaimed Active
└─blockdevice-bc795daa24fc3589ee2f8b835bcdcba6 /dev/sdb 375GiB Unclaimed Active ext4 /mnt/disks/ssd0
47. OpenEBS CLI
$ kubectl openebs describe volume pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3
pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3 Details :
-----------------
NAME : pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3
ACCESS MODE : ReadWriteOnce
CSI DRIVER : cstor.csi.openebs.io
STORAGE CLASS : cstor-csi-disk
VOLUME PHASE : Bound
VERSION : 3.0.0
CSPC : cstor-disk-pool
SIZE : 10.0GiB
STATUS : Degraded
REPLICA COUNT : 3
Portal Details :
------------------
IQN : iqn.2016-09.com.openebs.cstor:pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3
VOLUME NAME : pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3
TARGET NODE NAME : gke-kmova-helm-default-pool-595accd4-bwcd
PORTAL : 10.3.248.245:3260
TARGET IP : 10.3.248.245
Replica Details :
-----------------
NAME TOTAL USED STATUS AGE
pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3-cstor-disk-pool-clz4 296.9KiB 5.4MiB Healthy 1m2s
pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3-cstor-disk-pool-h8b9 296.9KiB 5.6MiB Healthy 1m2s
pvc-dea356c2-7bd0-442e-a92f-98d503c65fb3-cstor-disk-pool-jznw 300.8KiB 5.5MiB Healthy 1m2s
50. OpenEBS 3.1 ( Planning )
Stateful Operator (STS) with Local PV
● Fault tolerant scheduling for
distributed applications
● Stale PVCs
● Moving Data of Local PV on K8s
upgrade/node-recycle (Data
Populator)
Engineering Optimizations
● CI Infrastructure improvements
● Unified Local CSI Drivers
● NDM - Enclosure / Storage
Management
● Usability Enhancements based on user feedback (upgrades, pool creation, …)
● Automated Security Compliance
Checks
Local PV on Shared Device
● Devices visible to multiple nodes via a shared filesystem, e.g. Cluster LVM.
● Allow pods to move across
nodes that have access to
device.
● Remote access via iSCSI / NVMe
Integration Hooks
● Setting up finalizers or other
metadata on Volume related
objects for add-on operators. Eg:
Billing/Auditing by Platform
operators
Mayastor Beta
51. OpenEBS 3.1 (LocalPV ++)
[Diagram: OpenEBS 3.1 Local PV ++ with shared devices: local volumes become HA because the same devices are visible to multiple nodes.]
52. OpenEBS 3.1 (LocalPV ++)
[Diagram: OpenEBS 3.1 Local PV ++ with shared devices plus remote access via NVMe: a CAS instance exposes the local volumes (HA) to other nodes.]
53. OpenEBS 3.1 (LocalPV ++)
[Diagram: the same shared-device topology, with CAS instances on multiple nodes providing remote access to the local volumes (HA) via NVMe.]
54. OpenEBS 3.1 Storage Cohort
A storage cohort is an autonomous storage
unit that consists of a set of storage devices
(grouped together as storage pool) and a
storage software running on the nodes
attached to the devices.
The storage software (or storage controller, a.k.a. SDS) helps create and manage storage volumes, and also creates and manages the corresponding targets that storage initiators can talk to for any I/O operations.
60. OpenEBS Future Deployments
[Diagram: any workload, any cluster. OpenEBS 4.0 with NVMe multipath: Mayastor with ANA serving volumes over node-local NVMe devices and a network fabric for NVMe.]
61. OpenEBS Volume Types (Recap)
[Diagram, left to right: OpenEBS Local PV; OpenEBS Local PV (Shared Device); OpenEBS Local PV ++ (Shared Local Device, accessed over NVMe); OpenEBS Replicated (Mayastor) over node-local NVMe devices.]
62. OpenEBS Storage Cohort
A storage cohort is an autonomous storage
unit that consists of a set of storage devices
(grouped together as storage pool) and a
storage software running on the nodes
attached to the devices.
The storage software (or storage controller, a.k.a. SDS) helps create and manage storage volumes, and also creates and manages the corresponding targets that storage initiators can talk to for any I/O operations.