Gluster.Next is an architectural evolution of the Gluster storage platform spanning multiple releases to improve scalability, flexibility, and management. Key changes include distributed hash tables to improve directory operations, server-side replication for flexibility and flash utilization, sharding to spread data blocks, and GlusterD2 for modularity and REST interfaces. These changes will enable storage as a service and platform capabilities at massive scales of thousands of nodes. The first phase debuts in Gluster 3.8 and the full vision in Gluster 4.0 to support new workloads like containers and hyperconverged environments.
Red Hat Gluster Storage - Direction, Roadmap and Use-CasesRed_Hat_Storage
Red Hat Gluster Storage is open, software-defined storage that helps you manage big, unstructured, and semistructured data. This product is based on the open source project GlusterFS, a distributed scale-out file system technology, and focuses on file sharing, analytics, and hyper-converged use cases.
In this session, you will:
See real-life case studies about Red Hat Gluster Storage’s usage in production environments, including ideal workloads.
Learn about the Red Hat Gluster Storage roadmap, including innovations from the GlusterFS community pipeline.
Gain insights into how the product will be integrated with Red Hat Enterprise Virtualization (including hyperconvergence), Red Hat Satellite, and Red Hat Enterprise Linux OpenStack Platform.
"Data classification" is an umbrella term covering things: locality-aware data placement, SSD/disk or normal/deduplicated/erasure-coded data tiering, HSM, etc. They share most of the same infrastructure, and so are proposed (for now) as a single feature.
Red Hat Gluster Storage - Direction, Roadmap and Use-CasesRed_Hat_Storage
Red Hat Gluster Storage is open, software-defined storage that helps you manage big, unstructured, and semistructured data. This product is based on the open source project GlusterFS, a distributed scale-out file system technology, and focuses on file sharing, analytics, and hyper-converged use cases.
In this session, you will:
See real-life case studies about Red Hat Gluster Storage’s usage in production environments, including ideal workloads.
Learn about the Red Hat Gluster Storage roadmap, including innovations from the GlusterFS community pipeline.
Gain insights into how the product will be integrated with Red Hat Enterprise Virtualization (including hyperconvergence), Red Hat Satellite, and Red Hat Enterprise Linux OpenStack Platform.
"Data classification" is an umbrella term covering things: locality-aware data placement, SSD/disk or normal/deduplicated/erasure-coded data tiering, HSM, etc. They share most of the same infrastructure, and so are proposed (for now) as a single feature.
In this session, we'll discuss new volume types in Red Hat Gluster Storage. We will talk about erasure codes and storage tiers, and how they can work together. Future directions will also be touched on, including rule based classifiers and data transformations.
You will learn about:
How erasure codes lower the cost of storage.
How to configure and manage an erasure coded volume.
How to tune Gluster and Linux to optimize erasure code performance.
Using erasure codes for archival workloads.
How to utilize an SSD inexpensively as a storage tier.
Gluster's erasure code and storage tiering design.
OSBConf 2015 | Scale out backups with bareos and gluster by niels de vosNETWAYS
During this talk, Niels will explain the basics of Gluster and show how Bareos integrates with it. Gluster provides a Software Defined Storage environment that can scale-out when the backup storage needs to grow. With a live demonstration Niels shows how simple it is to setup a small Gluster environment and configure Bareos to use the native Gluster protocol.
In this session, we'll discuss new volume types in Red Hat Gluster Storage. We will talk about erasure codes and storage tiers, and how they can work together. Future directions will also be touched on, including rule based classifiers and data transformations.
You will learn about:
How erasure codes lower the cost of storage.
How to configure and manage an erasure coded volume.
How to tune Gluster and Linux to optimize erasure code performance.
Using erasure codes for archival workloads.
How to utilize an SSD inexpensively as a storage tier.
Gluster's erasure code and storage tiering design.
OSBConf 2015 | Scale out backups with bareos and gluster by niels de vosNETWAYS
During this talk, Niels will explain the basics of Gluster and show how Bareos integrates with it. Gluster provides a Software Defined Storage environment that can scale-out when the backup storage needs to grow. With a live demonstration Niels shows how simple it is to setup a small Gluster environment and configure Bareos to use the native Gluster protocol.
For over a decade Network Attached Storage (NAS) was the go to file storage device for organizations needing to store large amounts of unstructured data. But unstructured data is changing. While large file use cases are still prevalent, small file use cases are becoming more dominant. Workloads like artificial intelligence, analytics and IoT are typically driven by millions, if not billions of small files.
Object Storage is often hailed as the heir apparent but is it? Can file systems be redesigned to continue to support traditional NAS workloads while also supporting modern, small file and high velocity workloads? Join Storage Switzerland and Qumulo for our webinar, “NAS vs. Object — Can NAS Make a Comeback,” to learn the state of unstructured data storage and if NAS file systems can provide a superior alternative to object storage.
Join us on our event to learn
•. Why traditional NAS solutions fall short
•. Why object storage systems haven't replaced NAS
• How to bridge the gap by modernizing NAS file systems
•. Live Q&A with file system and NAS experts
In this day and age, data grows so fast it’s not uncommon for those of us using a relational database to reach the limits of its capacity. In this session, Kwangbock Lee explains how Samsung uses ClustrixDB to handle fast-growing data without manual database sharding. He highlights lessons learned, including a few hiccups along the way, and shares Samsung's experience migrating to ClustrixDB.
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022StreamNative
In today’s world, we are seeing a big shift toward the Cloud. With this shift comes a big shift in the expectations we have for a messaging system, especially when the messaging system is presented as managed service in a large-scale, multi-tenant environment. For any large-scale enterprise, it’s very important to evaluate messaging system and be confident before expanding complex distributed data systems like Apache Pulsar from on-premise to elastically scalable, fully managed services on cloud services. We must consider aspects such as: migration from and integration with large-scale on-premise clusters, security, cost efficiency, and the cloud friendliness of the architecture, modeling cost and capacity, tenant isolation, deployment robustness, availability, monitoring, etc. Not every messaging system is built to be cloud-native and run as a managed service with cost efficiency. We have been running large-scale Apache Pulsar at Yahoo for the last 8 years on various platforms and hardware configurations while meeting application SLAs and serving more than 1M topics in a cluster. In this talk, we will talk about Pulsar’s journey in Yahoo! from an on-premise platform to a hybrid cloud and on-premise system. We will talk about Pulsar’s architecture and features that make Pulsar a good cloud-native messaging-system choice for any enterprise.
Hybrid collaborative tiered storage with alluxioThai Bui
Systems that deal with AWS S3 often come with a negative performance impact. There's no co-location and the data has to move through slower, often congested wire networks. Alluxio can provide a caching layer for the data, however there's still the question of how and when to move which data. Should all the data by default be cached or should they be cached when used? In this talk, I will explore that gray area in between where the users and the dataset publishers will collaborate to decide what and how the data is cache in a tiered-storage architecture to maximize performance and minimize operating costs.
This session will cover performance-related developments in Red Hat Gluster Storage 3 and share best practices for testing, sizing, configuration, and tuning.
Join us to learn about:
Current features in Red Hat Gluster Storage, including 3-way replication, JBOD support, and thin-provisioning.
Features that are in development, including network file system (NFS) support with Ganesha, erasure coding, and cache tiering.
New performance enhancements related to the area of remote directory memory access (RDMA), small-file performance, FUSE caching, and solid state disks (SSD) readiness.
2018 Infortrend EonStor GSe Pro Family Introductioninfortrendgroup
EonStor GSe Pro is a single controller solution with support for SATA and SSD drives exclusively designed for SMBs, integrating SAN and NAS services as well as Cloud Gateway features to be deployed under a general NAS architecture for file sharing, backup, cloud synchronization, video editing, and surveillance purposes. Its complete product line features a hardware design of rackmount or desktop enclosures and flexible host boards to choose from; as for software, it comes with complete data services and simple, intuitive management interfaces, so that you can find the perfect storage device according to your performance or budget needs.
More Information:
https://www.infortrend.com/global/products/GSePro
Recommended Series:
https://www.infortrend.com/global/products/families/GSePro/pro3000
https://www.infortrend.com/global/products/families/GSePro/pro1000
https://www.infortrend.com/global/products/families/GSePro/Pro%20200
Apache Hadoop 3 is coming! As the next major milestone for hadoop and big data, it attracts everyone's attention as showcase several bleeding-edge technologies and significant features across all components of Apache Hadoop: Erasure Coding in HDFS, Docker container support, Apache Slider integration and Native service support, Application Timeline Service version 2, Hadoop library updates and client-side class path isolation, etc. In this talk, first we will update the status of Hadoop 3.0 releasing work in apache community and the feasible path through alpha, beta towards GA. Then we will go deep diving on each new feature, include: development progress and maturity status in Hadoop 3. Last but not the least, as a new major release, Hadoop 3.0 will contain some incompatible API or CLI changes which could be challengeable for downstream projects and existing Hadoop users for upgrade - we will go through these major changes and explore its impact to other projects and users.
How to Build a Cloud Native Stack for Analytics with Spark, Hive, and Alluxio...Alluxio, Inc.
Alluxio Austin Meetup
Aug 15, 2019
Speakers: Tim Kelly & Thai Bui, Bazaarvoice
At Bazaarvoice, a software-as-a-service digital marketing company, the data engineering team is tasked to handle data at massive Internet-scale to serve over 1,900 of the biggest internet retailers and brands.
We built our data pipelines all in the cloud using Apache Spark and Hive on AWS EC2 accessing data in S3. AWS enables us to scale “out” the infrastructure capacity effortlessly to keep up with the Internet-scale data and web traffic, but scaling out also exposes certain limitations like the ability to further scale “up”. While this cloud native stack is scalable and elastic we experience performance limitations, because data access is limited by the network bandwidth, and this is exacerbated for workloads that involve repeated queries.
To address the data access challenges, we leverage Alluxio, an open source data orchestration system for analytics in the cloud. Alluxio serves as a transparent caching layer for hot and warm data, such that Hive and Spark jobs are able to access all data transparently in S3. We have seen 10x performance acceleration of Spark and Hive jobs on S3 with Alluxio.
In this talk, we'll walk through RocksDB technology and look into areas where MyRocks is a good fit by comparison to other engines such as InnoDB. We will go over internals, benchmarks, and tuning of MyRocks engine. We also aim to explore the benefits of using MyRocks within the MySQL ecosystem. Attendees will be able to conclude with the latest development of tools and integration within MySQL.
One of the benefits of using MySQL is the fact you can choose a storage engine fitting your requirements. Even though InnoDB is prevalent, it is helpful to know the available choices. Lately, LSM-tree based storage engines and engines using ideas of the technique, are seeing a rise in popularity and use. MyRocks brings LSM-tree benefits and tradeoffs to MySQL. In this talk, we'll walk through RocksDB technology and look into areas where MyRocks is a good fit by comparison to other engines such as InnoDB. We will go over internals, benchmarks, and tuning of MyRocks engine. We also aim to explore the benefits of using MyRocks within the MySQL ecosystem. Attendees will be able to conclude with the latest development of tools and integration within MySQL.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
5. Gluster.Today
- Scale-out storage system
- Aggregates storage exports over network interconnects to provide an
unified namespace
- File, Object, API and Block interfaces
- Layered on disk file systems that support extended attributes
6. Gluster.Today - Features
● Scale-out NAS
○ Elasticity, quotas
● Data Protection and Recovery
○ Volume Snapshots
○ Synchronous & Asynchronous Replication
○ Erasure Coding
● Archival
○ Read-only, WORM
● Native CLI / API for management
7. Gluster.Today - Features
● Isolation for multi-tenancy
○ SSL for data/connection, Encryption at rest
● Performance
○ Data, metadata and readdir caching, tiering (in beta)
● Monitoring
○ Built in io statistics, /proc like interface for introspection
● Provisioning
○ Puppet-gluster, gdeploy
● More..
8. What is Gluster.Next?
- Gluster.today
- Client driven Distribution, Replication and Erasure Coding
(with FUSE)
- Spread across 3 - 100s of nodes
- Geared towards “Storage as a Platform”
9. What is Gluster.Next?
- Gluster.Next
- Architectural Evolution Spanning over multiple releases (3.8 &
4.0)
- Scale-out to 1000s of nodes
- Choice of Distribution, Replication and Erasure Coding on
servers or clients
- Geared towards “Storage as a Platform” and “Storage as a
Service”
- Native REsTful management & eventing for monitoring
11. Terminology
● Trusted Storage Pool - Collection of Storage
Servers
● Brick - An export directory on a server
● Volume - Logical collection of bricks
13. DHT 2
● Problem: directories on all subvolumes
○ directory ops can take O(n) messages
○ not just mkdir or traversals - also create, rename,
and so on
● Solution: each directory on one subvolume
○ can still be replicated etc.
○ each brick can hold data, metadata, or both
○ by default, each is both just like current Gluster
14. DHT 2 (continued)
● Improved layout handling
○ central (replicated) instead of per-brick
○ less space, instantaneous “fix-layout” step
○ layout generations help with lookup efficiency
● Flatter back-end structure
○ makes GFID-based lookups more efficient
○ good for NFS, SMB
15. NSR (name change coming soon)
● Server-side with temporary leader
○ vs. client-side, client-driven
○ can exploit faster/separate server network
● Log/journal based
○ can exploit flash/NVM (“poor man’s tiering”)
● More flexible consistency options
○ fully sync, ordered async, hybrids
○ can replace geo-replication for some use cases
16. NSR Illustrated (data flow)
Client
Server A
(leader)
Server B Server C
no bandwidth splitting
traffic on server network
17. NSR Illustrated (log structure)
Server
client requests
Log 1
(newest)
Log 2
Brick
Filesystem
SSD/NVM for logs
18. Sharding
● Spreads data blocks across a gluster volume
● Primarily targeted for VM image storage
● File sizes not bound by brick or disk limits
● More efficient healing, rebalance and geo-
replication
● Yet another translator in Gluster
20. Network QoS
● Necessary to avoid hot spots at high scale
○ avoid starvation, cascading effects
● A single activity or type of traffic (e.g. self-heal or
rebalance) can be:
○ directed toward a separate network
○ throttled on a shared network
● User gets to control front-end impact vs. recovery time
21. GlusterD2
● More efficient/stable membership
○ especially at high scale
● Stronger configuration consistency
● Modularity and plugins
● Exposes ReST interfaces for management
● Core implementation in Go
26. Event Framework
● Export node and volume events in a more
consumable way
● Support external monitoring and
management
● Currently: via storaged (DBus)
○ yeah, it’s a Red Hat thing
27. Brick Management
● Multi-tenancy, snapshots, etc. mean more
bricks to manage
○ possibly exhaust cores/memory
● Heketi can help
○ can also make things worse (e.g. splitting bricks)
● One daemon/process must handle multiple
bricks to avoid contention/thrashing
○ core infrastructure change, many moving parts
29. Why Gluster.Next?
- Paradigm Changes in IT consumption
- Storage as a Service & Storage as a Platform
- Private, Public & Hybrid clouds
- New Workloads
- Containers, IoT, <buzz-word> demanding scale
- Economics of Scale
30. Why Gluster.Next: StaaS
- User/Tenant driven provisioning
of shares.
- Talk to as many Gluster clusters
and nodes
- Tagging of nodes for
differentiated classes of service
- QoS for preventing noisy
neighbors
31. Why Gluster.Next: Containers
- Persistent storage for stateless Containers
- Non-shared/Block : Gluster backed file through iSCSI
- Shared/File: Multi-tenant Gluster Shares / Volumes
- Shared Storage for container registries
- Geo-replication for DR
- Heketi to ease provisioning
- “Give me a non-shared 5 GB share”
- “Give me a shared 1 TB share”
- Shared Storage use cases being integrated with Docker, Kubernetes &
OpenShift
32. Why Gluster.Next: Hyperconvergence with VMs
- Gluster Processes are
lightweight
- Benign self-healing and
rebalance with sharding
- oVirt & Gluster - already
integrated management
controller
- geo-replication of shards
possible!
37. Other Stuff
● IPv6 support
● “Official” FreeBSD support
● Compression
● Code generation
○ reduce duplication, technical debt
○ ease addition of new functionality
● New tests and test infrastructure