This session will cover performance-related developments in Red Hat Gluster Storage 3 and share best practices for testing, sizing, configuration, and tuning.
Join us to learn about:
Current features in Red Hat Gluster Storage, including 3-way replication, JBOD support, and thin-provisioning.
Features that are in development, including network file system (NFS) support with Ganesha, erasure coding, and cache tiering.
New performance enhancements in the areas of remote direct memory access (RDMA), small-file performance, FUSE caching, and solid-state disk (SSD) readiness.
Gluster for Geeks: Performance Tuning Tips & Tricks (GlusterFS)
In this Gluster for Geeks technical webinar, Jacob Shucart, Senior Systems Engineer, will provide useful tips and tricks to make a Gluster cluster meet your performance requirements. He will review considerations for all different phases including planning, configuration, implementation, tuning, and benchmarking.
Topics covered will include:
• Protocols (CIFS, NFS, GlusterFS)
• Hardware configuration
• Tuning parameters
• Performance benchmarks
Amazon Aurora is a MySQL-compatible database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. The service is now in preview. Come to our session for an overview of the service and learn how Aurora delivers up to five times the performance of MySQL yet is priced at a fraction of what you'd pay for a commercial database with similar performance and availability.
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C... (Odinot Stanislas)
After a short introduction to distributed storage and a description of Ceph, Jian Zhang presents some interesting benchmarks in this deck: sequential tests, random tests, and above all a comparison of results before and after optimization. The configuration parameters touched and the optimizations (large page numbers, OMAP data on a separate disk, ...) deliver at least a 2x performance gain.
Instaclustr has a diverse customer base including Ad Tech, IoT, and messaging applications, ranging from small startups to large enterprises. In this presentation we share our experiences, common issues, diagnosis methods, and some tips and tricks for managing your Cassandra cluster.
About the Speaker
Brooke Jensen VP Technical Operations & Customer Services, Instaclustr
Instaclustr is the only provider of fully managed Cassandra as a Service in the world. Brooke Jensen manages our team of Engineers that maintains the operational performance of our diverse fleet of clusters, as well as providing 24/7 advice and support to our customers. Brooke has over 10 years' experience as a Software Engineer, specializing in performance optimization of large systems, and has extensive experience managing and resolving major system incidents.
Everything you wanted to know about iSCSI, but were afraid to ask. Presented by David Black, PhD, who was a fundamental influence on the development of the protocol. This presentation is really fantastic!
Understanding the Windows Server Administration Fundamentals (Part 2) (Tuan Yang)
Windows Server Administration is an advanced computer networking topic that includes server installation and configuration, server roles, storage, Active Directory and Group Policy, file, print, and web services, remote access, virtualization, application servers, troubleshooting, performance, and reliability. With these slides, explore the key fundamentals of the Windows Server Administration.
Learn more about:
» Storage technologies.
» File Systems.
» HDD management.
» Troubleshooting methodology.
» Server boot process.
» System configuration.
» System monitoring.
» High Availability & fault tolerance.
» Backups.
Tech Talk: RocksDB Slides by Dhruba Borthakur & Haobo Xu of Facebook (The Hive)
This presentation describes the reasons why Facebook decided to build yet another key-value store, the vision and architecture of RocksDB and how it differs from other open source key-value stores. Dhruba describes some of the salient features in RocksDB that are needed for supporting embedded-storage deployments. He explains typical workloads that could be the primary use-cases for RocksDB. He also lays out the roadmap to make RocksDB the key-value store of choice for highly-multi-core processors and RAM-speed storage devices.
In computing, ZFS is a combined file system and logical volume manager designed by Sun Microsystems, a subsidiary of Oracle Corporation. The features of ZFS include support for high storage capacities, integration of the concepts of file system and volume management, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, RAID-Z, and native NFSv4 ACLs. Unlike traditional file systems, which reside on single devices and thus require a volume manager to use more than one device, ZFS file systems are built on top of virtual storage pools called zpools. A zpool is constructed of virtual devices (vdevs), which are themselves constructed of block devices: files, hard drive partitions, or entire drives, with the last being the recommended usage. Thus, a vdev can be viewed as a group of hard drives, and a zpool consists of one or more groups of drives.
In addition, pools can have hot spares to compensate for failing disks. In addition, ZFS supports both read and write caching, for which special devices can be used. Solid State Devices can be used for the L2ARC, or Level 2 ARC, speeding up read operations, while NVRAM buffered SLC memory can be boosted with supercapacitors to implement a fast, non-volatile write cache, improving synchronous writes. Finally, when mirroring, block devices can be grouped according to physical chassis, so that the filesystem can continue in the face of the failure of an entire chassis. Storage pool composition is not limited to similar devices but can consist of ad-hoc, heterogeneous collections of devices, which ZFS seamlessly pools together, subsequently doling out space to diverse file systems as needed. Arbitrary storage device types can be added to existing pools to expand their size at any time. The storage capacity of all vdevs is available to all of the file system instances in the zpool. A quota can be set to limit the amount of space a file system instance can occupy, and a reservation can be set to guarantee that space will be available to a file system instance.
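The pool features described above map onto the standard zpool/zfs commands. As a hedged sketch (device names, pool and dataset names, and sizes below are illustrative assumptions, not a tested recipe):

```shell
# Create a pool from a RAID-Z2 vdev of six disks, with a hot spare
zpool create tank raidz2 sda sdb sdc sdd sde sdf spare sdg

# Add an SSD as L2ARC read cache, and a mirrored log device (SLOG)
# to accelerate synchronous writes
zpool add tank cache nvme0n1
zpool add tank log mirror nvme1n1 nvme2n1

# Carve file systems out of the pool; cap one with a quota and
# guarantee space to another with a reservation
zfs create tank/projects
zfs set quota=500G tank/projects
zfs create tank/home
zfs set reservation=100G tank/home
```

All datasets draw from the same pool, so the quota and reservation shape how the shared capacity is divided rather than pre-partitioning disks.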
In this session, we'll discuss new volume types in Red Hat Gluster Storage. We will talk about erasure codes and storage tiers, and how they can work together. Future directions will also be touched on, including rule based classifiers and data transformations.
You will learn about:
How erasure codes lower the cost of storage.
How to configure and manage an erasure coded volume.
How to tune Gluster and Linux to optimize erasure code performance.
Using erasure codes for archival workloads.
How to utilize an SSD inexpensively as a storage tier.
Gluster's erasure code and storage tiering design.
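Configuring an erasure-coded (dispersed) volume uses the standard gluster CLI. A minimal sketch, assuming six servers with one brick each (host and path names are invented for illustration):

```shell
# EC4+2: data is cut into 4 data + 2 redundancy fragments per stripe,
# so any 4 of the 6 bricks can reconstruct the data
gluster volume create ecvol disperse 6 redundancy 2 \
  server{1..6}:/bricks/brick1/ecvol
gluster volume start ecvol

# Mount from a client over the native protocol
mount -t glusterfs server1:/ecvol /mnt/ecvol
```

With 4+2, usable capacity is 4/6 of raw, versus 1/3 for 3-way replication.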
In this session, you'll learn how RBD works, including how it:
Uses RADOS classes to make access easier from user space and within the Linux kernel.
Implements thin provisioning.
Builds on RADOS self-managed snapshots for cloning and differential backups.
Increases performance with caching of various kinds.
Uses watch/notify RADOS primitives to handle online management operations.
Integrates with QEMU, libvirt, and OpenStack.
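The thin-provisioning, snapshot/clone, and differential-backup features listed above can be sketched with the rbd CLI (pool, image, and snapshot names are assumptions for illustration):

```shell
# Thin-provisioned 10 GiB image: space is allocated only as data is written
rbd create mypool/base --size 10240

# Self-managed snapshot, protected so it can serve as a clone parent
rbd snap create mypool/base@golden
rbd snap protect mypool/base@golden

# Copy-on-write clone, e.g. as a disk for a new VM
rbd clone mypool/base@golden mypool/vm1-disk

# Differential backup: export only blocks changed since the earlier snapshot
rbd snap create mypool/base@today
rbd export-diff --from-snap golden mypool/base@today backup-today.diff
```

The clone shares unmodified blocks with its parent snapshot, which is what makes provisioning many VM disks from one image cheap.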
Red Hat Gluster Storage - Direction, Roadmap and Use-CasesRed_Hat_Storage
Red Hat Gluster Storage is open, software-defined storage that helps you manage big, unstructured, and semistructured data. This product is based on the open source project GlusterFS, a distributed scale-out file system technology, and focuses on file sharing, analytics, and hyper-converged use cases.
In this session, you will:
See real-life case studies about Red Hat Gluster Storage’s usage in production environments, including ideal workloads.
Learn about the Red Hat Gluster Storage roadmap, including innovations from the GlusterFS community pipeline.
Gain insights into how the product will be integrated with Red Hat Enterprise Virtualization (including hyperconvergence), Red Hat Satellite, and Red Hat Enterprise Linux OpenStack Platform.
CRUSH is the powerful, highly configurable algorithm Red Hat Ceph Storage uses to determine how data is stored across the many servers in a cluster. A healthy Red Hat Ceph Storage deployment depends on a properly configured CRUSH map. In this session, we will review the Red Hat Ceph Storage architecture and explain the purpose of CRUSH. Using example CRUSH maps, we will show you what works and what does not, and explain why.
Presented at Red Hat Summit 2016-06-29.
The Google File System (GFS) presented in 2003 is the inspiration for the Hadoop Distributed File System (HDFS). Let's take a deep dive into GFS to better understand Hadoop.
Red Hat Storage Server Administration Deep DiveRed_Hat_Storage
"In this session for administrators of all skill levels, you’ll get a deep technical dive into Red Hat Storage Server and GlusterFS administration.
We’ll start with the basics of what scale-out storage is, and learn about the unique implementation of Red Hat Storage Server and its advantages over legacy and competing technologies. From the basic knowledge and design principles, we’ll move to a live start-to-finish demonstration. Your experience will include:
Building a cluster.
Allocating resources.
Creating and modifying volumes of different types.
Accessing data via multiple client protocols.
A resiliency demonstration.
Expanding and contracting volumes.
Implementing directory quotas.
Recovering from and preventing split-brain.
Asynchronous parallel geo-replication.
Behind-the-curtain views of configuration files and logs.
Extended attributes used by GlusterFS.
Performance tuning basics.
New and upcoming feature demonstrations.
Those new to the scale-out product will leave this session with the knowledge and confidence to set up their first Red Hat Storage Server environment. Experienced administrators will sharpen their skills and gain insights into the newest features. IT executives and managers will gain a valuable overview to help fuel the drive for next-generation infrastructures."
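Several of the demonstrated steps correspond to one-line gluster commands. A hedged sketch (volume, host, and path names are invented for illustration; replicated volumes need bricks added in multiples of the replica count):

```shell
# Expand an online volume with an additional brick, then rebalance
gluster volume add-brick demovol server5:/bricks/brick1/demovol
gluster volume rebalance demovol start
gluster volume rebalance demovol status

# Implement directory quotas
gluster volume quota demovol enable
gluster volume quota demovol limit-usage /projects 10GB
```

Rebalancing redistributes existing data across the enlarged brick set while clients stay mounted.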
"Data classification" is an umbrella term covering several related capabilities: locality-aware data placement; SSD/disk, normal/deduplicated, or erasure-coded data tiering; HSM; and so on. They share most of the same infrastructure, and so are proposed (for now) as a single feature.
Red Hat Storage, based on the upstream GlusterFS project, was developed as a distributed file system in the oil-and-gas, high-performance compute arena. With Red Hat Storage, you can easily set up flexible distributed storage using commodity x86 hardware.
Simple, inexpensive internal or JBOD storage can be linked across multiple physical servers and presented as a single storage namespace. This storage can be used for log files, web content, virtual machine, home directory, and other storage use cases.
In this session, we’ll demonstrate how to:
Install Red Hat Gluster Storage
Configure disks
Link the storage nodes
Define storage bricks
Present storage to clients
We'll talk about tips and tricks, best practices, backup and recovery, and other storage-related topics.
Amazon Elastic Block Store (Amazon EBS) provides persistent block level storage volumes for use with Amazon EC2 instances. In this technical session, we conduct a detailed analysis of the differences among the three types of Amazon EBS block storage: General Purpose (SSD), Provisioned IOPS (SSD), and Magnetic. We discuss how to maximize Amazon EBS performance, with a special eye towards low-latency, high-throughput applications like databases. We discuss Amazon EBS encryption and share best practices for Amazon EBS snapshot management. Throughout, we share tips for success.
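The General Purpose (SSD) type mentioned above scales its baseline IOPS with volume size. As a sketch of that sizing arithmetic (the 3 IOPS/GiB baseline with a 100 IOPS floor and a 10,000 IOPS ceiling reflect figures documented around the time of this talk and have changed since; treat them as assumptions):

```python
def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a General Purpose (SSD) volume:
    3 IOPS per GiB, with an assumed floor of 100 and ceiling of 10,000.
    (Smaller volumes could additionally burst via I/O credits.)"""
    return min(max(3 * size_gib, 100), 10_000)

print(gp2_baseline_iops(100))   # 100 GiB volume -> 300 IOPS baseline
print(gp2_baseline_iops(5000))  # large volume hits the ceiling
```

The practical consequence for databases is that provisioning a larger volume than strictly needed for capacity can be a cheap way to buy baseline IOPS.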
BlueStore: a new, faster storage backend for Ceph (Sage Weil)
Traditionally Ceph has made use of local file systems like XFS or btrfs to store its data. However, the mismatch between the OSD's requirements and the POSIX interface provided by kernel file systems has a huge performance cost and requires a lot of complexity. BlueStore, an entirely new OSD storage backend, utilizes block devices directly, doubling performance for most workloads. This talk will cover the motivation for a new backend, the design and implementation, the improved performance on HDDs, SSDs, and NVMe, and discuss some of the thornier issues we had to overcome when replacing tried and true kernel file systems with entirely new code running in userspace.
Data deduplication is a hot topic in storage and saves significant disk space for many environments, with some trade offs. We’ll discuss what deduplication is and where the Open Source solutions are versus commercial offerings. Presentation will lean towards the practical – where attendees can use it in their real world projects (what works, what doesn’t, should you use in production, etcetera).
Accelerating HBase with NVMe and bucket cache (David Grier)
This set of slides describes some initial experiments which we have designed for discovering improvements for performance in Hadoop technologies using NVMe technology
Modeling, estimating, and predicting Ceph (Linux Foundation - Vault 2015) (Lars Marowsky-Brée)
A presentation discussing various aspects that affect performance of Ceph clusters, and how to map, model, and predict their performance.
This lays the groundwork for building a Ceph cluster measurement and benchmark suite that eventually will build up a data corpus on performance characteristics that can be used to answer these key questions:
- How do I build a storage system that meets my requirements?
- If I build a system like this, what will its characteristics be?
- If I change X or Y in my existing system, how will its characteristics change?
Accelerating HBase with NVMe and Bucket Cache (Nicolas Poggi)
The Non-Volatile Memory express (NVMe) standard promises an order of magnitude faster storage than regular SSDs, while at the same time being more economical than regular RAM in TB/$. This talk evaluates the use cases and benefits of NVMe drives for use in Big Data clusters with HBase and Hadoop HDFS.
First, we benchmark the different drives using system level tools (FIO) to get maximum expected values for each different device type and set expectations. Second, we explore the different options and use cases of HBase storage and benchmark the different setups. And finally, we evaluate the speedups obtained by the NVMe technology for the different Big Data use cases from the YCSB benchmark.
In summary, while the NVMe drives show up to 8x speedup in best case scenarios, testing the cost-efficiency of new device technologies is not straightforward in Big Data, where we need to overcome system level caching to measure the maximum benefits.
PGConf APAC 2018 - High Performance JSON: PostgreSQL vs. MongoDB (PGConf APAC)
Speakers: Dominic Dwyer & Wei Shan Ang
This talk was presented at Percona Live Europe 2017. However, we did not have enough time to test against more scenarios. We will be giving an updated talk with more comprehensive tests and numbers. We hope to run it against CitusDB and MongoRocks as well to provide a comprehensive comparison.
https://www.percona.com/live/e17/sessions/high-performance-json-postgresql-vs-mongodb
How we built QuestDB Cloud, a Kubernetes-based SaaS around QuestDB... (javier ramirez)
QuestDB is a high-performance open source database. Many people told us they would like to use it as a service, without having to manage the machines themselves. So we got to work on a solution that would let us launch QuestDB instances with fully managed provisioning, monitoring, security, and upgrades.
Quite a few Kubernetes clusters later, we managed to launch our QuestDB Cloud offering. This talk is the story of how we got there. I will cover tools such as Calico, Karpenter, CoreDNS, Telegraf, Prometheus, Loki, and Grafana, but also challenges such as authentication, billing, and multi-cloud, and what you have to say no to in order to survive in the cloud.
Kafka on ZFS: Better Living Through Filesystems (confluent)
(Hugh O'Brien, Jet.com) Kafka Summit SF 2018
You’re doing disk IO wrong, let ZFS show you the way. ZFS on Linux is now stable. Say goodbye to JBOD, to directories in your reassignment plans, to unevenly used disks. Instead, have 8K Cloud IOPS for $25, SSD speed reads on spinning disks, in-kernel LZ4 compression and the smartest page cache on the planet. (Fear compactions no more!)
Learn how Jet’s Kafka clusters squeeze every drop of disk performance out of Azure, all completely transparent to Kafka.
-Striping cheap disks to maximize instance IOPS
-Block compression to reduce disk usage by ~80% (JSON data)
-Instance SSD as the secondary read cache (storing compressed data), eliminating >99% of disk reads and safe across host redeployments
-Upcoming features: Compressed blocks in memory, potentially quadrupling your page cache (RAM) for free
We’ll cover:
-Basic Principles
-Adapting ZFS for cloud instances (gotchas)
-Performance tuning for Kafka
-Benchmarks
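The themes above (striped cheap disks, LZ4 compression, SSD read cache) translate into a short ZFS setup. A sketch under stated assumptions: device names, the dataset layout, and the exact property values are illustrative, not Jet's actual configuration:

```shell
# Stripe inexpensive disks to pool their IOPS
zpool create kafka sdb sdc sdd sde

# One dataset for the Kafka log directories
zfs create kafka/logs
zfs set compression=lz4 kafka/logs   # in-kernel LZ4; large savings on JSON
zfs set recordsize=128k kafka/logs   # suit large sequential segment I/O
zfs set atime=off kafka/logs         # skip access-time updates on reads

# Instance SSD as a secondary (L2ARC) read cache
zpool add kafka cache nvme0n1
```

Because the L2ARC stores compressed blocks, the effective read cache is larger than the raw SSD size, which is part of how most disk reads get eliminated.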
This presentation provides a basic overview of Ceph, upon which SUSE Storage is based. It discusses the various factors and trade-offs that affect the performance and other functional and non-functional properties of a software-defined storage (SDS) environment.
Red Hat Gluster Storage Performance
1. Red Hat Gluster Storage performance
Manoj Pillai and Ben England
Performance Engineering
June 25, 2015
2. New or improved features (in last year)
Erasure Coding
Snapshots
NFS-Ganesha
RDMA
SSD support
3. Erasure Coding
“distributed software RAID”
● Alternative to RAID controllers or 3-way replication
● Cuts storage cost/TB, but computationally expensive
● Better Sequential Write performance for some workloads
● Roughly same sequential Read performance (depends on mountpoints)
● In RHGS 3.1 avoid Erasure Coding for pure-small-file or pure random I/O workloads
● Example use cases are archival, video capture
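The cost argument can be made concrete with a little arithmetic; this sketch compares raw capacity consumed per usable byte under EC4+2 versus 3-way replication:

```python
def raw_per_usable(data_fragments, parity_fragments, replicas=1):
    """Raw bytes stored for each usable byte under a given scheme."""
    return replicas * (data_fragments + parity_fragments) / data_fragments

ec42 = raw_per_usable(4, 2)               # EC4+2 -> 1.5x raw storage
rep3 = raw_per_usable(1, 0, replicas=3)   # 3-way replication -> 3.0x
print(f"EC4+2: {ec42}x raw per usable byte; replica-3: {rep3}x")
```

So erasure coding halves the raw capacity needed for the same usable space, at the price of the encode/decode CPU cost noted above.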
4. Disperse translator spreads EC stripes in file across hosts
Example: EC4+2
[Diagram: stripes 1..N of a file, each split into six fragments E[1]..E[6]; fragment E[k] of every stripe lands on brick k, with bricks 1-6 on servers 1-6]
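The diagrammed layout amounts to a byte-offset-to-brick mapping. A minimal sketch, assuming a fixed fragment size for illustration (Gluster's actual stripe unit differs):

```python
DATA = 4     # data fragments per stripe (EC4+2)
FRAG = 512   # assumed fragment size in bytes, for illustration only

def locate(offset):
    """Map a file byte offset to (stripe index, brick index, offset in fragment).
    Data fragment k of every stripe lands on brick k; the two redundancy
    fragments of each stripe occupy the remaining bricks."""
    stripe_size = DATA * FRAG
    stripe = offset // stripe_size
    within = offset % stripe_size
    return stripe, within // FRAG, within % FRAG
```

Consecutive stripes reuse the same brick ordering, so every brick holds a fragment of every stripe of the file.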
6. A tale of two mountpoints (per server)
[Chart: stacked CPU utilization by core (12 cores), comparing 1 mountpoint per server vs. 4 mountpoints per server; a single hot glusterfs thread limits one mountpoint's throughput]
7. A tale of two mountpoints
why the difference in CPU utilization?
8. SSDs as bricks
● Multi-thread-epoll = multiple threads working in each client mountpoint and server brick
● Can be helpful for SSDs or any other high-IOPS workload
● glusterfs-3.7 with a single 2-socket Sandy Bridge server using 1 SAS SSD (SanDisk Lightning)
9. RDMA enhancements
● Gluster has had RDMA in some form for a long time
● Gluster-3.6 added librdmacm support – broadens supported hardware
● By Gluster 3.7, memory pre-registration reduces latency
10. JBOD Support
● RHGS has traditionally used H/W RAID for brick storage, with replica-2 protection.
● RHGS 3.0.4 has JBOD+replica-3 support.
● H/W RAID problems:
– Proprietary interfaces for managing h/w RAID
– Performance impact with many concurrent streams
● JBOD+replica-3 shortcomings:
– Each file on one disk, low throughput for serial workloads
– Large number of bricks in the volume; problematic for some workloads
● JBOD+replica-3 expands the set of workloads that RHGS can handle well
– Best for highly-concurrent, large-file read workloads
12. NFS-Ganesha
● NFSv3 access to RHGS volumes supported so far with gluster native NFS server
● NFS-Ganesha integration with FSAL-gluster expands supported access protocols
– NFSv3 – has been in Technology Preview
– NFSv4, NFSv4.1, pNFS
● Access path uses libgfapi, avoids FUSE
14. Snapshots
● Based on device-mapper thin-provisioned snapshots
– Simplified space management for snapshots
– Allow large number of snapshots without performance degradation
● Required change from traditional LV to thin LV for RHGS brick storage
– Performance impact? Typically 10-15% for large-file sequential read, as a result of fragmentation
● Snapshot performance impact
– Mainly due to writes to “shared” blocks, copy-on-write triggered on first write to a
region after snapshot
– Independent of number of snapshots in existence
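The thin-LV brick layout the snapshots require can be sketched as follows (VG, LV, and pool names plus sizes are placeholders; the XFS inode-size flag follows common RHGS brick practice):

```shell
# Thin pool, then a thin LV carved from it for the brick
lvcreate -L 1T -T rhgs_vg/brickpool
lvcreate -V 900G -T rhgs_vg/brickpool -n brick1lv
mkfs.xfs -i size=512 /dev/rhgs_vg/brick1lv
mount /dev/rhgs_vg/brick1lv /bricks/brick1

# Once every brick of the volume sits on a thin LV:
gluster snapshot create snap1 myvol
```

The first write to a region after a snapshot pays the copy-on-write cost described above; subsequent writes to the same region do not.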
15. Improved rebalancing
● Rebalancing lets you add/remove hardware from an online Gluster volume
● Important for scalability, redeployment of hardware resources
● Existing algorithm had shortcomings
– Did not work well for small files
– Was not parallel enough
– No throttle
● New algorithm solves these problems
– Executes in parallel on all bricks
– Gives you control over number of concurrent I/O requests/brick
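The typical rebalance workflow after adding hardware, as a sketch (volume, host, and brick names are placeholders; the throttle option is the glusterfs-3.7-era `cluster.rebal-throttle`):

```shell
# Grow the volume, then spread existing data onto the new brick
gluster volume add-brick myvol server7:/bricks/b1
gluster volume rebalance myvol start
gluster volume rebalance myvol status

# Throttle concurrent migration I/O per brick: lazy | normal | aggressive
gluster volume set myvol cluster.rebal-throttle lazy
```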
17. Configurations to avoid with Gluster (today)
● Super-large RAID volumes (e.g. RAID60)
– example: RAID60 with 2 striped RAID6 12-disk components
– single glusterfsd process serving a large number of disks
– recommend separate RAID LUNs instead
● JBOD configuration with very large server count
– Gluster directories are still spread across every brick
– with JBOD, that means every disk!
– 64 servers x 36 disks/server = ~2300 bricks
– recommendation: use RAID6 bricks of 12 disks each
– even then, 64x3 = 192 bricks, still not ideal for anything but large files
18. Test methodology
● How well does RHGS work for your use-case?
● Some benchmarking tools:
– Use tools with a distributed mode, so multiple clients can put load on servers
– Iozone (large-file sequential workloads), the smallfile benchmark, and fio (better than iozone for random I/O testing)
● Beyond micro-benchmarking
– SPECsfs2014 provides approximation to some real-life workloads
– Being used internally
– Requires license
● SPECsfs2014 provides mixed-workload generation in different flavors
● VDA (video data acquisition), VDI (virtual desktop infrastructure), SWBUILD (software build)
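Representative invocations of the micro-benchmarks named above, as a sketch (mountpoints, client list, and sizes are placeholders to adapt to your cluster):

```shell
# Distributed iozone: -+m takes a client config file listing
# "<client> <workdir> <iozone-path>" per line; sequential write (-i 0) + read (-i 1)
iozone -+m clients.cfg -t 16 -s 4g -r 1024k -i 0 -i 1 -c -e

# fio for random I/O against a glusterfs mountpoint
fio --name=randrw --directory=/mnt/gluster --rw=randrw --bs=4k \
    --size=1g --numjobs=8 --ioengine=libaio --direct=1 --group_reporting
```

Run the load generators from several clients at once; a single client usually cannot saturate the servers, which skews results.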
19. Application filesystem usage patterns to avoid with Gluster
● Single-threaded application – one-file-at-a-time processing
– uses only a small fraction (1 DHT subvolume) of Gluster hardware
● Tiny files – cheap on local filesystems, expensive on distributed filesystems
● Small directories
– creation/deletion/read/rename/metadata-change cost x brick count!
– large file:directory ratio not bad as of glusterfs-3.7
● Using repeated directory scanning to synchronize processes on different clients
– Gluster 3.6 (RHS 3.0.4) does not yet invalidate metadata cache on clients
20. Initial Data ingest
● Problem: applications often have existing data that must be loaded into the Gluster volume
● Typical methods are excruciatingly slow
– example: single mountpoint, rsync -ravu
● Solutions:
– for large files on glusterfs, use the largest transfer size
– copy multiple subdirectories in parallel
– multiple mountpoints per client
– multiple clients
– mount option "gid-timeout=5"
– for glusterfs, increase client.event-threads to 8
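Several of the solutions above combine into a simple parallel-ingest pattern; a sketch (source path, volume, and parallelism are placeholders):

```shell
# Mount with the gid-timeout option recommended above
mount -t glusterfs -o gid-timeout=5 server1:/myvol /mnt/gluster

# Raise client-side epoll threads for the ingest
gluster volume set myvol client.event-threads 8

# One rsync per top-level subdirectory, 8 running at a time
ls /data/src | xargs -n1 -P8 -I{} rsync -ra /data/src/{} /mnt/gluster/
```

Using multiple mountpoints (and multiple clients) multiplies this further, since one FUSE mountpoint is limited by its hot glusterfs thread, as shown on slides 6-8.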
21. SSDs as bricks
● Avoid use of storage controller writeback (WB) cache
● Use a separate Gluster volume for SSD bricks
● Check “top -H”, look for hot glusterfsd threads on server with SSDs
● Gluster tuning for SSDs: server.event-threads > 2
● SAS SSD:
– Sequential I/O: relatively low sequential write transfer rate
– Random I/O: avoids seek overhead, good IOPS
– Scaling: more SAS slots => greater TB/host, high aggregate IOPS
● PCI:
– Sequential I/O: much higher transfer rate due to a shorter data path
– Random I/O: lowest latency yields highest IOPS
– Scaling: more expensive, aggregate IOPS limited by PCI slots
22. High-speed networking > 10 Gbps
● RDMA is not needed for a 10-Gbps network; it pays off at >= 40 Gbps
● IPoIB as an InfiniBand alternative to RDMA:
– Jumbo Frames (MTU=65520) – all switches must support them
– "connected mode"
– TCP will get you to about ½–¾ of 40-Gbps line speed
● 10-GbE bonding – see the gluster.org how-to
– default bonding mode 0 – don't use it
– best modes are 2 (balance-xor), 4 (802.3ad), 6 (balance-alb)
● FUSE (glusterfs mountpoints)
– No 40-Gbps line speed from one mountpoint
– Servers don't run FUSE => best with multiple clients/server
● NFS+SMB servers use libgfapi, no FUSE overhead
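The IPoIB settings above map to two commands on each host; a sketch assuming the IPoIB interface is ib0 (adjust for your hardware, and persist via your distribution's network config):

```shell
# Switch the IPoIB interface from datagram to connected mode,
# then raise the MTU to the 65520 jumbo-frame size cited above
echo connected > /sys/class/net/ib0/mode
ip link set ib0 mtu 65520
```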
27. Bitrot detection – in glusterfs-3.7 = RHS 3.1
● Provides greater durability for Gluster data (JBOD)
● Protects against silent loss of data
● Requires signature on replica recording original checksum
● Requires periodic scan to verify data still matches checksum
● Need more data on cost of the scan
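Bitrot detection is enabled per volume; a sketch of the glusterfs-3.7 commands (volume name is a placeholder):

```shell
# Turn on signing + scrubbing for the volume
gluster volume bitrot myvol enable
# Control how often the periodic verification scan runs
gluster volume bitrot myvol scrub-frequency weekly
# Limit the scan's I/O impact
gluster volume bitrot myvol scrub-throttle lazy
```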
28. A tale of two mountpoints - sequential write performance
And the result... [chart: sequential write throughput, 1 vs. 4 mountpoints per server – not reproduced here]
29. Balancing storage and networking performance
● Based on workload
– Transactional or small-file workloads
● don't need > 10 Gbps
● Need lots of IOPS (e.g. SSD)
– Large-file sequential workloads (e.g. video capture)
● Don't need so many IOPS
● Need network bandwidth
– When in doubt, add more networking, cost < storage
30. Cache tiering
● Goal: performance of SSD with the cost/TB of spinning rust
● Savings from Erasure Coding can pay for the SSDs!
● Definition: a Gluster tiered volume consists of two subvolumes:
– "hot" tier: low-capacity, high-performance sub-volume
– "cold" tier: high-capacity, low-performance sub-volume
– promotion policy: migrates data from cold tier to hot tier
– demotion policy: migrates data from hot tier to cold tier
– new files are written to the hot tier initially, unless the hot tier is full
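Operationally, the hot tier is attached to an existing (cold) volume; a sketch using the glusterfs-3.7-era attach-tier syntax (hosts and brick paths are placeholders, and the exact CLI varied across releases):

```shell
# Bolt a replica-2 SSD hot tier onto an existing volume
gluster volume attach-tier myvol replica 2 \
  ssd-server1:/bricks/ssd1 ssd-server2:/bricks/ssd1

# Later, drain and remove the hot tier
gluster volume detach-tier myvol start
```

This pairs naturally with an erasure-coded cold tier, matching the cost argument above.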
31. Performance enhancements
unless otherwise stated, UNDER CONSIDERATION, NOT IMPLEMENTED
● Lookup-unhashed=auto in Glusterfs 3.7 today, in RHGS 3.1 soon
– Eliminates LOOKUP per brick during file creation, etc.
● JBOD support – Glusterfs 4.0 – DHT V2 intended to eliminate the spread of directories across all bricks
● Sharding – spread a file across more bricks (like Ceph, HDFS)
● Erasure Coding – Intel instruction support, symmetric encoding, bigger chunk size
● Parallel utilities – examples are parallel-untar.py and parallel-rm-rf.py
● Better client-side caching – cache invalidation starting in glusterfs-3.7
● YOU CAN HELP DECIDE! Express interest and opinion on this