Steffen Christgau, Supercomputing dept., Zuse Institute Berlin
Tobias Watermann, Supercomputing dept., Zuse Institute Berlin
Thomas Steinke, Supercomputing dept., Zuse Institute Berlin
DAOS User Group event, November 2020.
Flying Circus Ceph Case Study (CEPH Usergroup Berlin), Christian Theune
This document summarizes a company's experience deploying the CEPH storage system in a production environment. They migrated about 50% of their storage to CEPH running on existing hardware with 6 object storage devices and 5 monitors. CEPH eliminated their single point of failure, allowed easy creation and destruction of virtual machine images, and provided self-healing capabilities. Some current issues include balancing bandwidth and latency for replicas and deciding how to allocate placement groups for different situations. Overall, the company is pleased with CEPH for providing the virtualized storage capabilities they desired.
Ceph Day London 2014 - Deploying ceph in the wild, Ceph Community
The document discusses Wido den Hollander's experience deploying Ceph storage clusters in various organizations. It describes two example deployments: one using Ceph block storage with CloudStack for 1000 VMs and S3 object storage, using SSDs and HDDs optimized for IOPS; the other using Ceph block storage with OCFS2 shared filesystem between web servers, using all SSDs and 10GbE networking for low latency. The document emphasizes designing for high IOPS rather than just capacity, and discusses best practices like regular hardware updates and testing recovery scenarios.
This document summarizes Dan van der Ster's experience scaling Ceph at CERN. CERN uses Ceph as the backend storage for OpenStack volumes and images, with plans to also use it for physics data archival and analysis. The 3PB Ceph cluster consists of 47 disk servers and 1,128 OSDs. Some lessons learned include managing latency, handling many objects, tuning CRUSH, trusting clients, and avoiding human errors when managing such a large cluster.
The document provides updates on the Ceph community and projects. It discusses that Ceph Day Melbourne 2015 was community-focused and planning has begun for 2016 events. It also summarizes that a Ceph hackathon was hosted by Intel, metrics platforms are being used and evolved to track community growth, and a user committee discusses community matters. Additional community initiatives discussed include Google Summer of Code, the CentOS Storage SIG, governance plans, Ceph tech talks, and the Ceph developer summit.
1 Million Writes per second on 60 nodes with Cassandra and EBS, Jim Plush
This document summarizes a presentation about achieving 1 million writes per second on Cassandra using 60 nodes with Amazon EBS storage. It discusses how CrowdStrike tested Cassandra on EBS and achieved higher performance than anticipated by optimizing configurations, using larger instance types with more CPUs and storage, and tweaking Cassandra settings. It provides examples of testing methodology and results showing high throughput and low latency even with large volumes of data and I/O.
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C..., Odinot Stanislas
After a short introduction to distributed storage and a description of Ceph, Jian Zhang presents some interesting benchmarks in this talk: sequential tests, random tests, and above all a comparison of results before and after optimization. The configuration parameters touched and the optimizations applied (large page numbers, OMAP data on a separate disk, ...) deliver at least a 2x performance improvement.
This document summarizes a presentation about Ceph and OpenStack. The presentation introduces Ceph as software that provides object, block, and file storage. It discusses how Ceph can run on commodity hardware and scale to exabyte sizes. The presentation also covers how Ceph integrates well with OpenStack for tasks like live instance migration between compute nodes with shared Ceph storage. It outlines upcoming improvements to Ceph in areas like geo-replication, erasure coding, and tiering. The document encourages community involvement through online summits, Ceph Days events, IRC channels, and mailing lists.
Walk Through a Software Defined Everything PoC, Ceph Community
This document summarizes a proof of concept for a software defined data center using OpenStack and Midokura MidoNet software defined networking. The POC used 4 controllers, 8 Ceph storage nodes, and 16 compute nodes with Midokura providing logical layer 2-4 networking services. Key lessons learned included planning the underlay network configuration, optimizing Zookeeper connections, and improving OpenStack deployment processes which can be complex. Performance testing showed Ceph throughput was higher for reads than writes and SSD journaling improved IOPS. The streamlined workflow provided by the software defined infrastructure could help reduce costs and management complexity for organizations.
Ceph Day London 2014 - Best Practices for Ceph-powered Implementations of Sto..., Ceph Community
This document discusses Dell's support for CEPH storage solutions and provides an agenda for a CEPH Day event at Dell. Key points include:
- Dell is a certified reseller of Red Hat-Inktank CEPH support, services, and training.
- The agenda covers why Dell supports CEPH, hardware recommendations, best practices shared with CEPH colleagues, and a concept for research data storage that is seeking input.
- Recommended CEPH architectures, components, configurations, and considerations are discussed for planning and implementing a CEPH solution. Dell server hardware options that could be used are also presented.
OpenStack and Ceph case study at the University of Alabama, Kamesh Pemmaraju
The University of Alabama at Birmingham gives scientists and researchers a massive, on-demand, virtual storage cloud using OpenStack and Ceph for less than $0.41 per gigabyte. This is a session at the OpenStack Summit given by Kamesh Pemmaraju of Dell and John Paul of the University of Alabama. It details how the university IT staff deployed a private storage cloud infrastructure using the Dell OpenStack cloud solution with Dell servers, storage, networking and OpenStack, and Inktank Ceph. After assessing a number of traditional storage scenarios, the university partnered with Dell and Inktank to architect a centralized cloud storage platform that could scale seamlessly and rapidly, was cost-effective, and could leverage a single hardware infrastructure for the OpenStack compute and storage environment.
Ceph Day Chicago - Ceph Deployment at Target: Best Practices and Lessons Learned, Ceph Community
This document summarizes lessons learned from Target's initial Ceph deployment and subsequent improvements. The initial deployment suffered from poor performance due to using unreliable SATA drives without caching. Instrumentation would have revealed issues sooner. The redesigned deployment used SSD journals and improved hardware, increasing performance 10x. Key lessons are to understand objectives, select suitable hardware, monitor metrics, and not assume Ceph can overcome poor hardware choices. Future work includes all-SSD testing and automating deployments.
Linux Block Cache Practice on Ceph BlueStore - Junxin Zhang, Ceph Community
This document discusses using Linux block caching with Ceph BlueStore. It explains that BlueStore can better utilize fast storage devices like SSDs compared to FileStore. It tested using Bcache and DM-writeboost to cache BlueStore data on HDDs using SSDs. Bcache performed better overall. Issues found were slow requests when caching and BlueStore used the same SSD, and inconsistency in SSD data management between BlueStore and the cache. Future work could have BlueStore control all raw disks and prioritize data saving to fast devices.
Ceph Day Shanghai - CeTune - Benchmarking and tuning your Ceph cluster, Ceph Community
CeTune is an open source toolkit for deploying, benchmarking, analyzing, and tuning Ceph clusters. It provides a web UI for configuring Ceph deployments and benchmarks, and generates reports with performance metrics and system analysis. CeTune aims to simplify tuning Ceph for beginners and engineers, and provide an easy way for developers to stress test and analyze performance. It includes modules for deployment, benchmarking with tools like FIO and Cosbench, system analysis with tools like iostat and sar, and tuning Ceph configurations.
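The kind of workload such a benchmark run drives can be sketched as a plain fio job file. This is an illustrative example only, not a configuration taken from CeTune itself; the pool and image names are placeholders:

```ini
; Illustrative fio job: 4K random writes against an RBD image,
; similar to what a CeTune benchmark run might generate.
; Pool and image names below are placeholders.
[global]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=bench-image
direct=1
runtime=300
time_based=1

[randwrite-4k]
rw=randwrite
bs=4k
iodepth=32
```

Running `fio` against this file exercises the cluster through librbd directly, without mounting a block device on the client.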
Ceph Day Melbourne - Walk Through a Software Defined Everything PoC, Ceph Community
This document summarizes a proof of concept for a software defined everything architecture using OpenStack and Midokura software defined networking. The objectives are to abstract compute, network and storage resources and dynamically allocate them as needed. It aims to provide scalability, availability, isolation, metering and automation of infrastructure. The proof of concept uses OpenStack for compute with Ceph for storage, and Midokura for software defined networking to provide logical switching, routing, firewalling and load balancing. It details the architecture, infrastructure, configuration and lessons learned from the implementation.
Ceph is an open source, software-defined storage system and, I would say, the only storage backend truly suited to cloud storage. Ceph is the future of storage. In this presentation I briefly explain Ceph and OpenStack; you will definitely enjoy it.
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez, Ceph Community
This document discusses Supermicro's portfolio of scale-out optimized storage nodes and Ceph-ready hardware solutions. It presents several models of storage nodes that support high density and ultra dense storage and are optimized for the Red Hat Ceph storage platform. The document also covers Supermicro's modular LAN switching I/O modules that provide flexible networking connectivity including 10GbE, 25GbE, and InfiniBand options.
Cephalocon APAC 2018
March 22-23, 2018 - Beijing, China
Lars Marowsky-Brée, SUSE Distinguished Engineer, Ceph Advisory Board member
Marc Koderer, SAP OpenStack Evangelist
This document summarizes a presentation about tuning MySQL performance on Ceph block storage. The presentation covers Ceph architecture, tuning Ceph block devices, and tuning QEMU block virtualization. It then shows benchmarks comparing different configurations for reads, writes, and a 70/30 read/write mix using Sysbench OLTP workloads. Configurations tested include QEMU backends, caching modes, I/O threading, virtio types and queues, and containers versus metal. The goal is to understand how to optimize MySQL on Ceph block devices.
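The QEMU-side knobs such benchmarks compare (cache mode, I/O threading, virtio type) surface in the libvirt disk definition of the guest. A sketch of an RBD-backed disk follows; the monitor host, pool, and image names are placeholders, and the `cache=` and `io=` attributes are the settings typically varied:

```xml
<!-- Illustrative libvirt disk definition for an RBD-backed volume.
     cache= and io= are the attributes varied in such benchmarks;
     monitor host, pool, and image names are placeholders. -->
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback' io='native'/>
  <source protocol='rbd' name='rbd/mysql-data'>
    <host name='mon1.example.com' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
```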
New use cases for Ceph, beyond OpenStack, Luis Rico, Ceph Community
Luis Rico presented on new use cases for Ceph storage beyond OpenStack. Ceph can now be used for object storage with improved S3 and Swift compatibility, multi-site replication capabilities across data centers, and as elastic storage for big data workloads through compatibility with the Hadoop S3A filesystem client. Red Hat Ceph Storage provides a scalable, software-defined storage platform that is increasingly being used for new applications beyond its traditional role in OpenStack environments.
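Pointing the Hadoop S3A client at a Ceph RADOS Gateway comes down to a few core-site.xml properties. The endpoint and credentials below are placeholders, not values from the talk:

```xml
<!-- Illustrative core-site.xml fragment: Hadoop S3A against a Ceph RGW.
     Endpoint and credentials are placeholders. -->
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://rgw.example.com:8080</value>
  </property>
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>SECRET_KEY</value>
  </property>
</configuration>
```

With these in place, jobs can read and write `s3a://bucket/path` URIs backed by Ceph object storage.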
LinkedIn leverages the Apache Hadoop ecosystem for its big data analytics. Steady growth of the member base at LinkedIn along with their social activities results in exponential growth of the analytics infrastructure. Innovations in analytics tooling lead to heavier workloads on the clusters, which generate more data, which in turn encourage innovations in tooling and more workloads. Thus, the infrastructure remains under constant growth pressure. Heterogeneous environments embodied via a variety of hardware and diverse workloads make the task even more challenging.
This talk will tell the story of how we doubled our Hadoop infrastructure twice in the past two years.
• We will outline our main use cases and historical rates of cluster growth in multiple dimensions.
• We will focus on optimizations, configuration improvements, performance monitoring and architectural decisions we undertook to allow the infrastructure to keep pace with business needs.
• The topics include improvements in HDFS NameNode performance, and fine tuning of block report processing, the block balancer, and the namespace checkpointer.
• We will reveal a study on the optimal storage device for HDFS persistent journals (SATA vs. SAS vs. SSD vs. RAID).
• We will also describe the Satellite Cluster project, which allowed us to double the number of objects stored on one logical cluster by splitting an HDFS cluster into two partitions, without the use of federation and with practically no code changes.
• Finally, we will take a peek at our future goals, requirements, and growth perspectives.
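The NameNode tuning areas listed above map to standard hdfs-site.xml properties. The values below are illustrative examples only, not LinkedIn's actual settings:

```xml
<!-- Illustrative hdfs-site.xml fragment; values are examples. -->
<configuration>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>200</value>  <!-- RPC handler threads on the NameNode -->
  </property>
  <property>
    <name>dfs.blockreport.intervalMsec</name>
    <value>21600000</value>  <!-- full block reports every 6 hours -->
  </property>
  <property>
    <name>dfs.datanode.balance.bandwidthPerSec</name>
    <value>104857600</value>  <!-- cap balancer traffic at 100 MB/s per DataNode -->
  </property>
</configuration>
```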
SPEAKERS
Konstantin Shvachko, Sr Staff Software Engineer, LinkedIn
Erik Krogen, Senior Software Engineer, LinkedIn
This document discusses Linux server provisioning using Stacki. Stacki is a tool that automates the provisioning of Linux servers at scale from bare metal to a fully configured system. It addresses the exponential complexity of managing large clusters as more servers are added. Stacki handles all aspects of server provisioning from OS installation to configuration of networking, storage, software and more. It provides a fully automated, repeatable process to quickly deploy and manage servers.
Apache Bigtop: a crash course in deploying a Hadoop bigdata management platform, rhatr
A long time ago in a galaxy far, far away, only the chosen few could deploy and operate a fully functional Hadoop cluster. Vendors took pride in rationalizing this experience for their customers by creating various distributions of Apache Hadoop. It all changed when Cloudera decided to support Apache Bigtop as the first 100% community-driven bigdata management distribution of Apache Hadoop. Today, most major commercial distributions of Apache Hadoop are based on Bigtop. Bigtop has won the Hadoop distribution war and offers a superset of packaged components. In this talk we will focus on practical advice on how to deploy and start operating a Hadoop cluster using Bigtop's packages and deployment code. We will dive into the details of using the Hadoop ecosystem packages provided by Bigtop and how to build data management pipelines in support of your enterprise applications.
Latest (storage IO) patterns for cloud-native applications, OpenEBS
Applying micro service patterns to storage giving each workload its own Container Attached Storage (CAS) system. This puts the DevOps persona within full control of the storage requirements and brings data agility to k8s persistent workloads. We will go over the concept and the implementation of CAS, as well as its orchestration.
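In practice, a workload claims its own CAS-backed volume through an ordinary Kubernetes PersistentVolumeClaim. A minimal sketch, assuming an OpenEBS StorageClass named `openebs-hostpath` is installed in the cluster:

```yaml
# Illustrative PVC bound to an OpenEBS StorageClass, giving this
# workload its own dedicated Container Attached Storage engine.
# The StorageClass name depends on which OpenEBS engine is installed.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  storageClassName: openebs-hostpath
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

A pod then mounts `app-data` like any other claim, while the storage engine serving it runs in containers alongside the workload.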
Presentation of ActiveState's micro-cloud solution Stackato at Open Source Days 2012.
Stackato is a cloud solution from renowned ActiveState. It is based on the Open Source CloudFoundry and offers a serious cloud solution for Perl programmers, but also supports Python, Ruby, Node.js, PHP, Clojure and Java.
Stackato is very strong in the private PaaS area, but it also supports public PaaS use and deployment onto Amazon's EC2.
The presentation will cover basic use of Stackato and the reasons for using a PaaS, public as well as private. Stackato can also be used as a micro-cloud for developers, supporting vSphere, VMware Fusion, Parallels and VirtualBox.
Stackato is currently in public beta, but it is already quite impressive in both features and tools. Stackato is not Open Source, but CloudFoundry is and Stackato offers a magnificent platform for deployment of Open Source projects, sites and services.
ActiveState has committed to keeping the micro-cloud solution free, so it offers an exciting capability and extension to the developer's toolbox and toolchain.
More information available at: https://logiclab.jira.com/wiki/display/OPEN/Stackato
This document discusses approaches for improving Django performance. It notes that front-end performance issues typically account for 80-90% of response time and recommends caching static assets, bundling/minifying assets, and using a CDN. For back-end issues, it recommends profiling views to identify SQL or Python bottlenecks and provides techniques like select_related, prefetch_related, and caching to address different problem areas. The key message is that performance work requires understanding where time is actually being spent before applying optimizations.
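The N+1 query problem that `select_related` and `prefetch_related` address can be illustrated with a toy, self-contained sketch. The `FakeDB` class and its method names below are hypothetical stand-ins for an ORM, not Django's API; the point is only the query-count arithmetic:

```python
# Toy illustration of the N+1 query problem. FakeDB is a hypothetical
# in-memory "database" that counts how many queries it receives.
class FakeDB:
    def __init__(self):
        self.queries = 0
        self.authors = {1: "Ann", 2: "Ben"}
        # (book_id, title, author_id)
        self.books = [(1, "A Tale", 1), (2, "B Side", 2), (3, "C Minor", 1)]

    def fetch_books(self):
        self.queries += 1  # one query for the book list
        return list(self.books)

    def fetch_author(self, author_id):
        self.queries += 1  # one extra query per book: the "N" in N+1
        return self.authors[author_id]

    def fetch_books_with_authors(self):
        self.queries += 1  # single joined query, like select_related
        return [(bid, title, self.authors[aid]) for bid, title, aid in self.books]

db = FakeDB()
# Naive access pattern: 1 query for books + 1 per author lookup = 4 queries.
rows = [(title, db.fetch_author(aid)) for _, title, aid in db.fetch_books()]
naive = db.queries

db.queries = 0
# Joined access pattern: everything in 1 query.
rows2 = [(title, author) for _, title, author in db.fetch_books_with_authors()]
joined = db.queries
```

The same results come back either way; only the query count differs, which is exactly what profiling a slow view tends to reveal.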
Apache Hadoop 3.x state of the union and upgrade guidance - Strata 2019 NY, Wangda Tan
The document discusses Apache Hadoop 3.x updates and provides guidance for upgrading to Hadoop 3. It covers community updates and features in YARN, Submarine, HDFS, and Ozone. Release plans are outlined for Hadoop and Submarine, along with upgrade paths from Hadoop 2 to 3. Express upgrades are recommended over rolling upgrades for the major version change. The session concludes that Hadoop 3 is an eagerly awaited release with many successful production deployments, and that now is a good time to upgrade for those who have not yet done so.
This document provides an overview of Capital One's plans to introduce Hadoop and discusses several proof of concepts (POCs) that could be developed. It summarizes the history and practices of using Hadoop at other companies like LinkedIn, Netflix, and Yahoo. It then outlines possible POCs for Hadoop distributions, ETL/analytics frameworks, performance testing, and developing a scaling layer. The goal is to contribute open source code and help with Capital One's transition to using Hadoop in production.
Galaxy Big Data with MariaDB 10 by Bernard Garros, Sandrine Chirokoff and Stéphane Varoqui.
Presented 26.6.2014 at the MariaDB Roadshow in Paris, France.
This document provides an overview of a roundtable discussion on real-time analytics with Hadoop. It discusses the requirements for real-time data, applications, and queries. For real-time data, logs and operational data need to be written directly into the cluster. For applications, operational applications need to run in the cluster to avoid delays. For queries, analysts need to query data as soon as it lands without waiting. It also discusses how MapR addresses these requirements through features like NFS access, low-latency database access, and table replication. The presentation concludes with a discussion of ensuring security, reliability, and other enterprise capabilities for real-time analytics.
20150704 benchmark and user experience in sahara weiting, Wei Ting Chen
Sahara provides a way to deploy and manage Hadoop clusters within an OpenStack cloud. It addresses common customer needs like providing an elastic environment for data processing jobs, integrating Hadoop with the existing private cloud infrastructure, and reducing costs. Key challenges include speeding up cluster provisioning times, supporting complex data workflows, optimizing storage architectures, and improving performance when using remote object storage.
Leveraging Docker for Hadoop build automation and Big Data stack provisioningDataWorks Summit
Apache Bigtop as an open source Hadoop distribution, focuses on developing packaging, testing and deployment solutions that help infrastructure engineers to build up their own customized big data platform as easy as possible. However, packages deployed in production require a solid CI testing framework to ensure its quality. Numbers of Hadoop component must be ensured to work perfectly together as well. In this presentation, we'll talk about how Bigtop deliver its containerized CI framework which can be directly replicated by Bigtop users. The core revolution here are the newly developed Docker Provisioner that leveraged Docker for Hadoop deployment and Docker Sandbox for developer to quickly start a big data stack. The content of this talk includes the containerized CI framework, technical detail of Docker Provisioner and Docker Sandbox, a hierarchy of docker images we designed, and several components we developed such as Bigtop Toolchain to achieve build automation.
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Community
This document discusses best practices for implementing Ceph-powered storage as a service. It covers planning a Ceph implementation based on business and technical requirements. Various use cases for Ceph are described, including OpenStack, cloud storage, web-scale applications, high performance block storage, archive/cold storage, databases and Hadoop. Architectural considerations for redundancy, servers, networking are also discussed. The document concludes with a case study of a university implementing a Ceph-based storage cloud to address storage needs for cancer and genomic research data.
New Ceph capabilities and Reference ArchitecturesKamesh Pemmaraju
Have you heard about Inktank Ceph and are interested to learn some tips and tricks for getting started quickly and efficiently with Ceph? Then this is the session for you!
In this two part session you learn details of:
• the very latest enhancements and capabilities delivered in Inktank Ceph Enterprise such as a new erasure coded storage back-end, support for tiering, and the introduction of user quotas.
• best practices, lessons learned and architecture considerations founded in real customer deployments of Dell and Inktank Ceph solutions that will help accelerate your Ceph deployment.
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Red_Hat_Storage
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About? By: Kamesh Pemmaraju,Neil Levine
Have you heard about Inktank Ceph and are interested to learn some tips and tricks for getting started quickly and efficiently with Ceph? Then this is the session for you! In this two part session you learn details of: • the very latest enhancements and capabilities delivered in Inktank Ceph Enterprise such as a new erasure coded storage back-end, support for tiering, and the introduction of user quotas. • best practices, lessons learned and architecture considerations founded in real customer deployments of Dell and Inktank Ceph solutions that will help accelerate your Ceph deployment.
The document summarizes the results of a study that evaluated the performance of different Platform-as-a-Service offerings for running SQL on Hadoop workloads. The study tested Amazon EMR, Google Cloud DataProc, Microsoft Azure HDInsight, and Rackspace Cloud Big Data using the TPC-H benchmark at various data sizes up to 1 terabyte. It found that at 1TB, lower-end systems had poorer performance. In general, HDInsight running on D4 instances and Rackspace Cloud Big Data on dedicated hardware had the best scalability and execution times. The study provides insights into the performance, scalability, and price-performance of running SQL on Hadoop in the cloud.
The document discusses using data virtualization and masking to optimize database migrations to the cloud. It notes that traditional copying of data is inefficient for large environments and can incur high data transfer costs in the cloud. Using data virtualization allows creating virtual copies of production databases that only require a small storage footprint. Masking sensitive data before migrating non-production databases ensures security while reducing costs. Overall, data virtualization and masking enable simpler, more secure, and cost-effective migrations to cloud environments.
Similar to DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed (20)
HPE plans to deliver a DAOS storage solution targeting version 2.0 to enable initial customer deployments. The reference implementation will include HPE servers, Intel Ice Lake CPUs with DCPMM and NVMe SSDs, Mellanox or HPE Slingshot switches, and customized Cray management software. HPE will host potential customers in their CTO lab to run proofs of concept and collect feedback to inform full productization planned with Sapphire Rapids.
DUG'20: 11 - Platform Performance Evolution from bring-up to reaching link sa...Andrey Kudryavtsev
1) The document discusses the performance evolution of a reference storage platform over time as DAOS software improved from version 0.8 to 1.0.
2) Bandwidth and IOPS measurements increased significantly with each DAOS update as well as when using dual socket CPUs in DAOS 1.0.
3) Read latency times improved in DAOS 1.0, showing Optane-like write latencies and NAND-like read latencies from data destaged to QLC SSDs.
DUG'20: 10 - Storage Orchestration for Composable Storage ArchitecturesAndrey Kudryavtsev
RSC's BasIS storage orchestration platform addresses complications with deploying DAOS storage. It simplifies DAOS deployment by dynamically composing DAOS clusters from servers' NVMe and PMEM resources over a fabric. This composable disaggregated approach provides flexibility to use PMEM nodes for different roles like DAOS or databases. The orchestration significantly improves on DAOS by making it deployable on existing heterogeneous servers and suitable for cloud environments. Performance tests show NVMe-over-Fabric with the orchestrator achieves similar throughput to local NVMe drives.
The document provides an overview of updates to the DAOS middleware:
1. It discusses different options for accessing DAOS storage including directly through the DAOS API, through a POSIX interface using dfuse, or through an interception library for performance.
2. It describes two consistency modes for the distributed file system - a relaxed mode prioritizing performance and a more balanced mode for stricter consistency.
3. An MPI-IO driver has been added to allow applications to use the DAOS object store through MPI file interfaces without changes.
4. An HDF5 connector has been developed to enable HDF5 files on top of DAOS, now compatible with the main HDF5 release and supporting various
1) The document discusses mapping seismic data stored in the SEG-Y format to the DAOS object storage system to improve storage and processing efficiency.
2) Currently, seismic processing copies data for each step, but DAOS snapshots and versioning can reduce copies by storing only updates.
3) The DAOS-SEIS mapping represents seismic data as a graph with trace headers and data mapped to DAOS objects to allow random access and filtering.
4) Early benchmarking shows the DAOS-SEIS API outperforms seismic processing libraries on large datasets, though further optimization is needed.
The document discusses storing high-energy physics data in the DAOS object storage system. It provides an overview of high-energy physics experiments and data, and the ROOT framework commonly used for data analysis. It then introduces RNTuple, a new data format being developed to replace the legacy TTree format used in ROOT. The document details how RNTuple maps data to DAOS objects and describes a C++ interface for interacting with DAOS. It notes that from the user perspective, working with RNTuple data in DAOS is similar to working with files on disk.
The document summarizes the experiences of CERN openlab in testing and benchmarking the Distributed Asynchronous Object Storage (DAOS) system. Key points include:
- CERN openlab collaborated with Intel to test and benchmark DAOS on their cluster hardware.
- They found initial performance gains when switching from socket to PSM2 configuration, but also limitations likely due to hardware restrictions.
- Feedback is provided to DAOS developers on areas like installation difficulties, lack of error and system information, and opportunities for improved monitoring integration.
- While commissioning DAOS presented challenges, CERN openlab gained valuable insights and sees potential for increased performance with more nodes and optimized configurations.
1. DAOS provides end-to-end data integrity through checksums calculated by the client library and stored persistently with the data to detect silent corruption.
2. It supports online server addition through a data migration service that handles data movement for various activities like rebuild, reintegration, and addition in a generic way.
3. DAOS uses erasure coding for efficient data recovery and space utilization, with Reed-Solomon coding and support for degraded reads, background rebuild, and space reclamation through data aggregation.
This document discusses compression capabilities added to DAOS (Distributed Asynchronous Object Storage). It provides an overview of the DAOS compression framework and algorithms including LZ4 and deflate. It evaluates the performance of these algorithms with and without using Intel QuickAssist Technology hardware acceleration. Testing shows QAT achieves the best compression throughput and ratio while LZ4 has the best decompression throughput. Future work is outlined to improve compression functionality in DAOS.
DUG'20: 02 - Accelerating apache spark with DAOS on AuroraAndrey Kudryavtsev
This document summarizes accelerating Apache Spark with DAOS (Distributed Asynchronous Object Storage) on Aurora. It describes using DAOS as a Hadoop filesystem for Spark input/output storage and as a shuffle data store. It shows how the DAOS Hadoop filesystem delivers similar throughput to DAOS DFS. It also introduces a DAOS object-based Spark shuffle manager that improves shuffle read throughput by up to 1.5x compared to using the local filesystem, especially for smaller shuffle blocks. Future plans include optimizing the DAOS Spark shuffle manager using async APIs and simplifying user configuration.
This document provides an agenda for a meeting on high performance computing. The agenda includes presentations on accelerating Apache Spark with DAOS, online data compression in DAOS with Intel QAT, DAOS features and updates, experiences deploying and using DAOS in different environments, and plans for DAOS from various companies. Resources for the DAOS open source distributed storage platform are also listed at the end.
DAOS (Distributed Application Object Storage) is a high-performance storage architecture and software stack that delivers scalable object storage capabilities. It uses Intel Optane memory and NVMe SSDs to provide high IOPS, bandwidth, and low latency storage. DAOS supports various data models and interfaces like POSIX, HDF5, Spark, and Python. It allows applications to access storage with library calls instead of system calls for high performance.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
5th LF Energy Power Grid Model Meet-up SlidesDanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Mircosoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid -Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Tatiana Kojar
Skybuffer AI, built on the robust SAP Business Technology Platform (SAP BTP), is the latest and most advanced version of our AI development, reaffirming our commitment to delivering top-tier AI solutions. Skybuffer AI harnesses all the innovative capabilities of the SAP BTP in the AI domain, from Conversational AI to cutting-edge Generative AI and Retrieval-Augmented Generation (RAG). It also helps SAP customers safeguard their investments into SAP Conversational AI and ensure a seamless, one-click transition to SAP Business AI.
With Skybuffer AI, various AI models can be integrated into a single communication channel such as Microsoft Teams. This integration empowers business users with insights drawn from SAP backend systems, enterprise documents, and the expansive knowledge of Generative AI. And the best part of it is that it is all managed through our intuitive no-code Action Server interface, requiring no extensive coding knowledge and making the advanced AI accessible to more users.
Trusted Execution Environment for Decentralized Process MiningLucaBarbaro3
Presentation of the paper "Trusted Execution Environment for Decentralized Process Mining" given during the CAiSE 2024 Conference in Cyprus on June 7, 2024.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
This presentation provides valuable insights into effective cost-saving techniques on AWS. Learn how to optimize your AWS resources by rightsizing, increasing elasticity, picking the right storage class, and choosing the best pricing model. Additionally, discover essential governance mechanisms to ensure continuous cost efficiency. Whether you are new to AWS or an experienced user, this presentation provides clear and practical tips to help you reduce your cloud costs and get the most out of your budget.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfMalak Abu Hammad
Discover how MongoDB Atlas and vector search technology can revolutionize your application's search capabilities. This comprehensive presentation covers:
* What is Vector Search?
* Importance and benefits of vector search
* Practical use cases across various industries
* Step-by-step implementation guide
* Live demos with code snippets
* Enhancing LLM capabilities with vector search
* Best practices and optimization strategies
Perfect for developers, AI enthusiasts, and tech leaders. Learn how to leverage MongoDB Atlas to deliver highly relevant, context-aware search results, transforming your data retrieval process. Stay ahead in tech innovation and maximize the potential of your applications.
#MongoDB #VectorSearch #AI #SemanticSearch #TechInnovation #DataScience #LLM #MachineLearning #SearchTechnology
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
DUG'20: 05 - Very Early Experiences with a 0.5 PByte DAOS Testbed
1. Very Early Experiences with a 0.5 PByte DAOS Testbed
Steffen Christgau, Tobias Watermann, Thomas Steinke
Supercomputing Department
Zuse Institute Berlin
DAOS User Group Meeting 2020
2. Computing + Data Storing @ Zuse Institute Berlin
• ZIB operates HLRN-IV "Lise" for German science community
• Motivation for DAOS:
Our current Lustre installation w/o Burst Buffer ⇒ poor IOPS performance
Complement Lustre? — Burst Buffer, BeeGFS/BeeOND, DAOS
Workloads that can benefit from DAOS: turbulence simulation, astrophysics,
small/many-files I/O, …, AI/DL
Evaluation of new storage concepts vs. "traditional" concepts
• DAOS as research and later production platform
Christgau/Watermann/Steinke (ZIB) Very Early Experiences with a 0.5 PByte DAOS Testbed DAOS User Group Meeting 2020 1 / 10
3. HLRN-IV "Lise" @ ZIB
8 PFlop/s peak
1,270 nodes, Intel CLX AP
121,920 cores
3,000+ users — 200+ projects
8 PB Lustre
• Chemistry incl. Material Science
• Earth Science incl. Climate Research
• Engineering
• Life Science (biology, medicine)
• Physics incl. Astro and High-Energy Physics
4. DAOS @ ZIB: Exploration Testbed
• DAOS user since July 2019
• version 0.6... and git commits before that
• manual compilation process
• Exploration Testbed: used for DCPM and DAOS exploration
isolated from HLRN
2 Inspur dual-socket nodes (CLX-SP Platinum 8260L)
3 + 6 TB Optane DCPMM and 384 GB + 768 GB DRAM
8 + 16 TB Optane SSD
single 100 Gb/s OmniPath back-to-back
CentOS 7
5. DAOS @ HLRN: Integration Testbed
larger testbed to be integrated in HLRN infrastructure
• 20 dual-socket server nodes (CLX Gold 6240R)
• 192 GB DRAM
• 1.5 TB DCPM
• 25.6 TB NVMe NAND SSD
• 2 x 100 Gbit/s OmniPath
total capacity 512 TB SSD + 30 TB DCPM
6. Installation Experiences with DAOS 1.0 (I)
compared to earlier versions
• Prebuilt DAOS packages are a good thing! We use these, no in-house build.
• online package repositories would be even better (no login please, see oneAPI)
• better OS integration and support for installation and deployment
• documentation improved a lot
• documentation sometimes more promising than reality
• configuration defaults and comments from examples do not apply
• # scm_class default: dcpm → scm config validation failed: scm_class not set
• immutable after reformat hints are good
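The validation failure quoted above goes away once the storage class is set explicitly instead of relying on the commented-out default. A minimal per-server storage section from daos_server.yml might look like the sketch below (keys follow the DAOS 1.0 example config; device paths, PCI address, and target count are illustrative, not our production values):

```yaml
servers:
  - targets: 8
    scm_class: dcpm              # must be set explicitly; the commented
                                 # "# scm_class: dcpm" default is not applied
    scm_list: [/dev/pmem0]       # example PMem namespace
    scm_mount: /mnt/daos0
    bdev_class: nvme
    bdev_list: ["0000:5e:00.0"]  # example NVMe PCI address
```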
7. Installation Experiences with DAOS 1.0 (II)
compared to earlier versions
• error messages not always helpful: failed to connect to pool: -1026
• logs are more useful (sometimes)
• content of stderr vs. log files vs. system logging (journal)
• MPI(CH) integration appreciated!
• middleware in general: unclear version management → packages?!
• user interaction with fusefs/POSIX container: orphaned/forgotten mounts
• dfuse daemon might be a good thing
• PSM2 issue: multi-tenant usage with OmniPath not supported
• intent to use sockets or verbs instead
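Switching providers is a small change in the daos_server.yml fabric settings; a sketch under the assumption that the DAOS 1.0 provider names apply (the choice trades OmniPath-native performance for multi-tenant operation):

```yaml
# daos_server.yml, fabric selection (sketch)
# provider: ofi+psm2            # OmniPath native, but no multi-tenant use
provider: ofi+sockets           # slower; works for shared/multi-tenant setups
# provider: ofi+verbs;ofi_rxm   # RDMA alternative on verbs-capable fabrics
```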
8. Simple DAOS Key Value API Benchmark
• use DAOS key value (kv) library, DAOS 1.0.1 for very simple test
• perform operation 1000× in blocking fashion, median reported
• key = 4 Bytes, value = 32 Bytes
Median latency (time / µs):

  fi_pingpong RTT (psm2)        2
  fi_pingpong RTT (sockets)    26
  put                         135
  get (size-only)             118
  get (value)                 118
  get (miss)                   54
  remove                       92
9. Current Status
Phase 1: Installation / integration
• hardware shipped mid September, ready to boot OS end of September
• DAOS software integration in HLRN cluster near completion
• early tests on exploration testbed with critical HLRN workloads
MPI-IO middleware works seamlessly with MPI-ready applications, see DUG’19
netCDF/HDF5 testing planned (waiting for compatible versions)
10. Next Planned Step
Phase 2: Research with integration testbed
• Usability & user interface: application integration for a few test cases
• Administration: experiences with the management capabilities for pools,
containers, …; monitoring & performance
Phase 3: Access for selected user projects
• intent to use per-project pool
• provide DAOS as an optional, additional offering for heavy-I/O workloads besides
the existing Lustre (work), NFS (home), and SSD (scratch) storage
• support power users in migration → easy when MPI/HDF/netCDF is used for IO
• dissemination planned for 2021 ff.
11. Summary
• DAOS improved significantly since our first contact...and is still improving
• integration into production system in progress
• mature enough for early access by (power) users
Thanks for your attention!
Questions?