Talk by Lincoln Bryant (University of Chicago ATLAS team) on using Ceph for ATLAS data analysis @ Ceph Days Chicago http://ceph.com/cephdays/ceph-day-chicago/
New Ceph capabilities and Reference Architectures - Kamesh Pemmaraju
Have you heard about Inktank Ceph and are interested to learn some tips and tricks for getting started quickly and efficiently with Ceph? Then this is the session for you!
In this two-part session you will learn about:
• the very latest enhancements and capabilities delivered in Inktank Ceph Enterprise such as a new erasure coded storage back-end, support for tiering, and the introduction of user quotas.
• best practices, lessons learned and architecture considerations founded in real customer deployments of Dell and Inktank Ceph solutions that will help accelerate your Ceph deployment.
An intro to Ceph and big data - CERN Big Data Workshop - Patrick McGarry
Ceph is an open-source distributed storage system that provides scalable object, block, and file storage in a single unified platform. It uses a technique called CRUSH to automatically distribute data across clusters of commodity servers and provides self-healing capabilities through data replication. Ceph's unified storage platform includes RADOS, an object store; RBD for block storage; CephFS for distributed file storage; and RADOSGW for cloud storage compatibility. It is designed for large-scale deployments of 10s to 10,000s of nodes using heterogeneous hardware.
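As a rough illustration of the CRUSH-based placement described above, the hedged sketch below uses the python-rados bindings to ask the cluster where a given object maps. The pool name, object name, and ceph.conf path are placeholder assumptions, not values from any of these talks.

```python
# Minimal sketch (assumptions: a reachable cluster, a pool named "data",
# and admin credentials in /etc/ceph/ceph.conf). It asks the monitors to
# run the equivalent of `ceph osd map data myobject`, which reports the
# placement group and OSDs that CRUSH selects for the object.
import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()

cmd = json.dumps({
    "prefix": "osd map",
    "pool": "data",          # placeholder pool name
    "object": "myobject",    # placeholder object name
    "format": "json",
})
ret, outbuf, errs = cluster.mon_command(cmd, b'')
if ret == 0:
    mapping = json.loads(outbuf)
    print("PG:", mapping.get("pgid"), "acting OSDs:", mapping.get("acting"))
else:
    print("osd map failed:", errs)

cluster.shutdown()
```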
Ceph Object Storage Reference Architecture Performance and Sizing Guide - Karan Singh
Together with my colleagues on the Red Hat Storage team, I am very proud to have worked on this reference architecture for Ceph Object Storage.
If you are building Ceph object storage at scale, this document is for you.
QCT Ceph Solution - Design Consideration and Reference Architecture - Patrick McGarry
This document discusses QCT's Ceph storage solutions, including an overview of Ceph architecture, QCT hardware platforms, Red Hat Ceph software, workload considerations, reference architectures, test results and a QCT/Red Hat whitepaper. It provides technical details on QCT's throughput-optimized and capacity-optimized solutions and shows how they address different storage needs through workload-driven design. Hands-on testing and a test drive lab are offered to explore Ceph features and configurations.
This document provides an overview of Ceph, a distributed storage system. It defines software defined storage and discusses different storage options like file, object, and block storage. It then introduces Ceph, highlighting that it provides a unified storage platform for block, object, and file storage using commodity hardware in a massively scalable, self-managing, and fault-tolerant manner. The document outlines Ceph's architecture including its components like OSDs, monitors, RADOS, RBD, RGW, and CephFS. It also discusses various access methods and use cases for Ceph like with OpenStack.
This document discusses the status update of Hadoop running over Ceph RGW with SSD caching. It describes the RGW-Proxy component that returns the closest RGW instance to data, and the RGWFS component that allows Hadoop to access a Ceph cluster through RGW. Performance testing shows that avoiding object renames in Swift reduces overhead compared to HDFS. The next steps are to finish RGWFS development, address heavy renames in RGW, and open source the code.
Ceph is an open source project which provides software-defined, unified storage solutions. Ceph is a massively scalable, high-performing distributed storage system without any single point of failure. From its roots, it has been designed to be highly scalable, up to the exabyte level and beyond, while running on general-purpose commodity hardware.
BlueStore, A New Storage Backend for Ceph, One Year In - Sage Weil
BlueStore is a new storage backend for Ceph OSDs that consumes block devices directly, bypassing the local XFS file system that is currently used today. Its design is motivated by everything we've learned about OSD workloads and interface requirements over the last decade, and everything that has worked well and not so well when storing objects as files in local file systems like XFS, btrfs, or ext4. BlueStore has been under development for a bit more than a year now, and has reached a state where it is becoming usable in production. This talk will cover the BlueStore design, how it has evolved over the last year, and what challenges remain before it can become the new default storage backend.
Ceph, being a distributed storage system, is highly reliant on the network for resiliency and performance. In addition, it is crucial that the network topology beneath a Ceph cluster be designed in such a way to facilitate easy scaling without service disruption. After an introduction to Ceph itself this talk will dive into the design of Ceph client and cluster network topologies.
This document summarizes a distributed storage system called Ceph. Ceph uses an architecture with four main components - RADOS for reliable storage, Librados client libraries, RBD for block storage, and CephFS for file storage. It distributes data across intelligent storage nodes using the CRUSH algorithm and maintains reliability through replication and erasure coding of placement groups across the nodes. The monitors manage the cluster map and placement, while OSDs on each node store and manage the data and metadata.
The document discusses research into data management, file systems, and storage systems being conducted at UC Santa Cruz. Specific projects mentioned include using Ceph as a prototyping platform, the SIRIUS project studying challenges of heterogeneous, multi-tiered storage for exascale systems, the Programmable Storage project developing the Malacology and Mantle systems, and the Skyhook project to build an elastic database system that leverages programmable storage interfaces. The research aims to address issues like data placement, predictable performance at scale, and allowing databases to better utilize storage resources.
- Librados is a C/C++ programming interface that provides applications access to the Ceph distributed object store (RADOS) and hides the complexity of networking, data distribution, replication and failure recovery.
- It can be used by Ceph components like RADOS Gateway and tools like rados, as well as third party applications that want to use Ceph for storage. Examples include providing storage for mail systems, Hadoop, and building custom applications.
- The interface handles configuration, connections, I/O operations on objects, extended attributes and more so applications can easily integrate scalable reliable storage via Ceph without having to implement these functions themselves.
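To make the librados workflow sketched above concrete, here is a minimal example using the python-rados bindings; the pool name, object name, and ceph.conf path are placeholder assumptions.

```python
# Minimal librados sketch (assumed pool "mypool" and standard /etc/ceph/ceph.conf).
# Networking, data distribution, replication, and recovery are hidden behind a
# handful of calls, as described above.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx('mypool')                    # I/O context bound to one pool
    try:
        ioctx.write_full('greeting', b'hello RADOS')        # create/overwrite an object
        ioctx.set_xattr('greeting', 'owner', b'uchicago')   # set an extended attribute
        data = ioctx.read('greeting')                        # read the object back
        print(data, ioctx.get_xattr('greeting', 'owner'))
    finally:
        ioctx.close()
finally:
    cluster.shutdown()
```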
BlueStore: a new, faster storage backend for Ceph - Sage Weil
BlueStore is a new storage backend for Ceph that provides faster performance compared to the existing FileStore backend. BlueStore stores metadata in RocksDB and data directly on block devices, avoiding double writes and improving transaction performance. It supports multiple storage tiers by allowing different components like the RocksDB WAL, database and object data to be placed on SSDs, HDDs or NVRAM as appropriate.
This document introduces Ceph storage. It discusses the evolution of storage systems from individual disks to clustered storage. Ceph is introduced as a clustered, software-defined storage system that provides object, block, and file storage. Key Ceph components like RADOS, monitors, OSDs, and the CRUSH algorithm are explained. CRUSH provides a scalable and reliable way to determine object locations across multiple nodes. Python and command line interfaces for Ceph are also summarized. Finally, Yahoo's Ceph cluster architecture is briefly described.
The document provides an overview and update on CephFS, the distributed file system implemented in Ceph. Key points include new capabilities in the Jewel release like improved scrub/repair functionality to handle metadata damage, fine-grained authorization using new MDS auth caps, and experimental support for multiple CephFS filesystems backed by a single RADOS storage cluster. It also discusses ongoing work integrating CephFS with OpenStack Manila for shared file system provisioning and new tools like CephFSVolumeClient.
Distributed Storage and Compute With Ceph's librados (Vault 2015) - Sage Weil
The Ceph distributed storage system sports object, block, and file interfaces to a single storage cluster. These interfaces are built on a distributed object storage and compute platform called RADOS, which exports a conceptually simple yet powerful interface for storing and processing large amounts of data and is well-suited for backing web-scale applications and data analytics. It features a rich object model, efficient key/value storage, atomic transactions (including efficient compare-and-swap semantics), object cloning and other primitives for supporting snapshots, simple inter-client communication and coordination (à la ZooKeeper), and the ability to extend the object interface using arbitrary code executed on the storage node. This talk will focus on the librados API, how it is used, the security model, and some examples of RADOS classes implementing interesting functionality.
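As a hedged sketch of the key/value and transactional primitives mentioned in the abstract above (assuming a python-rados build that exposes the WriteOpCtx/ReadOpCtx operation contexts), the snippet below attaches omap entries to an object atomically; pool and object names are placeholders.

```python
# Sketch of RADOS omap (key/value) usage via python-rados, assuming a pool
# named "mypool". The WriteOpCtx groups the omap update into a single
# operation applied atomically on the OSD.
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('mypool')

with rados.WriteOpCtx() as op:
    ioctx.set_omap(op, ('run',), (b'2015-08-18',))   # key/value pairs to store
    ioctx.operate_write_op(op, 'event-index')        # applied as one operation

with rados.ReadOpCtx() as op:
    it, ret = ioctx.get_omap_vals(op, "", "", 10)    # iterate up to 10 keys
    ioctx.operate_read_op(op, 'event-index')
    for key, value in it:
        print(key, value)

ioctx.close()
cluster.shutdown()
```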
Storage tiering and erasure coding in Ceph (SCaLE13x) - Sage Weil
Ceph is designed around the assumption that all components of the system (disks, hosts, networks) can fail, and has traditionally leveraged replication to provide data durability and reliability. The CRUSH placement algorithm is used to allow failure domains to be defined across hosts, racks, rows, or datacenters, depending on the deployment scale and requirements.
Recent releases have added support for erasure coding, which can provide much higher data durability and lower storage overheads. However, in practice erasure codes have different performance characteristics than traditional replication and, under some workloads, come at some expense. At the same time, we have introduced a storage tiering infrastructure and cache pools that allow alternate hardware backends (like high-end flash) to be leveraged for active data sets while cold data are transparently migrated to slower backends. The combination of these two features enables a surprisingly broad range of new applications and deployment configurations.
This talk will cover a few Ceph fundamentals, discuss the new tiering and erasure coding features, and then discuss a variety of ways that the new capabilities can be leveraged.
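To make the tiering-plus-erasure-coding combination more tangible, here is a hedged sketch of the kind of commands involved, driven from Python. The profile parameters, pool names, and placement-group counts are illustrative assumptions, not values from the talk.

```python
# Illustrative sketch only: creates an erasure-coded base pool fronted by a
# replicated cache tier. Pool names, k/m values, and PG counts are made up;
# try this against a test cluster, never blindly against production.
import subprocess

def ceph(*args):
    subprocess.run(["ceph", *args], check=True)

# 4+2 erasure-code profile and an EC pool that uses it
ceph("osd", "erasure-code-profile", "set", "ec42", "k=4", "m=2")
ceph("osd", "pool", "create", "ecdata", "128", "128", "erasure", "ec42")

# Replicated pool (e.g. on flash) to act as the cache tier in front of it
ceph("osd", "pool", "create", "hotcache", "128", "128")
ceph("osd", "tier", "add", "ecdata", "hotcache")
ceph("osd", "tier", "cache-mode", "hotcache", "writeback")
ceph("osd", "tier", "set-overlay", "ecdata", "hotcache")
```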
Ceph data services in a multi- and hybrid cloud world - Sage Weil
IT organizations of the future (and present) are faced with managing infrastructure that spans multiple private data centers and multiple public clouds. Emerging tools and operational patterns like Kubernetes and microservices are easing the process of deploying applications across multiple environments, but the Achilles' heel of such efforts remains that most applications require large quantities of state, either in databases, object stores, or file systems. Unlike stateless microservices, state is hard to move.
Ceph is known for providing scale-out file, block, and object storage within a single data center, but it also includes a robust set of multi-cluster federation capabilities. This talk will cover how Ceph's underlying multi-site capabilities complement and enable true portability across cloud footprints--public and private--and how viewing Ceph from a multi-cloud perspective has fundamentally shifted our data services roadmap, especially for Ceph object storage.
Managing data analytics in a hybrid cloud - Karan Singh
Managing Data Analytics in a Hybrid Cloud discusses challenges with traditional analytics approaches and proposes using shared data lakes with dynamic compute clusters. Common challenges include explosive analytics team growth leading to resource contention, and duplicating large datasets for each cluster. The proposed approach uses shared object storage to hold unified datasets accessed by multiple ephemeral analytics clusters provisioned on-demand. This allows teams independent resources while avoiding duplicate storage costs and improving agility. The document outlines example architectures and benefits of this shared data lake approach when implemented on a private or public cloud.
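Since the shared data lake pattern described above rests on S3-compatible object storage (such as Ceph RGW), here is a small hedged sketch of how an ephemeral analytics cluster might read a shared dataset; the endpoint, credentials, bucket, and object names are placeholders.

```python
# Sketch of reading a shared dataset from an S3-compatible object store
# (for example a Ceph RGW endpoint). Endpoint URL, credentials, bucket,
# and key names are placeholders for illustration only.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.org:8080",  # assumed RGW endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# List a shared dataset prefix that several ephemeral clusters can read
resp = s3.list_objects_v2(Bucket="datalake", Prefix="events/2015/")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])

# Fetch one object for local processing
s3.download_file("datalake", "events/2015/run001.parquet", "/tmp/run001.parquet")
```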
Introduction to Ceph, an open-source, massively scalable distributed file system.
This document explains the architecture of Ceph and integration with OpenStack.
BlueStore: a new, faster storage backend for Ceph - Sage Weil
Traditionally Ceph has made use of local file systems like XFS or btrfs to store its data. However, the mismatch between the OSD's requirements and the POSIX interface provided by kernel file systems has a huge performance cost and requires a lot of complexity. BlueStore, an entirely new OSD storage backend, utilizes block devices directly, doubling performance for most workloads. This talk will cover the motivation for a new backend, the design and implementation, the improved performance on HDDs, SSDs, and NVMe, and some of the thornier issues we had to overcome when replacing tried and true kernel file systems with entirely new code running in userspace.
CRUSH is the powerful, highly configurable algorithm Red Hat Ceph Storage uses to determine how data is stored across the many servers in a cluster. A healthy Red Hat Ceph Storage deployment depends on a properly configured CRUSH map. In this session, we will review the Red Hat Ceph Storage architecture and explain the purpose of CRUSH. Using example CRUSH maps, we will show you what works and what does not, and explain why.
Presented at Red Hat Summit 2016-06-29.
This document summarizes what's new in Ceph. Key updates include improved management and usability features like simplified configuration, hands-off operation, and device health tracking. It also covers new orchestrator capabilities for Kubernetes and container platforms, continued performance optimizations, and multi-cloud capabilities like object storage federation across data centers and clouds.
CephFS performance testing was conducted on a Jewel deployment. Key findings include:
- Single MDS performance is limited by its single-threaded design; operations reached CPU limits
- Improper client behavior can cause MDS OOM issues by exceeding inode caching limits
- Metadata operations like create, open, update showed similar performance, reaching 4-5k ops/sec maximum
- Caching had a large impact on performance when the working set exceeded cache size
The document discusses the QuarkNet and Physics RET programs which aim to provide professional development for teachers through research experiences in particle physics and other areas of science. The programs allow teachers to experience scientific research firsthand and gain skills in scientific thinking that can help them teach science standards. Teachers apply to participate in a paid summer research program or academic year meetings to discuss teaching, research, and current science events.
This document provides an overview and introduction to Red Hat Gluster Storage. It explains that Gluster Storage concatenates all disks from multiple servers into one large volume, allowing users to access much more storage. It also describes how Gluster Storage provides data protection through RAID 6 and replication between nodes, and can heal itself if a disk fails. Common use cases for Gluster Storage are also listed, such as network log analysis, HPC data, and big data analysis where large amounts of petabyte-scale storage are needed.
THE WONDER OF CERN... by Stefano Gallizio - UccioPwer96
The Large Hadron Collider (LHC) is a 27 km ring of magnets located in Geneva, Switzerland that accelerates particle beams to extremely high energies in order to recreate conditions shortly after the Big Bang. Five large experiments - ALICE, ATLAS, CMS, LHCb, and TOTEM - are located underground around the LHC's four collision points to study what occurs when the particle beams collide at high energies, helping physicists better understand fundamental particles and forces. No one knows what will result from these unprecedented high-energy collisions.
Particle Physics, CERN and the Large Hadron Collider - juanrojochacon
The document discusses particle physics research done at CERN's Large Hadron Collider (LHC). It describes the LHC as the most powerful particle accelerator ever built, with a 27 km long tunnel housing detectors that observe proton collisions at very high energies. One of the LHC's major discoveries was the Higgs boson particle in 2012. The document outlines how the LHC allows scientists to study the fundamental building blocks of matter at the smallest observable scales.
- The document describes the Large Hadron Collider (LHC) particle accelerator located at CERN. It is made up of several components that sequentially accelerate protons to higher energies, including Linac 2, the Proton Synchrotron Booster, the Proton Synchrotron, and the Super Proton Synchrotron.
- The largest component is the LHC, which is 27 kilometers in circumference and accelerates the protons to an energy of 7 TeV before they collide in detectors. It requires operating at a temperature colder than outer space to function.
- The document provides details on each component of the accelerator chain and its purpose in accelerating the protons up to collision energy in the LHC.
Ceph Day Chicago: Using Ceph for Large Hadron Collider Data - Ceph Community
Lincoln Bryant from the University of Chicago gave a presentation on using Ceph for Large Hadron Collider data storage and analysis. Key points:
- Ceph is used to store reconstruction data from the ATLAS experiment at the LHC and for analysis datasets.
- Ceph provides scalable storage through an erasure coded CephFS that is mounted by XRootD servers for data access.
- Ceph allows efficient data transfers from CERN to the University of Chicago site for regional analysis.
- Future evaluations include using librados directly with XRootD and running Ceph and analysis jobs on the same cluster nodes using cgroups for resource control.
Hadoop 3.0 will include major new features like HDFS erasure coding for improved storage efficiency and YARN support for long running services and Docker containers to improve resource utilization. However, it will maintain backwards compatibility and a focus on testing given the importance of compatibility for existing Hadoop users. The release is targeted for late 2017 after several alpha and beta stages.
Apache Hadoop 3 is coming! As the next major milestone for Hadoop and big data, it attracts everyone's attention as it showcases several bleeding-edge technologies and significant features across all components of Apache Hadoop: Erasure Coding in HDFS, Docker container support, Apache Slider integration and native service support, Application Timeline Service version 2, Hadoop library updates and client-side classpath isolation, etc. In this talk, first we will update the status of the Hadoop 3.0 release work in the Apache community and the feasible path through alpha and beta towards GA. Then we will take a deep dive into each new feature, including its development progress and maturity status in Hadoop 3. Last but not least, as a new major release, Hadoop 3.0 will contain some incompatible API or CLI changes which could be challenging for downstream projects and existing Hadoop users to upgrade through; we will go through these major changes and explore their impact on other projects and users.
This document provides an introduction to OpenStack, an open source software platform for building private and public clouds. It describes the key OpenStack components for compute (Nova), storage (Cinder, Glance, Swift), networking (Neutron), and identity (Keystone). It then discusses how organizations like CERN and PayPal use OpenStack to manage large amounts of data and computing resources in a scalable, distributed manner. The document concludes by outlining various ways that individuals can get involved and contribute to the OpenStack community.
Sanger OpenStack presentation March 2017 - Dave Holland
A description of the Sanger Institute's journey with OpenStack to date, covering RHOSP, Ceph, S3, user applications, and future plans. Given at the Sanger Institute's OpenStack Day.
This document summarizes the work done by OpenStack@IIIT-H, which uses OpenStack to run an Indian language search engine, conduct research in information extraction/retrieval/access and virtualization/cloud computing, and provide users with OpenStack, Hadoop, and other open-source software. Before OpenStack, provisioning resources was ad hoc and unmanaged, user management lacked controls, and storage had reliability and duplication issues. After implementing OpenStack, resources can be quickly and reliably provisioned on demand, usage is monitored and restricted with quotas, and storage uses Swift for reliability without data fragmentation. OpenStack also supports research and teaching over 350 students through hands-on projects.
Skyhook: Towards an Arrow-Native Storage System, CCGrid 2022 - JayjeetChakraborty
With the ever-increasing dataset sizes, several file formats such as Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth, at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck trying to keep up feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires an extensive understanding of the system internals. Previous approaches re-implemented the functionality of data processing frameworks and access libraries for a particular storage system, a duplication of effort that might have to be repeated for different storage systems.
This paper introduces a new design paradigm that allows extending programmable object storage systems to embed existing, widely used data processing frameworks and access libraries into the storage layer with no modifications. In this approach, data processing frameworks and access libraries can evolve independently from storage systems while leveraging distributed storage systems' scale-out and availability properties. We present Skyhook, an example implementation of our design paradigm using Ceph, Apache Arrow, and Parquet. We provide a brief performance evaluation of Skyhook and discuss key results.
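To illustrate the kind of data-reducing scan that Skyhook pushes into the storage layer, the sketch below shows the equivalent client-side operation with plain PyArrow; with Skyhook the same filter and column projection would be evaluated inside the Ceph OSDs rather than on the client. The dataset path and column names are assumptions.

```python
# Client-side illustration of a filtered, projected Parquet scan using PyArrow.
# Skyhook's contribution is evaluating this kind of filter/projection inside
# the storage servers; here it runs on the client purely for illustration.
import pyarrow.dataset as ds

dataset = ds.dataset("/data/events/", format="parquet")  # assumed local dataset path
table = dataset.to_table(
    columns=["event_id", "pt"],        # projection: only the columns needed
    filter=ds.field("pt") > 25.0,      # predicate: drop rows before further processing
)
print(table.num_rows)
```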
What's New with Ceph - Ceph Day Silicon Valley - Ceph Community
This document discusses what's new in Ceph, including priorities around community, management/usability, performance of core Ceph components like RADOS, RBD, RGW and CephFS, and container platforms. Specific updates mentioned include centralized configuration in Mimic, Project Crimson reimplementing the OSD data path, Msgr2 network protocol, automated management features, telemetry/insights, performance optimizations, and the continued development of the Ceph dashboard.
Hadoop Meetup Jan 2019 - Overview of Ozone - Erik Krogen
A presentation by Anu Engineer of Cloudera regarding the state of the Ozone subproject. He covers a brief introduction of what Ozone is, and where it's headed.
This is taken from the Apache Hadoop Contributors Meetup on January 30, hosted by LinkedIn in Mountain View.
Ceph is an open-source distributed storage system that provides object, block, and file storage on commodity hardware. It uses a pseudo-random placement algorithm called CRUSH to distribute data across a cluster in a fault-tolerant manner without single points of failure. Ceph has various applications including a RADOS Gateway for S3/Swift compatibility, RADOS Block Device for virtual machine images, and a CephFS for a POSIX-compliant distributed file system.
Initial presentation of Swift (for Montreal user group) - Marcos García
Swift is an open source object storage system that provides scalable storage and retrieval of any amount of unstructured data over HTTP. It is designed to be scalable, reliable, and inexpensive for storing large amounts of unstructured data. Some key uses of Swift include storing backups, web content like images, and large scientific data objects. Swift uses a ring architecture to distribute and replicate data across multiple servers for high availability.
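As a hedged sketch of the HTTP object workflow described above, here is how a client might store and retrieve an object with the python-swiftclient library; the auth URL, credentials, container, and object names are placeholders.

```python
# Sketch of basic Swift usage with python-swiftclient (v1 auth for brevity).
# The auth URL, user, key, container, and object names are placeholders.
import swiftclient

conn = swiftclient.Connection(
    authurl="http://swift.example.org:8080/auth/v1.0",
    user="account:user",
    key="secret",
)

conn.put_container("backups")
conn.put_object("backups", "notes.txt", contents=b"hello swift")

headers, body = conn.get_object("backups", "notes.txt")
print(headers.get("content-length"), body)
```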
Using Containers and HPC to Solve the Mysteries of the Universe by Deborah Bard - Docker, Inc.
This document discusses using containers and supercomputers to enable open science. It describes how supercomputers are used for diverse scientific research in many fields. Containers can help address issues with portability and scalability on supercomputers by replicating software environments. Shifter enables the use of Docker containers on supercomputers while addressing security and performance concerns. Examples are given of scientific projects using containers, such as astronomy, particle physics, and biology projects. Ensuring reproducibility of results through containerization is also discussed.
Netflix running Presto in the AWS Cloud - Zhenxiao Luo
Netflix runs Presto in its AWS cloud environment to enable low-latency ad-hoc queries on petabyte-scale data stored in S3. Some key things Netflix did include optimizing Presto to read from and write directly to S3, fixing bugs, integrating Presto with its EMR and Ganglia monitoring, and deploying a 100+ node Presto cluster that handles over 1000 queries per day. Performance testing showed Presto was often 10x faster than Hive for various queries and joins. Netflix continues optimizing Presto for its needs like supporting Parquet, ODBC/JDBC drivers, and looking to address current limitations.
This document provides an introduction to Docker and Openshift including discussions around infrastructure, storage, monitoring, metrics, logs, backup, and security considerations. It describes the recommended infrastructure for a 3 node Openshift cluster including masters, etcd, and nodes. It also discusses strategies for storage, monitoring both internal pod status and external infrastructure metrics, collecting and managing logs, backups, and security features within Openshift like limiting resource usage and isolating projects.
This document summarizes Dan van der Ster's experience scaling Ceph at CERN. CERN uses Ceph as the backend storage for OpenStack volumes and images, with plans to also use it for physics data archival and analysis. The 3PB Ceph cluster consists of 47 disk servers and 1,128 OSDs. Some lessons learned include managing latency, handling many objects, tuning CRUSH, trusting clients, and avoiding human errors when managing such a large cluster.
Providing Globus Services to Users of JASMIN for Environmental Data Analysis - Globus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Similar to Using Ceph for Large Hadron Collider Data (20)
Rob Gardner presented on bringing OSG pools to campus clusters and connecting clusters to the Open Science Grid (OSG). The presentation covered installing the Connect Client software to submit jobs from a campus cluster to OSG resources, totaling over 100,000 CPU cores. Users can check job status and view where jobs ran using Connect Client commands like "connect q" and "connect histogram". Tutorials are available online to help users learn new techniques for running on OSG Connect.
CI Connect: A service for building multi-institutional cluster environments - Rob Gardner
Rob Gardner presented CI Connect, a service that builds multi-institutional cluster environments using existing technologies like Globus, HTCondor, and Xrootd. It allows small research groups to access distributed high throughput computing resources through virtual clusters. The service also extends the capacity of ATLAS Tier3 clusters. Upcoming work includes automating provisioning to easily configure sharing among campuses with minimal operational effort at resource sites.
This document summarizes the Open Science Grid (OSG), a distributed high throughput computing resource that aggregates computing clusters totaling over 100,000 cores. It provides statistics on OSG usage and describes how OSG can be used by individual researchers and campus clusters to access additional computing and storage resources beyond what is locally available. OSG supports a variety of science domains and thousands of users through login services, job scheduling, software stacks, and large-scale data transfers.
UiPath Test Automation using UiPath Test Suite series, part 5DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 5. In this session, we will cover CI/CD with devops.
Topics covered:
CI/CD with in UiPath
End-to-end overview of CI/CD pipeline with Azure devops
Speaker:
Lyndsey Byblow, Test Suite Sales Engineer @ UiPath, Inc.
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
Communications Mining Series - Zero to Hero - Session 1DianaGray10
This session provides introduction to UiPath Communication Mining, importance and platform overview. You will acquire a good understand of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Full-RAG: A modern architecture for hyper-personalizationZilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!SOFTTECHHUB
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slackshyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
Programming Foundation Models with DSPy - Meetup SlidesZilliz
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is a widely used ETL tool for processing, indexing, and ingesting data into the serving stack for search. Milvus is a production-ready, open-source vector database. In this talk we will show how to use Spark to process unstructured data into vector representations and push the vectors to the Milvus vector database for search serving.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curricula, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still treat monitoring & observability as the purview of ops, infra, and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on.
1. Lincoln Bryant • University of Chicago
Ceph Day Chicago
August 18th, 2015
Using Ceph for Large Hadron Collider Data
2. about us
● ATLAS at the Large Hadron Collider at CERN
○ Tier2 center at UChicago, Indiana, UIUC/NCSA:
■ 12,000 processing slots, 4 PB of storage
○ 140 PB on disk in 120 data centers worldwide + CERN
● Open Science Grid: high throughput computing
○ Supporting “large science” and small research labs on campuses nationwide
○ >100k cores, >800M CPU-hours/year, ~PB transfer/day
● Users of Ceph for 2 years
○ started with v0.67, using v0.94 now
○ 1PB, more to come
3. ATLAS site
LHC: 5.4 miles across (17 mi circumference)
⇐ Standard Model of Particle Physics
Higgs boson: final piece, discovered in 2012 ⇒ Nobel Prize
2015 → 2018: Cool new physics searches underway at 13 TeV
← credit: Katherine Leney (UCL, March 2015)
4. ATLAS detector
● Run2 center of mass energy = 13 TeV (Run1: 8 TeV)
● 40 MHz proton bunch crossing rate
○ 20-50 collisions/bunch crossing (“pileup”)
● Trigger (filters) reduces raw rate to ~ 1kHz
● Events are written to disk at ~ 1.5 GB/s
5. ATLAS detector
100M active sensors
Diagram labels: toroid magnets, inner tracking, person (for scale)
Not shown: liquid argon calorimeter (electrons, photons), tile calorimeter (hadrons), muon chambers, forward detectors
6. ATLAS data & analysis
Primary data from CERN, globally processed (event reconstruction and analysis)
Role for Ceph: analysis datasets & object store for single events
3x100 Gbps
8. Our setup
● Ceph v0.94.2 on Scientific Linux 6.6
● 14 storage servers
● 12 x 6 TB disks, no dedicated journal devices
○ Could buy PCI-E SSD(s) if the performance is needed
● Each connected at 10 Gbps
● Mons and MDS virtualized
● CephFS pools using erasure coding + cache tiering (setup sketch below)
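The deck doesn’t show the commands behind that last bullet, so here is a minimal sketch of the usual Hammer-era (v0.94) recipe for putting CephFS data on an erasure-coded pool fronted by a replicated cache tier. Pool names, the k/m profile, and PG counts are illustrative assumptions, not the site’s actual values.
```python
# Rough sketch only: erasure-coded CephFS data pool with a replicated cache
# tier in front, as CephFS required for EC pools in the v0.94 era.
# Pool names, k/m, PG counts, and sizing are placeholders.
import subprocess

def ceph(*args):
    """Run a `ceph` CLI command and raise if it fails."""
    subprocess.run(["ceph", *args], check=True)

# Erasure-code profile: 8 data + 3 coding chunks (assumed values).
ceph("osd", "erasure-code-profile", "set", "ec-8-3", "k=8", "m=3")

# Erasure-coded base pool for CephFS data.
ceph("osd", "pool", "create", "cephfs-data-ec", "1024", "1024", "erasure", "ec-8-3")

# Replicated cache pool layered on top of the EC pool.
ceph("osd", "pool", "create", "cephfs-cache", "512")
ceph("osd", "tier", "add", "cephfs-data-ec", "cephfs-cache")
ceph("osd", "tier", "cache-mode", "cephfs-cache", "writeback")
ceph("osd", "tier", "set-overlay", "cephfs-data-ec", "cephfs-cache")

# Cap the cache tier so flushing/eviction kicks in (placeholder: 1 TiB).
ceph("osd", "pool", "set", "cephfs-cache", "target_max_bytes", str(1024**4))
```
With the writeback cache set as the overlay, clients only ever talk to the replicated pool and Ceph flushes cold objects down to the erasure-coded pool, which is what made EC storage usable behind CephFS in this release.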
10. ● ATLAS uses the Open Science Grid middleware in the US
○ among other things: facilitates data management and transfer between sites
● Typical sites will use Lustre, dCache, etc. as the “storage element” (SE)
● Goal: Build and productionize a storage element based on Ceph
11. XRootD
● Primary file access protocol for accessing files within ATLAS
● Developed at the Stanford Linear Accelerator Center (SLAC)
● Built to support standard high-energy physics analysis tools (e.g., ROOT)
○ Supports remote reads, caching, etc.
● Federated over the WAN via a hierarchical system of ‘redirectors’
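The slides stay at the protocol level; purely as an illustration of what remote reads through a redirector look like from client code, the sketch below uses the XRootD Python bindings. The redirector hostname and file path are invented, not real ATLAS endpoints.
```python
# Illustrative only: open and read a file through an XRootD redirector using
# the XRootD Python bindings. Hostname and path are placeholders.
from XRootD import client
from XRootD.client.flags import OpenFlags

f = client.File()
# The redirector forwards the open to whichever data server holds the file.
status, _ = f.open("root://redirector.example.org//atlas/data/sample.root",
                   OpenFlags.READ)
if not status.ok:
    raise RuntimeError(status.message)

# Remote reads: fetch only the byte range an analysis tool actually needs.
status, data = f.read(offset=0, size=1024)
f.close()
```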
12. Ceph and XRootD
● How to pair our favorite access protocol with our favorite storage platform?
13. Ceph and XRootD
● How to pair our favorite access protocol with our favorite storage platform?
● Original approach: RBD + XRootD
○ Performance was acceptable
○ Problem: RBD only mounted on 1 machine
■ Can only run one XRootD server
○ Could create new RBDs and add to the XRootD cluster to scale out
■ Problem: NFS exports for interactive users become a lot trickier
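To make the scale-out limitation concrete, here is a sketch of what the original single-server layout implies; the image name, size, and mount point are placeholders. Because the RBD image backs an ordinary local filesystem, only one host can safely map and mount it, so every additional XRootD server means another image.
```python
# Sketch of the original approach: one RBD image, mapped and mounted on
# exactly one host, served by exactly one XRootD server. Names and sizes
# are placeholders.
import subprocess

def run(*args):
    subprocess.run(list(args), check=True)

# Create a 10 TiB image in the default 'rbd' pool (--size is in MB).
run("rbd", "create", "rbd/xrootd-data", "--size", str(10 * 1024 * 1024))

# Map it on this host only; a second host cannot mount the same non-clustered
# filesystem concurrently, which is why this caps you at one XRootD server.
run("rbd", "map", "rbd/xrootd-data")
run("mkfs.xfs", "/dev/rbd/rbd/xrootd-data")   # udev symlink: /dev/rbd/<pool>/<image>
run("mount", "/dev/rbd/rbd/xrootd-data", "/xrootd/data")
```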
14. Ceph and XRootD
● Current approach: CephFS + XRootD
○ All XRootD servers mount CephFS via the kernel client (see the mount sketch below)
■ Scale out is a breeze
○ Fully POSIX filesystem, integrates simply with existing infrastructure
● Problem: Users want to r/w to the filesystem directly via CephFS, but XRootD needs to own the files it serves
○ Permissions issues galore
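A minimal sketch of the kernel-client mount each XRootD server would perform; the monitor address, CephX user, and mount point are assumptions rather than the site’s actual values.
```python
# Minimal sketch: CephFS kernel-client mount on an XRootD server.
# Monitor address, CephX user, secret file path, and mount point are placeholders.
import subprocess

subprocess.run([
    "mount", "-t", "ceph",
    "mon1.example.org:6789:/",    # one monitor (or a comma-separated list)
    "/cephfs",                    # local mount point exported by XRootD
    "-o", "name=xrootd,secretfile=/etc/ceph/client.xrootd.secret",
], check=True)

# Every server mounting /cephfs serves the same namespace, so scaling out is
# just adding more mounts behind the XRootD redirector.
```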
15. Squashing with Ganesha NFS
● XRootD does not run in a privileged mode
○ Cannot modify/delete files written by users
○ Users can’t modify/delete files owned by XRootD
● How to allow users to read/write via FS mount?
● Using Ganesha to export CephFS as NFS and squash all users to the XRootD user (export sketch below)
○ Doesn’t prevent users from stomping on each other’s files, but works well enough in practice
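The export definition isn’t shown in the talk; the snippet below is a rough sketch of an NFS-Ganesha export for CephFS that squashes every client to an assumed xrootd uid/gid (rendered from Python simply to keep all examples in one language; export id, paths, and ids are placeholders).
```python
# Rough sketch of an NFS-Ganesha export of CephFS with all-squash to the
# xrootd account. Export id, pseudo path, and uid/gid are assumptions.
from pathlib import Path

XROOTD_UID = 10940   # placeholder: uid/gid the xrootd daemon actually runs as
XROOTD_GID = 10940

export_block = f"""
EXPORT {{
    Export_Id = 1;
    Path = "/";                       # CephFS root (or a subdirectory)
    Pseudo = "/cephfs";
    Access_Type = RW;
    Squash = All_Squash;              # map every client uid/gid ...
    Anonymous_Uid = {XROOTD_UID};     # ... to the xrootd user, so users and
    Anonymous_Gid = {XROOTD_GID};     #     XRootD agree on file ownership
    FSAL {{
        Name = CEPH;                  # Ganesha's CephFS backend
    }}
}}
"""

Path("/etc/ganesha/ganesha.conf").write_text(export_block)
```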
16. Transfers from CERN to Chicago
● Using Ceph as the backend store for data from the LHC
● Analysis input data sets for regional physics analysis
● Easily obtain 200 MB/s from Geneva to our Ceph storage system in Chicago
18. Potential evaluations
● XRootD with librados plugin
○ Skip the filesystem, write directly to the object store
○ XRootD handles POSIX filesystem semantics as a pseudo-MDS
○ Three ways of accessing:
■ Directly access files via XRootD clients
■ Mount XRootD via FUSE client
■ LD_PRELOAD hook to intercept system calls to /xrootd
20. Ceph and the batch system
● Goal: Run Ceph and user analysis jobs on the same machines
● Problem: Poorly defined jobs can wreak havoc on the Ceph cluster
○ e.g., machine starts heavily swapping, OOM killer starts killing random processes including OSDs, load spikes to hundreds, etc.
21. Ceph and the batch system
● Solution: control groups (cgroups)
● Configured the batch system (HTCondor) to use cgroups to limit the amount of CPU/RAM used on a per-job basis (config sketch below)
● We let HTCondor scavenge about 80% of the cycles
○ May need to be tweaked as our Ceph usage increases.
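The exact HTCondor settings aren’t in the slides; the sketch below is one plausible way to express “enforce per-job cgroup limits and only advertise about 80% of each node,” with all values being guesses rather than the site’s configuration (again written from Python to keep a single example language).
```python
# Sketch of HTCondor knobs for co-locating jobs with Ceph OSDs: cgroup
# enforcement plus a partitionable slot exposing ~80% of the machine.
# Every value here is illustrative.
from pathlib import Path

condor_config = """
# Track and confine each job in its own cgroup; kill jobs that exceed memory.
BASE_CGROUP = htcondor
CGROUP_MEMORY_LIMIT_POLICY = hard

# One partitionable slot advertising ~80% of cores and memory; the remaining
# ~20% stays free for the Ceph OSD daemons running on the same host.
NUM_SLOTS = 1
NUM_SLOTS_TYPE_1 = 1
SLOT_TYPE_1 = cpus=80%, memory=80%
SLOT_TYPE_1_PARTITIONABLE = TRUE
"""

Path("/etc/condor/config.d/50-ceph-coexist.conf").write_text(condor_config)
```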
22. Ceph and the batch system
● Working well thus far.
23. Ceph and the batch system
● Further work in this area:
○ Need to configure the batch system to immediately kill jobs when Ceph-related load goes up
■ e.g., disk failure
○ Re-nice OSDs to maximum priority
○ May require investigation into limiting network saturation
25. ATLAS Event Service
● Deliver single ATLAS events for processing
○ Rather than a complete dataset - “fine grained”
● Able to efficiently fill opportunistic resources like AWS instances (spot pricing), semi-idle HPC clusters, BOINC
● Can be evicted from resources immediately with negligible loss of work
● Output data is streamed to remote object storage
26. ATLAS Event Service
● Rather than pay for S3, RadosGW fits this use case perfectly (upload sketch below)
● Colleagues at Brookhaven National Lab have deployed a test instance already
○ interested in providing this service as well
○ could potentially federate gateways
● Still in the pre-planning stage at our site
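As a minimal illustration of why RadosGW is a drop-in replacement here, the sketch below uploads an Event Service output object to an RGW endpoint over the plain S3 protocol using boto3; the endpoint, credentials, bucket, and key are invented.
```python
# Minimal sketch: push an Event Service output object to a RadosGW endpoint
# via the S3 API. Endpoint, credentials, bucket, and key are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://rgw.example.org:7480",   # RadosGW, not Amazon
    aws_access_key_id="EVENTSERVICE_KEY",
    aws_secret_access_key="EVENTSERVICE_SECRET",
)

s3.create_bucket(Bucket="atlas-eventservice")
s3.upload_file(
    Filename="/tmp/evnt.0001.output.root",
    Bucket="atlas-eventservice",
    Key="task1234/evnt.0001.output.root",
)
```
Because RGW speaks the standard S3 API, the same client code can point at AWS or at a local gateway, which is also what makes federating gateways across sites plausible.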
28. Final thoughts
● Overall, quite happy with Ceph
○ Storage endpoint should be in production soon
○ More nodes on the way: plan to expand to 2 PB
● Looking forward to new CephFS features like quotas, offline fsck, etc.
● Will be experimenting with Ceph pools shared between data centers with low-RTT ping in the near future
● Expect Ceph to play an important role in ATLAS data processing ⇒ new discoveries