The document discusses IBM Spectrum Scale's unified file and object access feature. It allows data to be accessed as both files and objects within the same namespace, without data copies. This enables use cases such as running analytics directly on object data with Hadoop/Spark, with no data movement, and publishing analytics results back as objects. The feature supports common user authentication for both file and object access and flexible identity management modes. A demo shows uploading a file as an object, running analytics on it, and downloading the result as an object.
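As a rough illustration of the in-place analytics idea, the hedged PySpark sketch below reads an ingested object directly from a hypothetical fileset path (/ibm/gpfs0/obj/container1, assumed to be mounted on every worker) and writes the result back to the same path. The file name and mount point are placeholders, not from the original deck.

```python
# Minimal sketch of in-place analytics on data ingested as an object.
# Assumes (hypothetically) that the object container maps to the fileset path
# /ibm/gpfs0/obj/container1 on every Spark worker node; adjust to your layout.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("in-place-object-analytics").getOrCreate()

# The object "sensor_log.txt" was uploaded through the object interface but is
# readable here as an ordinary file -- no copy into HDFS is needed.
lines = spark.read.text("file:///ibm/gpfs0/obj/container1/sensor_log.txt")

word_counts = (
    lines.select(F.explode(F.split(F.col("value"), r"\s+")).alias("word"))
         .groupBy("word")
         .count()
         .orderBy(F.desc("count"))
)

# Write the result back under the same fileset, where it can be downloaded
# again as an object through the object interface.
word_counts.write.mode("overwrite").csv("file:///ibm/gpfs0/obj/container1/wordcount_result")
```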
Spectrum Scale Unified File and Object with WAN Caching - Sandeep Patil
This document provides an overview of IBM Spectrum Scale's Active File Management (AFM) capabilities and use cases. AFM uses a home-and-cache model to cache data from a home site at local clusters for low-latency access. It expands GPFS' global namespace across geographical distances and provides automated namespace management. The document discusses AFM caching basics, global sharing, use cases like content distribution and disaster recovery. It also provides details on Spectrum Scale's protocol support, unified file and object access, using AFM with object storage, and configuration.
Analytics with unified file and object - Sandeep Patil
This presentation takes you through one way to achieve in-place Hadoop-based analytics for your file and object data. It also gives an example of storage integration with cloud cognitive services.
In Place Analytics For File and Object Data - Sandeep Patil
The document discusses IBM Spectrum Scale's unified file and object access feature. It introduces Spectrum Scale and its support for file and object access. The unified file and object access feature allows data to be accessed as both files and objects without copying, through a single management plane. Use cases like in-place analytics for object data and common identity management across file and object access are enabled. A demo is presented where a file is uploaded as an object, analytics is run on it, and the result downloaded as an object, without data movement.
Spectrum Scale - Diversified analytic solution based on various storage servi... - Wei Gong
These slides describe diversified analytic solutions based on Spectrum Scale with various deployment modes, such as storage-rich servers, shared storage, IBM DeepFlash 150, and Elastic Storage Server. They take a deep dive into several advanced data management features and solutions for BD&A workloads derived from Spectrum Scale.
Hadoop and Spark Analytics over Better Storage - Sandeep Patil
This document discusses using IBM Spectrum Scale to provide a colder storage tier for Hadoop & Spark workloads using IBM Elastic Storage Server (ESS) and HDFS transparency. Some key points discussed include:
- Using Spectrum Scale to federate ESS with existing HDFS or Spectrum Scale filesystems, allowing data to be seamlessly accessed even if moved to the ESS tier.
- Extending HDFS across multiple HDFS and Spectrum Scale clusters without needing to move data using Spectrum Scale's HDFS transparency connector.
- Integrating ESS tier with Spectrum Protect for backup and Spectrum Archive for archiving to take advantage of their policy engines and automation.
- Examples of using the unified storage for analytics workflows, life
Ozone: Evolution of HDFS scalability & built-in GDPR compliance - Dinesh Chitlangia
This talk was delivered at ApacheCON, Las Vegas USA, September 2019.
Audio Recording: https://feathercast.apache.org/2019/09/12/ozone-evolving-hdfs-scalability-to-new-heights-built-in-gdpr-compliance-dinesh-chitlangia/
Speakers:
Dinesh Chitlangia: https://www.linkedin.com/in/dineshchitlangia/
Ajay Kumar aka Ajay Yadav: https://www.linkedin.com/in/ajayydv/
Abstract:
https://www.apachecon.com/acna19/s/#/scheduledEvent/1176
Apache Hadoop Ozone is a robust, distributed key-value object store for Hadoop with a layered architecture and strong consistency. It separates namespace management from the block and node management layer, which allows users to scale independently on both axes. Ozone is interoperable with the Hadoop ecosystem: it provides OzoneFS (a Hadoop-compatible file system API), data locality, and plug-and-play deployment with HDFS, since it can be installed in an existing Hadoop cluster and can share storage disks with HDFS. Ozone addresses HDFS's scalability challenges by being size agnostic; consequently, users can store trillions of files in Ozone and access them as if they were on HDFS. Ozone plugs into existing Hadoop deployments seamlessly, and programs like YARN, MapReduce, Spark, and Hive work without any modifications. In an era of growing data privacy needs and regulations, Ozone also aims to provide built-in support for GDPR compliance, with a strong focus on the Right to be Forgotten, i.e., data erasure. At the end of this presentation the audience will be able to understand: 1. Overview of current challenges with HDFS scalability 2. How Ozone's architecture solves these challenges 3. Overview of GDPR 4. Built-in support for GDPR in Ozone
The document discusses scaling challenges with HDFS and proposed solutions from Hortonworks called HDDS and Ozone. HDFS scales well for data and IO but has limitations scaling the namespace beyond 500 million files. HDDS aims to scale the block layer using block containers which can reduce block reports. Ozone uses a flat key-value namespace that is easier to shard and scale beyond billions of objects compared to HDFS hierarchical namespace. It also provides an HDFS compatible filesystem called OzoneFS. Together HDDS and Ozone aim to retain HDFS features while scaling to exabytes of data and trillions of files.
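For context, here is a hedged PySpark sketch of what "access them as if they are on HDFS" can look like through OzoneFS. The ofs:// host, volume, bucket, and field names are placeholders, and the Ozone filesystem jar is assumed to be on the Spark classpath.

```python
# Hypothetical sketch of reading Ozone data through its Hadoop-compatible
# filesystem interface from PySpark. Host, volume, bucket and column names
# below are placeholders for your deployment.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ozone-read").getOrCreate()

# Ozone keys under /vol1/bucket1 appear as files, so an existing Spark job
# works unchanged once the URI scheme is switched from hdfs:// to ofs://.
df = spark.read.json("ofs://ozone-om.example.com/vol1/bucket1/events/")
df.groupBy("eventType").count().show()
```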
Running secured Spark job in Kubernetes compute cluster and integrating with ... - DataWorks Summit
This presentation will provide technical design and development insights for running a secured Spark job in a Kubernetes compute cluster that accesses job data from a Kerberized HDFS cluster. Joy will show how to run a long-running machine learning or ETL Spark job in Kubernetes and how to access data from HDFS using a Kerberos Principal and Delegation Token.
The first part of this presentation covers the design and best practices for deploying and running Spark in Kubernetes integrated with HDFS: creating an on-demand multi-node Spark cluster during job submission, installing and resolving software dependencies (packages), executing and monitoring the workload, and finally disposing of the resources when the job completes. The second part covers the design and development details to set up a Spark+Kubernetes cluster that supports long-running jobs accessing data from secured HDFS storage by creating and renewing Kerberos delegation tokens seamlessly from the end user's Kerberos Principal.
All the techniques covered in this presentation are essential in order to set up a Spark+Kubernetes compute cluster that accesses data securely from distributed storage cluster such as HDFS in a corporate environment. No prior knowledge of any of these technologies is required to attend this presentation.
Speaker
Joy Chakraborty, Data Architect
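The sketch below is an illustrative (not the speaker's) PySpark configuration for the setup the abstract describes: a Spark driver targeting a Kubernetes master and a Kerberized HDFS cluster. The image name, namespace, principal, keytab path, and HDFS URIs are placeholders, and the exact spark.kubernetes.* / spark.kerberos.* properties needed depend on the Spark version and cluster setup.

```python
# Illustrative sketch only: all names, hosts and paths are placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("secured-etl")
    .master("k8s://https://k8s-apiserver.example.com:6443")
    .config("spark.kubernetes.namespace", "analytics")
    .config("spark.kubernetes.container.image", "registry.example.com/spark:3.4.1")
    .config("spark.executor.instances", "4")
    # Spark obtains and renews an HDFS delegation token from this principal,
    # so long-running executors keep access to the Kerberized cluster.
    .config("spark.kerberos.principal", "etl-user@EXAMPLE.COM")
    .config("spark.kerberos.keytab", "/etc/security/keytabs/etl-user.keytab")
    .getOrCreate()
)

# Read from and write back to the Kerberized HDFS cluster.
df = spark.read.parquet("hdfs://secure-nn.example.com:8020/warehouse/clicks/")
df.groupBy("country").count().write.mode("overwrite").parquet(
    "hdfs://secure-nn.example.com:8020/tmp/clicks_by_country"
)
```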
Hadoop has some built-in data protection features like replication, snapshots, and trash bins. However, these may not be sufficient on their own. Hadoop data can still be lost due to software bugs or human errors. A well-designed data protection strategy for Hadoop should include diversified copies of valuable data both within and outside the Hadoop environment. This protects against data loss from both software and hardware failures.
How to Protect Big Data in a Containerized Environment - BlueData, Inc.
Every enterprise spends significant resources to protect its data. This is especially true in the case of big data, since some of this data may include sensitive or confidential customer and financial information. Common methods for protecting data include permissions and access controls as well as the encryption of data at rest and in flight.
The Hadoop community has recently rolled out Transparent Data Encryption (TDE) support in HDFS. Transparent Data Encryption refers to the process whereby data is transparently encrypted by the big data application writing the data; it is not decrypted again until it is accessed by another application. The data is encrypted during its entire lifespan—in transit and at rest—except when it is being specifically accessed by a processing application.
TDE is an excellent approach for protecting data stored in data lakes built on the latest versions of HDFS. However, it does have its challenges and limitations. Systems that want to use TDE require tight integration with enterprise-wide Kerberos Key Distribution Center (KDC) services and Key Management Systems (KMS). This integration isn’t easy to set up or maintain. These issues can be even more challenging in a virtualized or containerized environment where one Kerberos realm may be used to secure the big data compute cluster and a different Kerberos realm may be used to secure the HDFS filesystem accessed by this cluster.
BlueData has developed significant expertise in configuring, managing, and optimizing access to TDE-protected HDFS. This session at the Strata Data Conference in March 2018 (by Thomas Phelan, co-founder and chief architect at BlueData) offers a detailed overview of how transparent data encryption works with HDFS, with a particular focus on containerized environments.
You’ll learn how HDFS TDE is configured and maintained in an environment where many big data frameworks run simultaneously (e.g., in a hybrid cloud architecture using Docker containers). Moreover, you’ll learn how KDC credentials can be managed in a Kerberos cross-realm environment to provide data scientists and analysts with the greatest flexibility in accessing data while maintaining complete enterprise-grade data security.
https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63763
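As a rough companion to the TDE discussion, the following Python sketch wraps the standard Hadoop CLI steps for creating an encryption zone; the key name and paths are hypothetical, and it assumes a KMS is configured and a Kerberos ticket is already held.

```python
# A minimal sketch (not from the talk) of setting up an HDFS encryption zone.
import subprocess

def run(cmd):
    """Run a command, echoing it first, and fail loudly on error."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. Create an encryption key in the Hadoop KMS (key name is hypothetical).
run(["hadoop", "key", "create", "finance_key"])

# 2. Create an empty directory and mark it as an encryption zone using that key.
run(["hdfs", "dfs", "-mkdir", "-p", "/data/finance"])
run(["hdfs", "crypto", "-createZone", "-keyName", "finance_key", "-path", "/data/finance"])

# 3. Files written under /data/finance are now encrypted at rest transparently;
#    readers authorized for the key see plaintext, everyone else sees ciphertext.
run(["hdfs", "dfs", "-put", "payroll.csv", "/data/finance/"])
```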
A comprehensive overview of the security concepts in the open source Hadoop stack in mid 2015 with a look back into the "old days" and an outlook into future developments.
Most users know HDFS as the reliable store of record for big data analytics. HDFS is also used to store transient and operational data when working with cloud object stores, such as Microsoft Azure or Amazon S3, and on-premises object stores, such as Western Digital’s ActiveScale. In these settings, applications often manage data stored in multiple storage systems or clusters, requiring a complex workflow for synchronizing data between filesystems for business continuity planning (BCP) and/or supporting hybrid cloud architectures to achieve the required business goals for durability, performance, and coordination.
To resolve this complexity, HDFS-9806 has added a PROVIDED storage tier to mount external storage systems in the HDFS NameNode. Building on this functionality, we can now allow remote namespaces to be synchronized with HDFS, enabling asynchronous writes to the remote storage and the possibility to synchronously and transparently read data back to a local application wanting to access file data which is stored remotely. In this talk, which corresponds to the work in progress under HDFS-12090, we will present how the Hadoop admin can manage storage tiering between clusters and how that is then handled inside HDFS through the snapshotting mechanism and asynchronously satisfying the storage policy.
Speakers
Chris Douglas, Microsoft, Principal Research Software Engineer
Thomas Denmoor, Western Digital, Object Storage Architect
Seamless replication and disaster recovery for Apache Hive Warehouse - DataWorks Summit
As Apache Hadoop clusters become central to an organization’s operations, they have clusters in more than one data center. Historically, this has been largely driven by requirements of business continuity planning or geo localization. It has also recently been gaining a lot of interest from a hybrid cloud perspective, i.e. wherein people are trying to augment their traditional on-prem setup with cloud-based additions as well. A robust replication solution is a fundamental requirement in such cases.
Seamless disaster recovery has several challenges. Data, metadata, and transaction information need to be moved in sync. It should also be easy for the users and applications to reason about the state of the replica. The “hadoop scale” also brings unique challenges as bandwidth between clusters can be a limiting factor. The data transfer has to be minimized for replication, failover, as well as fail back scenarios.
In this talk we will discuss how the above challenges are addressed for supporting seamless replication and disaster recovery for Hive.
Speakers
Sankar Hariappan, Hortonworks, Staff Software Engineer
Anishek Agarwal, Hortonworks, Engineering Manager
This document provides summaries of various distributed file systems and distributed programming frameworks that are part of the Hadoop ecosystem. It summarizes Apache HDFS, GlusterFS, QFS, Ceph, Lustre, Alluxio, GridGain, XtreemFS, Apache Ignite, Apache MapReduce, and Apache Pig. For each one it provides 1-3 links to additional resources about the project.
HDFS Tiered Storage: Mounting Object Stores in HDFS - DataWorks Summit
Most users know HDFS as the reliable store of record for big data analytics. HDFS is also used to store transient and operational data when working with cloud object stores, such as Azure HDInsight and Amazon EMR. In these settings- but also in more traditional, on premise deployments- applications often manage data stored in multiple storage systems or clusters, requiring a complex workflow for synchronizing data between filesystems to achieve goals for durability, performance, and coordination.
Building on existing heterogeneous storage support, we add a storage tier to HDFS to work with external stores, allowing remote namespaces to be "mounted" in HDFS. This capability not only supports transparent caching of remote data as HDFS blocks, it also supports synchronous writes to remote clusters for business continuity planning (BCP) and supports hybrid cloud architectures.
This idea was presented at last year’s Summit in San Jose. Lots of progress has been made since then and the feature is in active development at the Apache Software Foundation on branch HDFS-9806, driven by Microsoft and Western Digital. We will discuss the refined design & implementation and present how end-users and admins will be able to use this powerful functionality.
The document discusses deploying Hadoop in the cloud. Some key benefits of using Hadoop in the cloud include scalability, automated failover of replicated data, and cost efficiency through distributed processing and storage. Microsoft's Azure HDInsight offering provides a fully managed Hadoop and Spark service in the cloud that allows clusters to be provisioned in minutes and is optimized for analytics workloads. The Cortana Intelligence Suite integrates big data technologies like HDInsight with machine learning and data processing tools.
How to Achieve a Self-Service and Secure Multitenant Data Lake in a Large Com... - DataWorks Summit
This document discusses authentication, authorization, and application integration considerations for a large company implementing a self-service and secure multitenant data lake on Hadoop. It describes three approaches to integrating the data lake with the company's existing identity and access management system and evaluates the tradeoffs of each. It also examines options for authorization controls in Hadoop, methods for applications to authenticate to the data lake, and how applications can access data impersonating user permissions. The goal is to provide analytics capabilities to users while maintaining security, compliance, and governance.
HPE Hadoop Solutions - From use cases to proposal - DataWorks Summit
Hadoop now does a lot more than storage and Map/Reduce, and it keeps improving and innovating. It brings near-real-time, interactive, and cost-efficient capabilities to Big Data.
Join us to hear about solutions based on Hadoop, how they respond to specific customer needs, which components from the Hadoop ecosystem they use, and which HPE Reference Architectures for the platform they are based on.
Hadoop solutions such as ETL offloading, predictive analytics, ad hoc query, complex event processing, stream processing, search, machine learning, deep learning, and more.
Based on software components such as Spark, Hive, HBase, Kafka, Storm, Flume, Impala, and Elasticsearch.
Speaker
John Osborn, SA, Hewlett Packard Enterprise
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics - DataWorks Summit
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics - Apache Spark's in-memory capabilities catapulted it to become the premier processing framework for Hadoop. Apache Ignite and Alluxio, both high-performance, integrated, and distributed in-memory platforms, take Apache Spark to the next level by providing an even more powerful, faster, and more scalable platform for the most demanding data processing and analytics environments.
Speaker
Irfan Elahi, Consultant, Deloitte
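A small, hedged sketch of the Alluxio side of that comparison: Spark reading a dataset through the alluxio:// scheme. The master address and path are placeholders, and the Alluxio client jar is assumed to be on the Spark classpath.

```python
# Hedged sketch: reading a dataset cached in Alluxio from Spark.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("alluxio-read").getOrCreate()

# Hot data is served from Alluxio's memory tiers; the same code works against
# hdfs:// or s3a:// paths, which is what makes the comparison interesting.
df = spark.read.csv(
    "alluxio://alluxio-master.example.com:19998/datasets/trips.csv",
    header=True,
)
print(df.count())
```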
Bare-metal performance for Big Data workloads on Docker containers - BlueData, Inc.
In a benchmark study, Intel® compared the performance of Big Data workloads running on a bare-metal deployment versus running in Docker* containers with the BlueData® EPIC™ software platform.
This in-depth study shows that performance ratios for container-based Hadoop workloads on BlueData EPIC are equal to — and in some cases, better than — bare-metal Hadoop. For example, benchmark tests showed that the BlueData EPIC platform demonstrated an average 2.33% performance gain over bare metal, for a configuration with 50 Hadoop compute nodes and 10 terabytes (TB) of data. These performance results were achieved without any modifications to the Hadoop software.
This is a revolutionary milestone, and the result of an ongoing collaboration between Intel and BlueData software engineering teams.
This white paper describes the software and hardware configurations for the benchmark tests, as well as details of the performance benchmark process and results.
How to deploy Apache Spark in a multi-tenant, on-premises environment - BlueData, Inc.
Adoption of Apache Spark in the enterprise is increasing rapidly - it's become one of the fastest growing and most popular technologies in the Big Data ecosystem.
However, implementing an enterprise-ready, on-premises Spark deployment can be very complex and it requires expertise that is generally not available to all.
BlueData makes it easier to deploy Apache Spark on-premises. With BlueData, you can spin up virtual Spark clusters within minutes – providing secure, self-service, on-demand access to Big Data analytics and infrastructure. You can deploy Spark in standalone mode or with Hadoop / YARN. You can also build analytical pipelines and create Spark clusters using our RESTful APIs, and use web-based Zeppelin notebooks for interactive data analytics.
BlueData’s software platform leverages virtualization and Docker containers – combined with our own patent-pending innovations – to make it faster, and more cost-effective for enterprises to get up and running with a multi-tenant Spark deployment on-premises.
Learn more at www.bluedata.com
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes - DataWorks Summit
The document discusses scaling HDFS to manage billions of files through distributed storage schemes. It outlines the current HDFS architecture and challenges with namespace and block scaling. It proposes a storage container architecture with distributed block maps and a storage container manager to address these challenges. This would allow HDFS to easily scale to manage trillions of blocks and billions of files across large clusters.
This document discusses security features in Apache Kafka including SSL for encryption, SASL/Kerberos for authentication, authorization controls using an authorizer, and securing Zookeeper. It provides details on how these security components work, such as how SSL establishes an encrypted channel and SASL performs authentication. The authorizer implementation stores ACLs in Zookeeper and caches them for performance. Securing Zookeeper involves setting ACLs on Zookeeper nodes and migrating security configurations. Future plans include moving more functionality to the broker side and adding new authorization features.
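To make the moving parts concrete, here is a hedged producer sketch using confluent-kafka (librdkafka) with the security features described above; the broker address, CA bundle path, and Kerberos service name are placeholders for your environment.

```python
# Hedged sketch of a Kafka producer using TLS encryption plus SASL/Kerberos
# authentication; the broker's authorizer then checks ACLs for this principal.
from confluent_kafka import Producer

conf = {
    "bootstrap.servers": "broker1.example.com:9093",
    # TLS for encryption of data in flight.
    "security.protocol": "SASL_SSL",
    "ssl.ca.location": "/etc/pki/tls/certs/ca-bundle.crt",
    # Kerberos (GSSAPI) for authentication.
    "sasl.mechanism": "GSSAPI",
    "sasl.kerberos.service.name": "kafka",
}

producer = Producer(conf)
producer.produce("audit-events", key=b"user42", value=b'{"action":"login"}')
producer.flush()
```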
As Hadoop becomes a critical part of Enterprise data infrastructure, securing Hadoop has become critically important. Enterprises want assurance that all their data is protected and that only authorized users have access to the relevant bits of information. In this session we will cover all aspects of Hadoop security including authentication, authorization, audit and data protection. We will also provide demonstration and detailed instructions for implementing comprehensive Hadoop security.
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad... - DataWorks Summit
Businesses often have to interact with different data sources to get a unified view of the business or to resolve discrepancies. These EDW data repositories are often large and complex, are business critical, and cannot afford downtime. This session will share best practices and lessons learned for building a Data Fabric on Spark / Hadoop / Hive / NoSQL that provides a unified view, enables simplified access to the data repositories, resolves technical challenges, and adds business value.
Ibm spectrum scale_backup_n_archive_v03_ash - Ashutosh Mate
IBM Spectrum Scale can be used as both the source and destination for backup and archiving. As a source, Spectrum Scale data can be backed up to products like Spectrum Protect, Spectrum Archive, and third-party backup software. As a destination, Spectrum Protect can use Spectrum Scale and ESS storage for storing backed up or archived data, providing scalability, performance, and cost benefits over other solutions. Case studies demonstrate how large enterprises and regional hospital networks have consolidated backup infrastructure and improved availability, capacity, and backup/restore speeds by combining Spectrum Scale and Spectrum Protect.
From archive to insight debunking myths of analytics on object stores - Dean Hildebrand
This document summarizes four common myths about using object stores for analytics and debunks each one. It discusses that data does not need to migrate between Swift and HDFS, object stores can support frameworks beyond just in-memory analytics, they can efficiently support frameworks that require appending to files like Hive and HBase, and object stores are not inherently slow for analytics when used with Swift-on-File. The document demonstrates Swift-on-File, which allows objects stored in Swift to be accessed as files, avoiding unnecessary data movement and enabling analytics in place directly on the object store.
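A minimal, hedged sketch of the Swift-on-File idea: write an object through the Swift API, then read the same bytes back as a file. The Keystone endpoint, credentials, and the assumed fileset layout under /mnt/gpfs0/swift are placeholders.

```python
# Illustrative sketch of the "no data movement" idea behind Swift-on-File.
from swiftclient.client import Connection

conn = Connection(
    authurl="https://keystone.example.com:5000/v3",
    user="analyst",
    key="secret",
    auth_version="3",
    os_options={
        "project_name": "analytics",
        "user_domain_name": "Default",
        "project_domain_name": "Default",
    },
)

# Object PUT through the normal Swift interface.
conn.put_object("logs", "2016/04/run1.csv", contents=b"ts,value\n1,42\n")

# With Swift-on-File the object is laid out as a real file under the
# container's directory, so file-based tools (or Hadoop jobs) can read it in
# place -- the path below is an assumed example layout, not a fixed rule.
with open("/mnt/gpfs0/swift/AUTH_analytics/logs/2016/04/run1.csv") as f:
    print(f.read())
```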
The document discusses IBM's Watson IoT platform and how it can be used from device connectivity to analytics. It provides an overview of the different phases of using the platform from try/dev to managing services. It also discusses how the platform allows composing applications using tools like Node-RED and integrating various cloud services for analytics, security and more. Industry solutions and the value of IBM's IoT platform from connecting assets to optimizing operations and innovating new business models is highlighted.
Software Defined Analytics with File and Object Access Plus Geographically Di... - Trishali Nayar
Introduction to Spectrum Scale Active File Management (AFM) and its use cases. Spectrum Scale Protocols - Unified File & Object Access (UFO) feature details. AFM + Object: unique WAN caching for object store.
IBM Spectrum Scale for File and Object Storage - Tony Pearson
This document provides information about a technical university presentation on IBM Spectrum Scale for file and object storage given by Tony Pearson. The presentation schedule lists topics such as software defined storage, converged and hyperconverged environments, big data architectures, and IBM storage integration with OpenStack. The document discusses challenges of islands of block, file, and object level data and how IBM Spectrum Scale provides a single global namespace and universal data access across various protocols. It describes features of IBM Spectrum Scale such as extreme scalability, high performance, reliability, and supported topologies.
This document summarizes the top 6 advantages of using Cleversafe for object storage: scalability up to over 100PB deployed by customers, encryption providing government-grade security without risk of data breach from single disk/node/site failures, availability with no downtime during upgrades or hardware/disk/node/site failures, manageability needing no RAID or replication management even with petabytes of storage, efficiency using less storage, power and space for the lowest total cost of ownership, and reliability of 9 nines with on-premises, hybrid and public cloud options available from day one. It also notes that Cleversafe is the #1 ranked solution for unstructured data in Gartner reports.
Ibm spectrum scale fundamentals workshop for americas part 8 spectrumscale ba... - xKinAnx
The document provides an overview of key concepts covered in a GPFS 4.1 system administration course, including backups using mmbackup, SOBAR integration, snapshots, quotas, clones, and extended attributes. The document includes examples of commands and procedures for administering these GPFS functions.
IBM Spectrum Storage Family is a suite of storage management and optimization software including IBM Spectrum Control, IBM Spectrum Protect, IBM Spectrum Archive, IBM Spectrum Virtualize, IBM Spectrum Accelerate, and IBM Spectrum Scale. These solutions provide analytics-driven hybrid cloud data management, optimized hybrid cloud data protection, fast data retention, virtualization of mixed block environments, enterprise block storage for hybrid cloud, high-performance scalable storage for unstructured data, and flexible scalable hybrid cloud object storage.
IBM Object Storage and Software Defined Solutions - Cleversafe - Diego Alberto Tamayo
Digital Content Growth
• Continued growth in graphical content creation
• Multi-device and HD/4K/8K make it even more challenging to store and process data
• Time & location shifting: viewing on individual schedule
• Content life-cycle management provides a balance between cost & performance while maintaining customer experience
• Metadata availability and access
• Digital disruption with Over The Top (OTT) digital-only competitors - new business models
• OTT viewing will grow from 3.4% of TV viewing hours in 2013 to 20.4% by 2017 in NA
• 63% stream on-demand media more than weekly
• Security and data protection are big issues
• File sharing declines with legal on-demand options
• Connection speeds are increasing, making new options possible
IBM Spectrum Scale provides unified file and object access, allowing data to be ingested and stored as either files or objects and accessed via both file and object interfaces. Key capabilities include a single global namespace for files and objects, automatic placement of data on optimal storage tiers, ability to analyze data in place without copying or moving data, and support for both legacy file applications and new object-based workloads and data stores.
IBM Streams V4.2 Submission Time Fusion and Configuration - lisanl
Brad Fawcett, Queenie Ma, and Mary Komor are developers with IBM Streams. In their presentation, they cover the new Submission Time Fusion and Configuration support available in IBM Streams V4.2.
IBM Cloud Object Storage provides flexible, scalable, and simple storage designed for today's data challenges. It offers hybrid cloud storage options that can be deployed both on-premise and off-premise. Key benefits include lower total cost of ownership compared to traditional storage, massive scalability across IBM's global network, and unified management. IBM Cloud Object Storage is used by organizations across industries for various use cases including backup, archive, content management, and more.
Non-Blocking Checkpointing for Consistent Regions in IBM Streams V4.2 - lisanl
Fang Zheng is a developer with IBM Streams. In his presentation, Fang describes the enhancements related to consistent regions that are available in IBM Streams V4.2.
This document provides an introduction and disclaimer for an IBM Streams presentation. It notes that the information is provided as-is without warranty and is subject to change. It directs the reader to several IBM Streams resources including the Streams developer website, GitHub organization, tutorials, an online SPL course, and a water conservation starter kit. Contact information is provided for any questions about the presentation.
Introduction to IBM Spectrum Scale and Its Use in Life Science - Sandeep Patil
IBM Spectrum Scale is a scalable file system that can be used to support life science research. It provides high scalability, high availability, and a software read cache called Local Read Only Cache (LROC) that uses SSDs to improve performance. The University of Basel uses Spectrum Scale in their scientific computing and storage infrastructure to support various research areas including bioinformatics, structural biology, and hosting reference services. It provides features such as cluster file systems, data migration, hierarchical storage management, encryption, and disaster recovery between two sites using asynchronous file migration.
Installation and Setup for IBM InfoSphere Streams V4.0 - lisanl
Laurie Williams is the Installation component lead on the InfoSphere Streams developement team. Her presentation describes the installation and setup of IBM InfoSphere Streams V4.0 in a multi-host environment.
View related presentations and recordings from the Streams V4.0 Developers Conference at:
https://developer.ibm.com/answers/questions/183353/ibm-infosphere-streams-40-developers-conference-on.html?smartspace=streamsdev
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program - inside-BigData.com
In this video from the DDN User Group at SC16, Sven Oehme Chief Research Strategist, IBM, presents "Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program."
Watch the video presentation: http://wp.me/p3RLHQ-g52
Sign up for our insideHPC Newsletter: http://wp.me/p3RLHQ-g52
Introducing IBM Spectrum Scale 4.2 and Elastic Storage Server 3.5 - Doug O'Flaherty
The document discusses IBM Spectrum Scale, a software-defined storage product. It provides a unified file and object storage system with integrated analytics support. New features in Spectrum Scale 4.2 and Elastic Storage Server 3.5 include reducing costs through compression and quality-of-service policies, accelerating analytics with native HDFS support, and simplifying deployment with new graphical user interfaces.
This document provides an overview of object storage. It defines object storage and when it is applicable, such as for unstructured data workloads over 100TB, distributed access to content, data archiving, and non-high performance applications. It describes how object storage uses metadata and a flat organization without directories. Examples of use cases like media storage, content stores, data analytics, private clouds, and backup/archive are listed. Characteristics of object storage systems and how they are built from clusters of storage nodes are also covered.
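To ground the definition, here is a hedged Python sketch of the basic object-storage access pattern (a flat bucket/key namespace plus user metadata) against any S3-compatible endpoint; the endpoint, credentials, bucket, and key are placeholders.

```python
# Minimal, hedged sketch of object-storage access: no directories, just keys
# in a flat namespace, with user-defined metadata attached to each object.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.com",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# "media/2017/clip.mp4" is simply a key; the slashes are part of the name.
s3.put_object(
    Bucket="archive",
    Key="media/2017/clip.mp4",
    Body=b"example payload",
    Metadata={"camera": "cam-07", "retention": "7y"},  # user-defined metadata
)

obj = s3.head_object(Bucket="archive", Key="media/2017/clip.mp4")
print(obj["Metadata"])
```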
BIOIT14: Deploying very low cost cloud storage technology in a traditional re... - Dirk Petersen
When implementing storage chargebacks, we wanted to offer researchers an alternative storage solution that would not cost more than AWS Glacier. We also wanted it to be long-term durable, self-protecting, easy to manage, able to store petabytes, survive the loss of an entire data center, and deliver predictable performance. Learn how to avoid pitfalls and determine whether a solution like this makes sense for your organization.
Elastic storage in the cloud session 5224 final v2 - BradDesAulniers2
IBM Spectrum Scale (formerly Elastic Storage) provides software defined storage capabilities using standard commodity hardware. It delivers automated, policy-driven storage services through orchestration of the underlying storage infrastructure. Key features include massive scalability up to a yottabyte in size, built-in high availability, data integrity, and the ability to non-disruptively add or remove storage resources. The software provides a single global namespace, inline and offline data tiering, and integration with applications like HDFS to enable analytics on existing storage infrastructure.
SoftLayer Storage Services Overview (for Interop Las Vegas 2015) - Michael Fork
Introduction to SoftLayer's Storage Services. Topics covered include Block and File offerings (Endurance, Performance), Mass Storage Servers (QuantaStor), Backup (EVault, R1Soft), Object Storage (OpenStack Swift), CDN, Data Transfer Service, and Aspera.
IBM Cloud Object Storage System (powered by Cleversafe) and its Applications - Tony Pearson
This document discusses IBM's Cloud Object Storage System, which was acquired from Cleversafe. It provides a scalable and cost-effective solution for storing large amounts of unstructured data, such as photos, videos, and research files. The system uses erasure coding to split data into slices and distribute them across commodity hardware for high availability and reliability without proprietary equipment. It can be deployed as software, pre-built appliances, or as a cloud service on IBM SoftLayer infrastructure.
In this session, we’ll focus exclusively on OpenStack Swift, OpenStack’s object store capability. We’ll review the architecture, use cases, deployment strategies and common obstacles as we “open up the covers” on this exciting element of the OpenStack architecture.
Design - Building a Foundation for Hybrid Cloud Storage - LaurenWendler
Building a foundation for hybrid cloud storage involves using object storage as a key building block and software defined storage to provide compatible infrastructure across on-premises and cloud environments. This allows for workflow mobility across locations by overcoming data gravity, seamless operations, strong data security, and flexibility in infrastructure choice. Object storage is designed to handle the growing amounts of unstructured data and needs of web-scale applications, while hybrid cloud storage integrates on-premises and cloud storage through methods like using cloud storage as remote storage, replicating and mirroring data between locations, and providing consistent APIs and workload management across environments.
Case Study: Implementing Hadoop and Elastic Map Reduce on Scale-out Object S...Cloudian
This document discusses implementing Hadoop and Elastic MapReduce on Cloudian's scale-out object storage platform. It describes Cloudian's hybrid cloud storage capabilities and how their approach reduces costs and provides faster analytics by analyzing log and event data directly on their storage platform without needing to transform the data for HDFS. Key benefits highlighted include no redundant storage, scaling analytics with storage capacity by adding nodes, and taking advantage of multi-core CPUs for MapReduce tasks.
Webinar: What Your Object Storage Vendor Isn’t Telling You About NFS SupportStorage Switzerland
NFS has been the “go to” file system for large data stores but there is a new offering on the horizon…Object Storage. To help ease the transition, many Object Storage vendors have provided a gateway that allows their systems to look like NFS servers. The problem is that most of these implementations are very limiting and often create more problems than they fix. In this webinar Storage Switzerland and Caringo discuss why object storage systems are the heir apparent to NFS servers and how to make that transition without the typical roadblocks that NFS gateways create.
Attend this webinar to learn:
* What is Object Storage
* Object Storage vs. NFS
* Challenges of Object Storage without NFS
* How NFS makes Object Storage more useful
* Architecture of a typical NFS gateway
* The Challenges of Hosting NFS on an Object Store
Integrating On-premises Enterprise Storage Workloads with AWS (ENT301) | AWS ...Amazon Web Services
AWS gives designers of enterprise storage systems a completely new set of options. Aimed at enterprise storage specialists and managers of cloud-integration teams, this session gives you the tools and perspective to confidently integrate your storage workloads with AWS. We show working use cases, a thorough TCO model, and detailed customer blueprints. Throughout we analyze how data-tiering options measure up to the design criteria that matter most: performance, efficiency, cost, security, and integration.
#MFSummit2016 Operate: The race for spaceMicro Focus
The Race for Space: File Storage Challenges and Solutions Facing escalating storage requirements? Being held to ransom by your vendors? Would secure, scalable, highly-available and cost-effective file storage that works with your current infrastructure help? Micro Focus and SUSE could help. Presenters: David Shepherd, Solutions Consultant, Micro Focus and Stephen Mogg, Solutions Consultant SUSE
Se training storage grid webscale technical overviewsolarisyougood
The document provides an overview of StorageGRID Webscale, an object storage solution from NetApp. It discusses key concepts including how StorageGRID Webscale uses a distributed architecture with different node types to provide a global object namespace and scale to support billions of objects and petabytes of storage. The document also describes how StorageGRID Webscale leverages extensive metadata and policy-driven management to intelligently distribute and tier data across storage pools.
Introduction to types of cloud storage and overview and comparison of the SoftLayer Storage Services. Topics covered include Block and File offerings "Codename: Prime", Consistent Performance, Mass Storage Servers (QuantaStor), Backup (EVault, R1Soft), Object Storage (OpenStack Swift), CDN, Data Transfer Service, and Aspera.
Big Data Architecture Workshop - Vahid Amiridatastack
Big Data Architecture Workshop
This slide deck covers big data tools, technologies, and layers that can be used in enterprise solutions.
TopHPC Conference
2019
IBM Cloud Object Storage System, presented Oct 16, 2017 at IBM Systems Technical University in New Orleans, LA. This covers IBM's object storage offering, which came from the acquisition of Cleversafe and was formerly known as the DSnet product.
John Readey presented on HDF5 in the cloud using HDFCloud. HDF5 can provide a cost-effective cloud infrastructure by paying for what is used rather than what may be needed. HDFCloud uses an HDF5 server to enable accessing HDF5 data through a REST API, allowing users to access large datasets without downloading entire files. It maps HDF5 objects to cloud object storage for scalable performance and uses Docker containers for elastic scaling.
Cloud computing UNIT 2.1 presentation inRahulBhole12
Cloud storage allows users to store files online through cloud storage providers like Apple iCloud, Dropbox, Google Drive, Amazon Cloud Drive, and Microsoft SkyDrive. These providers offer various amounts of free storage and options to purchase additional storage. They allow files to be securely uploaded, accessed, and synced across devices. The best cloud storage provider depends on individual needs and preferences regarding storage space requirements and features offered.
NetApp Se training storage grid webscale technical overviewsolarisyougood
The document provides an overview of StorageGRID Webscale, an object storage platform from NetApp. It discusses key concepts such as object storage, metadata management, and StorageGRID's dynamic policy engine. The policy engine uses metadata and user-defined rules to intelligently place and manage objects across multiple sites, storage tiers, and protocols (e.g. S3) over their lifecycle. This allows building complex data management policies without impacting performance or capacity.
Storage Made Easy - File Fabric Use CasesHybrid Cloud
The File Fabric provides a multi-cloud solution for on-site and on-cloud data and can be used for solutions as diverse as data governance and compliance through to Big Data / Object Storage use cases.
2. #ibmedge
Agenda
• Introduction to Spectrum Scale
• Introduction to Spectrum Scale Analytics
• Introduction to Spectrum Scale Object Store
• Unified File & Object Access (UFO) Feature Details
• Use Cases Enabled By UFO
• Deep Dive of In-Place Analytics Use Case
• Demo
• Q & A
1
16. #ibmedge
GPFS-FPO Advanced Storage for Map Reduce Data
15
Hadoop HDFS:
• NameNode is a single point of failure
• Large block sizes – poor support for small files
• Non-POSIX file system – obscure commands
• Difficult to ingest data – special tools required
• Single-purpose, Hadoop MapReduce only
• Not recommended for critical data
IBM GPFS advantages:
• No single point of failure, distributed metadata
• Variable block sizes – suited to multiple types of data and data access patterns
• POSIX file system – easy to use and manage
• Policy-based data ingest
• Versatile, multi-purpose
• Enterprise-class advanced storage features
17. #ibmedge
Use Case: Big Data Analytics
• Problem: Separate storage systems for ingest/distribution and analysis
– Data movement overhead is a significant part of my time to insight
– Increased cost from data duplication & overhead
– Inconsistent results
• Solution: Native HDFS support
– Decreased time to results
– Run Map/Reduce directly – no waiting for data transfer between storage systems
– Immediately share results
16
[Diagram: Spectrum Scale exposing File/Object and File/HDFS interfaces for global ingest and distribution, business analytics, custom applications, and packaged applications]
19. #ibmedge
IBM Spectrum Scale
• Avoid vendor lock-in with true Software
Defined Storage and Open Standards
• Seamless performance & capacity scaling
• Automate data management at scale
• Enable global collaboration
Data management at scale: OpenStack and Spectrum Scale help clients manage data at scale.
Client needs:
• Business: I need virtually unlimited storage
• Operations: I need a flexible infrastructure that supports both object and file based storage
• Operations: I need to minimize the time it takes to perform common storage management tasks
• Collaboration: I need to share data between people, departments and sites with low latency
How Spectrum Scale answers them:
• A single data plane that supports Cinder, Glance, Swift, and Manila as well as NFS, et al.
• A fully automated, policy-based data placement and migration tool
• An open and scalable cloud platform
• Sharing with a variety of WAN caching modes
Results:
• Converge file and object based storage under one roof
• Employ enterprise features to protect data, e.g. Snapshots, Backup, and Disaster Recovery
• Support native file, block, and object access to data
[Diagram: Spectrum Scale as a single data plane exposing NFS, SMB, POSIX, Swift, HDFS, Cinder, Glance, and Manila over SSD, fast disk, slow disk, and tape tiers, with cognitive services on top]
18
20. #ibmedge
Spectrum Scale Object Storage
• Basic support added in 4.1.1 release & enhanced in 4.2 and 4.2.1 release
• Based on OpenStack Swift (Juno release)
• REST-based data access
• Growing number of clients due to extremely simple protocol
• Applications can easily save & access data from anywhere using HTTP
• Simple set of atomic operations:
– PUT (upload)
– POST (update metadata)
– GET (download)
– DELETE
• Amazon S3 Protocol support
• High Availability with CES Integration
• Simple and Automated Installation Process
• Integrated authentication (Keystone) support
• Native GPFS Command Line Interface to manage Object service (mmobj command)
19
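To make the atomic operations above concrete, here is a minimal sketch using curl against a Swift endpoint; the endpoint URL, account, container, and object names are illustrative placeholders (not taken from this deck), and the token comes from the integrated Keystone service.
# Obtain a Keystone token and address the Swift proxy (endpoint and account are examples only).
TOKEN=$(openstack token issue -f value -c id)
URL=https://scale-ces.example.com:8080/v1/AUTH_acct
curl -i -X PUT    -H "X-Auth-Token: $TOKEN" -T report.csv $URL/cont/report.csv   # upload
curl -i -X POST   -H "X-Auth-Token: $TOKEN" -H "X-Object-Meta-Team: analytics" \
     $URL/cont/report.csv                                                        # update metadata
curl -s -X GET    -H "X-Auth-Token: $TOKEN" $URL/cont/report.csv -o report.csv   # download
curl -i -X DELETE -H "X-Auth-Token: $TOKEN" $URL/cont/report.csv                 # delete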
21. #ibmedge
Spectrum Scale Object Store – Additional Features
• Unified file and object support with Hadoop connectors
• Support for Encryption
• Support for Compression
• Only Object Store with Tape support for Backup
• Object store with integrated transparent cloud tiering Support
• Multi Region support
• AD/LDAP support for authentication
• ILM support for Object
• Movement of objects across storage tiers based on access heat
• Spectrum Scale Object with IBM DeepFlash provides an all-flash object store for newer, faster workloads
• Spectrum Scale Object with WAN caching support (AFM)
20
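As an illustration of the ILM and heat-based movement bullets above, the sketch below shows what a Spectrum Scale policy rule for cooling object data could look like; the pool names, threshold, and file name are assumptions for illustration only, and file heat tracking must already be enabled on the cluster.
# Hedged sketch: migrate object data that has cooled off to a capacity pool,
# then run a migration pass (pool names and threshold are illustrative).
cat > /tmp/object_heat.pol <<'EOF'
/* move cold object files from the fast pool to the capacity pool */
RULE 'cool_objects' MIGRATE FROM POOL 'system' TO POOL 'capacity'
  WHERE FILE_HEAT < 0.1
EOF
mmapplypolicy gpfs0 -P /tmp/object_heat.pol -I yes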
23. #ibmedge
The right solution for the workload
22
Spectrum Scale
Ideal Workloads:
• Big Data Analytics
• High Performance Computing, e.g. Engineering Applications
• Performance optimized Backup and Restore
• Multi-site file collaboration
• Multi-tier File Sync and Share
• Cold data archive with lowest cost data storage tier
Differentiation:
• Designed for high performance
• Unified Storage Infrastructure: Native File, Object & Hadoop
• Robust tiering with policy based data placement and data movement
• Multi-site collaboration with advanced routing and caching
• Enterprise features, e.g. Encryption, Compression, QoS, & Disaster Recovery
IBM Cloud Object Store (Cleversafe)
Ideal Workloads:
• Active Archive (warm data, mostly static)
• Cost optimized cloud backup target
• Web app content
• Remote office storage consolidation
• Storage as a service
Differentiation:
• Designed for easy deployment and management at scale
• Always-on architecture
• Geo-dispersed erasure coding for site fault tolerance and DR
• Simple keyless native encryption and multi-tenant security
• Reduced cost and complexity
25. #ibmedge
Unified File & Object (UFO) Support
• Challenge
• The world has not converged on file/object/HDFS today!
• …and it never will be completely
• Unified Scale-out Content Repository
• File or object in. Object or file out.
• Integrated big data analytics support
• Native protocol support
• High-performance that scales
• Single Management Plane
24
[Diagram: Spectrum Scale exposing NFS, SMB, POSIX, Swift/S3, and HDFS over SSD, fast disk, slow disk, and tape tiers]
Spectrum Scale: Redefining Unified Storage
26. #ibmedge
Spectrum Scale Unified File & Object
• Access same content both as a File & as an Object without making a copy or needing File or Object
Gateways!
• File-In-Object-Out and Object-In-File-Out Support
• Support for File Access Protocols (NFS/SMB/POSIX) and Object Access Protocols (Swift/S3)
• Objects ingested into designated Unified Container available as Files and Files ingested into it available as
Objects.
• Support for File & Object ACLs with Unified Mode ID Mapping
25
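To see the "object in, file out" behavior above in action, a minimal sketch follows; the container name and on-disk path components are illustrative placeholders, and the exact path depends on the storage policy, device, and account ID, as shown in the filesystem layout slide later in this deck.
# Ingest as an object, then read the very same data as a file with no copy.
swift upload cont a.jpg
ls -l /ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_<acctID>/cont/a.jpg    # visible immediately
md5sum /ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_<acctID>/cont/a.jpg   # same bytes, same place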
28. #ibmedge
What is Unified File and Object Access ?
• Accessing object using file interfaces
(SMB/NFS/POSIX) and accessing file using object
interfaces (REST) helps legacy applications
designed for file to seamlessly start integrating into the
object world.
• It allows object data to be accessed using
applications designed to process files. It allows file
data to be published as objects.
• Multi protocol access for file and object in the same
namespace (with common User ID management
capability) allows supporting and hosting data oceans
of different types of data with multiple access options.
• Optimizes various use cases and solution architectures
resulting in better efficiency as well as cost savings.
27
[Diagram: a clustered file system running Swift (with Swift on File). (1) Data is ingested as objects over HTTP into a container; (2) those objects are accessed as files through file exports created at the container level or through POSIX access from the container level; (3) data ingested as files into the same container is (4) accessed as objects.]
29. #ibmedge
Flexible Identity Management Modes
• Supports two identity management modes
• Administrators can choose the mode based on their need and use case using the CLI:
28
#mmobj config change --ccrfile object-server-sof.conf --section DEFAULT --property id_mgmt --value unified_mode | local_mode
Identity Management Modes
Local_Mode:
• Objects created through the object interface are owned by the internal "swift" user
• An application processing the object data from the file interface needs the required file ACL to access the data
• Object authentication setup is independent of the file authentication setup
• Suitable when the auth schemes for file and object are different and unified access is for applications
Unified_Mode:
• Objects created through the object interface are owned by the user doing the object PUT (i.e. the file is owned by that user's UID/GID)
• Object and file users are expected to use common auth and come from the same directory service (only AD + RFC 2307, or LDAP)
• The owner of the object owns and has access to the data from the file interface
• Suitable for unified file and object access for end users; leverage common ILM policies for file and object data based on data ownership
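For reference, the unified-mode form of the command above is shown below, together with a query of the current setting; the mmobj config list invocation is an assumption about the CLI and should be checked against your release's documentation.
# Set unified_mode (command taken from the slide above):
mmobj config change --ccrfile object-server-sof.conf --section DEFAULT \
      --property id_mgmt --value unified_mode
# Assumed companion query of the active setting (verify against the documentation):
mmobj config list --ccrfile object-server-sof.conf --section DEFAULT --property id_mgmt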
31. #ibmedge
Use case 1 – Enabling “In-Place” analytics for Object
data repository with analytic results available as objects
30
[Diagram: data ingested as objects over HTTP into <SOF_Fileset>/<Device> on the clustered file system is analyzed in place by Spark or Hadoop MapReduce, with results returned in place and published as objects. Image source: https://aws.amazon.com/elasticmapreduce/]
Analytics on a traditional object store: data has to be copied from the object store to a dedicated analytics cluster, analyzed there, and the results copied back to the object store for publishing – explicit data movement.
Analytics with Unified File and Object Access: object data is available as files on the same fileset, so analytics systems like Hadoop MapReduce or Spark can work on the data directly. No data movement, i.e. in-place, immediate data analytics, with results published as objects with no data movement.
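A hedged sketch of this in-place flow, assuming a unified container named analytics and a placeholder Spark job; the script name and the fileset/device path components are illustrative, and the HDFS transparency connector could be used instead of the file:// URIs.
# 1. Ingest data once, as an object.
swift upload analytics events.csv
# 2. Analyze the same bytes in place from the file view (no copy into a separate HDFS):
spark-submit wordcount.py \
  file:///ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_<acctID>/analytics/events.csv \
  file:///ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_<acctID>/analytics/result/
# 3. After the result files are objectized, publish or fetch them as objects:
swift download analytics --prefix result/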
32. #ibmedge
Use case 2 : Process Object Data with File-Oriented
Applications and Publish Outcomes as Objects
31
[Diagram: a media house runs an OpenStack cloud platform with its subsidiaries as tenants. Media objects are ingested into Swift on File containers (Container 1 and Container 2). Manila shares expose an NFS export on Container 1 only to Subsidiary 1 and an NFS export on Container 2 only to Subsidiary 2, so each subsidiary's VM farm processes the raw media content over files (object-to-file access). The processed videos are written through an NFS export on Container 1' and converted into objects for publishing (file-to-object access), making the final videos available as objects for streaming through the publishing channels.]
33. #ibmedge
Use case 3 : Users read/write data via File and Object
with Common User Authentication and Identity
32
[Diagram: users John and Riya access common data in the clustered file system over NFS, SMB, and Object using the same user credentials across all protocols, authenticated against the corporate user directory (Active Directory/LDAP). Riya's data read or written from the object interface is owned by Riya (User: Riya, UID: 1001, GID: 2000, Domain: XYZ) when accessed from file (SMB/NFS/POSIX).]
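A small sketch of how the ownership behavior in the figure can be observed under unified_mode; the credentials, container, and path components are placeholders, with only the user names and IDs taken from the figure.
# Riya uploads an object with her own credentials...
swift --os-username riya --os-password '<password>' --os-project-name XYZ \
      upload shared report.docx
# ...and the resulting file is owned by Riya (UID 1001, GID 2000) on the file side:
ls -ln /ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_<acctID>/shared/report.docx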
37. #ibmedge
Setup Details
36
[Diagram: IBM BigInsights with Spectrum Scale demo setup. A three-node Spectrum Scale cluster – viknode1 (admin, quorum, NSD), viknode2 (quorum, NSD, CES node), viknode3 (quorum, CES node) – with disks /dev/dm-2 and /dev/dm-3, plus an Ambari server managing the IBM BigInsights stack (YARN, Spark, Hive, Oozie, Slider, Knox).]
38. #ibmedge
Prerequisites For Demo
37
Set up a Spectrum Scale cluster with the NFS, SMB and Object protocols enabled
Set up the same authentication for File and Object
Enable unified access mode
Enable file access capabilities
Create a swift storage policy with File access enabled
Install BigInsights
Start Ambari server
Configure Cyberduck Client to access object store
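A condensed sketch of the object-side prerequisites, using only commands that appear elsewhere in this deck; authentication setup, protocol enablement, and the BigInsights/Ambari installation follow their own procedures and are not reproduced here.
# Enable unified identity management mode for the object service:
mmobj config change --ccrfile object-server-sof.conf --section DEFAULT \
      --property id_mgmt --value unified_mode
# Create a Swift storage policy with file access enabled (used by the demo container):
mmobj policy create SwiftOnFileFS --enable-file-access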
41. #ibmedge
Spectrum Scale User Group
• The Spectrum Scale User Group is free to join and open to anyone using, interested in using, or integrating Spectrum Scale.
• Join the User Group activities to meet
your peers and get access to experts
from partners and IBM.
• Next meetings:
- APAC: October 14, Melbourne
- Global at SC16 : November 13 1pm to 5pm, Salt Lake City
• Web page: http://www.spectrumscale.org/
• Presentations: http://www.spectrumscale.org/presentations/
• Mailing list: http://www.spectrumscale.org/join/
• Contact: http://www.spectrumscale.org/committee/
• Meet Bob Oesterlin (US Co-Principal) at Edge2016: Robert.Oesterlin@nuance.com
42. #ibmedge
Session : Futures of IBM Spectrum Scale
NDA & Customers ONLY
• Who: IBM Spectrum Scale Offering Management
• Carl Zetie, Ron Riffe
• When: Tuesday, September 20, 2016
• 1pm to 2pm
• Where: MGM Grand, Signature Tower 3
• Meeting Room D
• Contact (if any questions)
• douglasof@us.ibm.com, cmukhya@us.ibm.com
41
43. #ibmedge
Session : How to apply Flash benefits to big data
analytics and unstructured data
NDA & Customers ONLY
• Who: IBM Elastic Storage Server Offering Management
• Alex Chen
• When: Thursday, September 22, 2016
• 1:15pm to 2:15pm
• Where: Grand Garden Arena, Lower Level, MGM, Studio 10
• Contact (if any questions)
• cmukhya@us.ibm.com, douglasof@us.ibm.com
42
44. #ibmedge
Trial VM
• Download the IBM Spectrum Scale Trial VM from: http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html
43
45. #ibmedge
References
Write a File, read as an Object: OpenStack Summit, Austin, TX, Apr 2016
https://www.youtube.com/watch?v=6ovLb6aktbM&feature=youtu.be&t=2
Amalgamating Manila and Swift for Unified Data Sharing: OpenStack Summit, Austin, TX, Apr 2016
https://www.youtube.com/watch?v=3MMrMUaA_Mg
Hadoop HDFS Vs Spectrum Scale: https://www.youtube.com/watch?v=kOeEbdO8F4A
From Archive to Insight: Debunking Myths of Analytics on Object Stores – Dean Hildebrand, Bill Owen,
Simon Lorenz, Luis Pabon, Rui Zhang. Vancouver Summit, Spring 2015.
https://www.youtube.com/watch?v=brhEUptD3JQ
Deploying Swift on a File System – Bill Owen, Thiago Da Silva. BrownBag at OpenStack Paris, Fall 2014
https://www.youtube.com/watch?v=vPn2uZF4yWo
Breaking the Mold with OpenStack Swift and GlusterFS – Jon Dickinson, Luis Pabo. Atlanta Summit, Spring 2014
https://www.youtube.com/watch?v=pSWdzjA8WuA
SNIA SDC 2015
http://www.snia.org/sites/default/files/SDC15_presentations/security/DeanHildebrand_Sasi__OpenStack%20SwiftOnFile.pdf
Spectrum Scale Infocenter
http://www.ibm.com/support/knowledgecenter/#!/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.adm.doc/bl1adm_manageunifiedaccess.htm
44
46. #ibmedge
OpenStack Summit 2016: IBM Spectrum Scale in an
OpenStack Environment Redpaper Published.
45
http://www.redbooks.ibm.com/abstracts/redp5331.html
48. #ibmedge
IBM Spectrum Scale - Unified File and Object Access
Feature Overview
• Multi protocol access for file and object in the same namespace
• Access object as file from POSIX, NFS and SMB
• Access file as object
– Provision to convert files to object automatically via background service called ‘objectizer’
– Provision to explicitly and immediately convert files to objects using CLI
• The feature is specifically made available as an "object storage policy"
• Allows it to coexist with traditional object and other policies
• Create multiple unified file and object access policies
• Since policies are applied per container, end users have the flexibility to create certain containers with the Unified File and Object Access policy and certain ones without it.
Flexible Identity Management Mode Support
• Local Mode: Suitable when auth schemes for file and object are different and unified access is for applications
• Object created by Object interface will be owned by internal “swift” user
• Unified Mode: Suitable for unified file and object access by end users. Leverage common ILM policies for file and object data based on data
ownership.
• Object created from Object interface should be owned by the user doing the Object PUT (i.e. FILE will be owned by UID/GID of the
user)
• Ability to run in-place analytics of object data using Spectrum Scale Hadoop connectors via POSIX interface.
47
49. #ibmedge
Filesystem Layout (Traditional Vs Unified File and Object
Access)
• One of the key advantages of unified file and object access is the placement and naming of objects when stored on the file system. Unified file and object access stores objects following the same path hierarchy as the object's URL.
• In contrast, the default object implementation stores the object following the mapping given by the ring, and its final file path cannot easily be determined by the user.
48
Ingest object URL: https://swift.example.com/v1/acct/cont/a.jpg
Traditional Swift on-disk path: ibm/gpfs0/object_fileset/o/z1device108/objects/7551/125/75fc66179f12dc513580a239e92c3125/a.jpg
Unified File and Object Access on-disk path: ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_acctID/cont/a.jpg
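The mapping on the unified-access side is simple enough to compute by hand; the tiny sketch below derives the file path from the ingest URL above, with the fileset and device components left as placeholders since they depend on the storage policy configuration.
# Derive the unified file and object access path from an object URL (illustrative).
url="https://swift.example.com/v1/acct/cont/a.jpg"
acct=$(echo "$url" | cut -d/ -f5)      # account component, e.g. acct
rest=$(echo "$url" | cut -d/ -f6-)     # container/object, e.g. cont/a.jpg
echo "/ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_${acct}/${rest}"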
50. #ibmedge
Easy Access Of Objects as Files via supported File
Interfaces (NFS/SMB/POSIX)
• Objects ingested are available immediately for File access via the 3 supported file protocols.
• ID management modes (explained later) gives flexibility of assigning/retaining of owners, generally required by file protocols.
• Object authorization semantics are used during object access and file authorization semantics are used during file access of
the same data – thus ensuring compatibility of object and file applications
49
[Diagram: (1) data ingested as objects over HTTP into <SOF_Fileset>/<Device>/<AUTH_account_ID>/<Container> on the Spectrum Scale filesystem is (2) accessed as files, with file exports created at the container level or POSIX access from the container level.]
51. #ibmedge
Objectization – Making Files as Objects (Accessing File
via Object interface)
• Spectrum Scale 4.2 ships with a system service called ibmobjectizer that is responsible for objectization.
• Objectization is the process that makes files ingested through the file interface into a unified file and object access enabled container path available from the object interface.
• When new files are added from the file interface, they need to become visible to the Swift database so that container listings and container or account statistics are correct.
50
[Diagram: (1) data ingested as files via NFS/SMB/POSIX into the unified file and object fileset on the Spectrum Scale filesystem is (2) converted by the ibmobjectizer service (objectization) and (3) the files are then accessed as objects over HTTP.]
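A minimal sketch of the file-in / object-out direction described above, assuming a unified container named cont whose directory is reachable over POSIX or an NFS mount; the path components are placeholders, and the object appears only after the ibmobjectizer interval has elapsed.
# Drop a file into the container's directory from the file side...
cp nightly_report.pdf /ibm/gpfs0/<Sof_policy_fileset>/<device>/AUTH_<acctID>/cont/
# ...and once the ibmobjectizer service has run, it is listed and readable as an object:
swift list cont
swift stat cont nightly_report.pdf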
52. #ibmedge
Unified File and Object Access – Policy Integration for
Flexibility
• This feature is specifically made available as an “object storage policy” as it gives the following
advantages:
• Flexibility for the administrator to manage unified file and object access separately
• Allows it to coexist with traditional object and other policies
• Create multiple unified file and object access policies, which can vary based on the underlying storage
• Since policies are applied per container, end users have the flexibility to create certain containers with the Unified File and Object Access policy and certain ones without it.
• Example: mmobj policy create SwiftOnFileFS --enable-file-access
51
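Once such a policy exists, a container can be bound to it at creation time using the standard Swift X-Storage-Policy header; the container name below is illustrative.
# Create a container that uses the SwiftOnFileFS policy from the example above:
swift post unified_cont --header "X-Storage-Policy: SwiftOnFileFS"
swift stat unified_cont   # shows the storage policy assigned to the container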
54. #ibmedge
Notices and Disclaimers Con’t.
53
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not
tested those products in connection with this publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. IBM does not warrant the quality of any third-party products, or the
ability of any such third-party products to interoperate with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any IBM patents, copyrights, trademarks or other intellectual
property right.
IBM, the IBM logo, ibm.com, Aspera®, Bluemix, Blueworks Live, CICS, Clearcase, Cognos®, DOORS®, Emptoris®, Enterprise Document Management System™, FASP®,
FileNet®, Global Business Services ®, Global Technology Services ®, IBM ExperienceOne™, IBM SmartCloud®, IBM Social Business®, Information on Demand, ILOG,
Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®, pureCluster™, PureCoverage®, PureData®,
PureExperience®, PureFlex®, pureQuery®, pureScale®, PureSystems®, QRadar®, Rational®, Rhapsody®, Smarter Commerce®, SoDA, SPSS, Sterling Commerce®,
StoredIQ, Tealeaf®, Tivoli®, Trusteer®, Unica®, urban{code}®, Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business
Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM
trademarks is available on the Web at "Copyright and trademark information" at: www.ibm.com/legal/copytrade.shtml.