The document describes the BlueDBM architecture, which uses FPGAs to manage flash storage and network communication, enabling in-network processing and reducing data movement between system components for big data analytics workloads. The prototype system demonstrates low latency and high bandwidth by removing data-transport bottlenecks. Future work includes an improved flash controller, a distributed file system accelerator, and exploration of medical and Hadoop applications.
1. Scalable Multi-Access Flash
Store for Big Data Analytics
Sang-Woo Jun, Ming Liu
Kermin Fleming*, Arvind
Massachusetts Institute of Technology
*Intel, work while at MIT
February 27, 2014 1
This work is funded by Quanta and Samsung.
We also thank Xilinx for their generous hardware donations
2. Big Data Analytics
Analysis of previously unimaginable amount of
data provides deep insight
Examples:
Analyzing Twitter information to predict flu outbreaks weeks before the CDC
Analyzing personal genome to determine
predisposition to disease
Data mining scientific datasets to extract accurate
models
February 27, 2014 2
Likely to be the biggest economic driver for the IT
industry for the next 5 or more years
3. Big Data Analytics
Data so large and complex that traditional processing methods are no longer effective
Too large to fit reasonably in fast RAM
Random access intensive, making prefetching and
caching ineffective
Data is often stored in the secondary storage
of multiple machines on a cluster
Storage system and network performance become
first-order concerns
February 27, 2014 3
4. Designing Systems for Big Data
Due to terrible random read latency of disks,
analytic systems were optimized for reducing
disk accesses or making them sequential
Widespread adoption of flash storage is
changing this picture
February 27, 2014 4
[Charts: latency of system components (μs) — Disk vs. 1Gb Ethernet vs. DRAM, then Flash vs. 1Gb Ethernet vs. DRAM, then the same comparison rescaled to a finer axis]
Modern systems need to balance all aspects of the system to gain optimal performance
5. Designing Systems for Big Data
Network Optimizations
High-performance network links (e.g. Infiniband)
Storage Optimizations
High-performance storage devices (e.g. FusionIO)
In-Store computing (e.g. Smart SSD)
Network-attached storage device (e.g. QuickSAN)
Software Optimizations
Hardware implementations of network protocols
Flash-aware distributed file systems (e.g. CORFU)
FPGA or GPU based accelerators (e.g. Netezza)
February 27, 2014 5
Cross-layer optimization is required for
optimal performance
6. Current Architectural Solution
Previously insignificant inter-component
communications become significant
When accessing remote storage:
Read from flash to RAM
Transfer from local RAM to remote RAM
Transfer to accelerator and back
February 27, 2014 6
[Diagram: two motherboards, each with CPU, RAM, FusionIO storage, an Infiniband link, and an FPGA]
Hardware and software latencies are additive
Potential bottleneck at multiple transport locations
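As a rough illustration of why these additive hops matter, the sketch below sums per-hop costs for a remote, accelerated read through the conventional path. All latency values are assumptions chosen only for illustration; they are not measurements from the slides.

```python
# Back-of-the-envelope model of the additive latencies on slide 6.
# Every value is an illustrative assumption, not a measurement.

FLASH_READ_US = 100        # flash -> local RAM (assumed)
NETWORK_HOP_US = 20        # local RAM -> remote RAM over the fabric (assumed)
ACCEL_COPY_US = 15         # RAM -> accelerator, one direction (assumed)
SOFTWARE_OVERHEAD_US = 50  # driver / file system / network stack (assumed)

def remote_accelerated_read_us() -> int:
    """Latency of reading a remote page and pushing it through an accelerator
    when every hop is a separate, serialized transfer."""
    return (FLASH_READ_US
            + NETWORK_HOP_US
            + 2 * ACCEL_COPY_US          # to the accelerator and back
            + SOFTWARE_OVERHEAD_US)

print(f"end-to-end latency ~= {remote_accelerated_read_us()} us")
```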
7. BlueDBM Architecture Goals
Low-latency high-bandwidth access to scalable
distributed storage
Low-latency access to FPGA acceleration
Many interesting problems do not fit in DRAM
but can fit in flash
February 27, 2014 7
[Diagram: two node motherboards, each with CPU, RAM, flash, network, and FPGA]
8. BlueDBM System Architecture
FPGA manages flash, network and host link
February 27, 2014 8
[Diagram: nodes (Node 0 … Node 20), each pairing 1TB flash and a programmable controller (Xilinx VC707) with a host PC over PCIe; host PCs connected by Ethernet; controllers joined by a controller network of ~8 high-speed (12Gb/s) GTX serial links]
Expected to be constructed by summer, 2014
A proof-of-concept prototype is functional!
9. Inter-Controller Network
BlueDBM provides a dedicated storage
network between controllers
Low-latency high-bandwidth serial links (i.e. GTX)
Hardware implementation of a simple packet-
switched protocol allows flexible topology
Extremely low protocol overhead (~0.5μs/hop)
February 27, 2014 9
[Diagram: 1TB flash nodes with programmable controllers (Xilinx VC707), joined by a controller network of ~8 high-speed (12Gb/s) GTX serial links]
10. Inter-Controller Network
BlueDBM provides a dedicated storage
network between controllers
Low-latency high-bandwidth serial links (i.e. GTX)
Hardware implementation of a simple packet-
switched protocol allows flexible topology
Extremely low protocol overhead (~0.5μs/hop)
February 27, 2014 10
[Diagram: 1TB flash nodes with programmable controllers (Xilinx VC707), joined by high-speed (12Gb/s) GTX serial links]
Protocol supports virtual channels to ease development
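A minimal sketch of what such a packet-switched protocol with virtual channels could look like, assuming a fixed header carrying a destination node and a virtual-channel ID plus simple ring forwarding. Only the packet-switched design, the virtual channels, and the ~0.5μs/hop figure come from the slides; the header layout and routing rule below are illustrative.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    dst_node: int         # destination controller
    virtual_channel: int  # lets independent streams share one physical link
    payload: bytes

HOP_LATENCY_US = 0.5      # per-hop protocol overhead stated on the slide

def route(packet: Packet, current_node: int, num_nodes: int) -> list[int]:
    """Forward the packet hop by hop around an assumed ring topology,
    returning the list of controllers it traverses."""
    path = [current_node]
    while path[-1] != packet.dst_node:
        path.append((path[-1] + 1) % num_nodes)
    return path

pkt = Packet(dst_node=5, virtual_channel=2, payload=b"page-read-request")
hops = len(route(pkt, current_node=0, num_nodes=20)) - 1
print(f"{hops} hops, ~{hops * HOP_LATENCY_US} us spent in the network")
```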
11. Controller as a
Programmable Accelerator
BlueDBM provides a platform for implementing
accelerators on the datapath of storage device
Adds almost no latency by removing the need for
additional data transport
Compression or filtering accelerators can allow more throughput than the host-side link (i.e. PCIe) allows
Example: Word counting in a document
Counting is done inside accelerators, and only the result needs to be transported to the host
February 27, 2014 11
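A functional sketch of the word-counting example: counts are produced as data streams out of flash, so only a small result table, not the raw document, crosses the host link. The chunked Python interface below is an illustrative stand-in for what is really FPGA logic in the controller.

```python
from collections import Counter
from typing import Iterable

def count_words_in_datapath(flash_pages: Iterable[bytes]) -> Counter:
    """Consume pages as they stream out of flash; only the count table
    (not the raw pages) ever needs to cross the host link."""
    counts: Counter = Counter()
    for page in flash_pages:
        counts.update(page.decode(errors="ignore").split())
    return counts

pages = [b"big data big insight", b"flash beats disk for big data"]
result = count_words_in_datapath(pages)
raw_bytes = sum(len(p) for p in pages)
print(f"raw data: {raw_bytes} bytes; shipped to host: {len(result)} count entries")
```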
12. Two-Part Implementation of
Accelerators
Accelerators are implemented both before and
after network transport
Local accelerators parallelize across nodes
Global accelerators have global view of data
Virtualized interface to storage and network
provided in these locations
February 27, 2014 12
[Diagram: a global accelerator (e.g., aggregate count) at the host, above the network, with a local accelerator (e.g., count words) between each node's flash and the network]
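Continuing the word-count example, a minimal sketch of the two-part structure: each node's local accelerator produces a partial count over its own flash in parallel, and the global accelerator merges the partial results after the network transport. The function names and use of Counter are illustrative assumptions.

```python
from collections import Counter

def local_accelerator(node_pages: list[bytes]) -> Counter:
    """Per-node partial word count, running next to that node's flash."""
    counts: Counter = Counter()
    for page in node_pages:
        counts.update(page.decode(errors="ignore").split())
    return counts

def global_accelerator(partial_counts: list[Counter]) -> Counter:
    """Merge per-node results into a cluster-wide count after the network."""
    total: Counter = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total

node0_pages = [b"flash flash network"]
node1_pages = [b"network accelerator flash"]
print(global_accelerator([local_accelerator(node0_pages),
                          local_accelerator(node1_pages)]))
```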
13. Prototype System
(Proof-of-Concept)
Prototype system built using Xilinx ML605 and
a custom-built flash daughterboard
16GB capacity, 80MB/s bandwidth per node
Network over GTX high-speed transceivers
February 27, 2014 13
Flash board is 5 years old, from an earlier project
18. Benefits of in-path acceleration
February 27, 2014 18
Less software overhead
Not bottlenecked by flash-host link
Experimented using hash counting
In-path: Words are counted on the way out
Off-path: Data copied back to FPGA for counting
Software: Counting done in software
We’re actually building both configurations
[Chart: bandwidth (MB/s) of the in-datapath, off-datapath, and software configurations]
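One way to read the comparison is to count how much data must cross the flash-host link in each configuration. The sizes below are assumptions for illustration only; the point is that the in-datapath version ships only the small result, while the off-datapath version moves the raw data across the link twice.

```python
# Rough accounting of the data crossing the flash-host link for each of the
# three hash-counting configurations. Sizes are illustrative assumptions.

DATASET_MB = 1024   # raw data scanned from flash (assumed)
RESULT_MB = 1       # size of the hash-count table (assumed)

link_traffic_mb = {
    # in-datapath: counted as the data leaves flash; only the result moves.
    "in-datapath": RESULT_MB,
    # off-datapath: raw data goes to the host, is copied back to the FPGA
    # for counting, then the result returns.
    "off-datapath": 2 * DATASET_MB + RESULT_MB,
    # software: raw data goes to the host, which also pays CPU time to count.
    "software": DATASET_MB,
}

for name, mb in link_traffic_mb.items():
    print(f"{name:>12}: ~{mb} MB across the flash-host link")
```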
19. Work In Progress
Improved flash controller and File System
Wear leveling, garbage collection
Distributed File System as accelerator
Application-specific optimizations
Optimized data routing/network protocol
Storage/Link Compression
Database acceleration
Offload database operations to FPGA
Accelerated filtering, aggregation and compression
could improve performance
Application exploration
Including medical applications and Hadoop
February 27, 2014 19
20. Thank you
Takeaway: Analytics systems can be made
more efficient by using reconfigurable fabric to
manage data routing between system
components
February 27, 2014 20
[Diagram: two node organizations of FPGA, CPU+RAM, network, and flash]
22. New System Improvements
A new, 20-node cluster will exist by summer
New flash board with 1TB of storage at 1.6GB/s
Total 20TBs of flash storage
8 serial links will be pinned out as SATA ports,
allowing flexible topology
12Gb/s per link GTX
Rack-mounted for ease of handling
Xilinx VC707 FPGA boards
February 27, 2014 22
[Diagram: rack-mounted cluster, Node 0 through Node 5, …]
23. Example: Data driven Science
Massive amounts of scientific data are being collected
Instead of building a hypothesis and collecting relevant data, extract models from massive data
University of Washington N-body workshop
Multiple snapshots of particles in simulated universe
Interactive analysis provides insights to scientists
~100GBs, not very large by today’s standards
February 27, 2014 23
“Visualize clusters of particles moving faster than X”
“Visualize particles that were destroyed,
After being hotter than Y”
24. Workload Characteristics
Large volume of data to examine
Access pattern is unpredictable
Often only a random subset of snapshots are
accessed
February 27, 2014 24
[Diagram: rows of two snapshots — a sequential scan of Snapshot 1 marks rows of interest ("Hotter than Y"), which are then accessed at random intervals in Snapshot 2 ("Destroyed?")]
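A small sketch of this access pattern under assumed data structures: a sequential scan of one snapshot selects rows of interest, which are then probed at effectively random positions in another snapshot. Row layout, predicates, and sizes are illustrative assumptions.

```python
import random

random.seed(0)
NUM_ROWS = 100_000
Y = 99.0  # "hotter than Y" threshold (assumed)

# Each snapshot maps row id -> (temperature, destroyed flag); generated here
# only so the sketch runs stand-alone.
snapshot1 = {i: (random.uniform(0, 100), False) for i in range(NUM_ROWS)}
snapshot2 = {i: (random.uniform(0, 100), random.random() < 0.01)
             for i in range(NUM_ROWS)}

# Sequential scan over snapshot 1: collect the rows of interest.
rows_of_interest = [rid for rid, (temp, _) in snapshot1.items() if temp > Y]

# Random-access probes into snapshot 2: which of those rows were destroyed?
destroyed = [rid for rid in rows_of_interest if snapshot2[rid][1]]

print(f"{len(rows_of_interest)} rows hotter than Y, "
      f"{len(destroyed)} of them destroyed")
```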
25. Constraints in System Design
Latency is important in a random-access
intensive scenario
Components of a computer system have widely different latency characteristics
February 27, 2014 25
[Charts: latency of system components — Disk, 1Gb Ethernet, Flash, 10Gb Ethernet, Infiniband, DRAM — with a rescaled view omitting Disk]
“even when DRAM in the cluster is ~40% the size of the
entire database, [... we could only process] 900 reads per
second, which is much worse than the 30,000”
“Using HBase with ioMemory” -FusionIO
26. Constraints in System Design
Latency is important in a random-access
intensive scenario
Components of a computer system have widely different latency characteristics
February 27, 2014 26
[Chart: latency of system components — Disk, 1Gb Ethernet, Flash, 10Gb Ethernet, Infiniband, DRAM]
“even when DRAM in the cluster is ~40% the size of the
entire database, [... we could only process] 900 reads per
second, which is much worse than the 30,000”
“Using HBase with ioMemory” -FusionIO
27. Existing Solutions - 1
Baseline: Single PC with DRAM and Disk
Size of dataset exceeds normal DRAM capacity
Data on disk is accessed randomly
Disk seek time bottlenecks performance
Solution 1: More DRAM
Expensive, Limited capacity
Solution 2: Faster storage
Flash storage provides fast random access
PCIe flash cards like FusionIO
Solution 3: More machines
February 27, 2014 27
28. Existing Solutions - 2
Cluster of machines over a network
Network becomes a bottleneck with enough
machines
Network fabric is slow
Software stack adds overhead
Fitting everything on DRAM is expensive
Trade-off between network and storage bottleneck
Solution 1: High-performance network such as
Infiniband
February 27, 2014 28
29. Existing Solutions - 3
Cluster of expensive machines
Lots of DRAM
FusionIO storage
Infiniband network
Now computation might become the
bottleneck
Solution 1: FPGA-based Application-specific
Accelerators
February 27, 2014 29
30. Previous Work
Software Optimizations
Hardware implementations of network protocols
Flash-aware distributed file systems (e.g. CORFU)
Accelerators on FPGAs and GPUs
Storage Optimizations
High-performance storage devices (e.g. FusionIO)
Network Optimizations
High-performance network links (e.g. Infiniband)
February 27, 2014 30
[Chart: latency of system components (μs) — Flash, FusionIO, 1Gb Ethernet, 10Gb Ethernet, Infiniband]
31. BlueDBM Goals
BlueDBM is a distributed storage architecture
with the following goals:
Low-latency high-bandwidth storage access
Low-latency hardware acceleration
Scalability via efficient network protocol & topology
Multi-accessibility via efficient routing and controller
Application compatibility via normal storage interface
February 27, 2014 31
[Diagram: a node motherboard with CPU, RAM, flash, network, and FPGA]
32. BlueDBM Node Architecture
Address mapper determines the location of the
requested page
Nodes must agree on mapping
Currently programmatically defined, to maximize
parallelism by striping across nodes
February 27, 2014 32
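A minimal sketch of one mapping all nodes could agree on, assuming pages are striped round-robin across the cluster to maximize parallelism. The slides only state that the mapping is programmatically defined and striped across nodes; the modulo scheme below is an illustrative assumption.

```python
NUM_NODES = 20

def map_page(logical_page: int, num_nodes: int = NUM_NODES) -> tuple[int, int]:
    """Return (node id, page offset on that node) for a logical page,
    striping consecutive pages across nodes for parallelism."""
    return logical_page % num_nodes, logical_page // num_nodes

for page in range(5):
    node, offset = map_page(page)
    print(f"logical page {page} -> node {node}, local page {offset}")
```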
33. Prototype System
Topology of prototype is restricted because
ML605 has only one SMA port
Network requires hub nodes
February 27, 2014 33
34. Controller as a
Programmable Accelerator
BlueDBM provides a platform for implementing
accelerators on the datapath of storage device
Adds almost no latency by removing the need for
additional data transport
Compression or filtering accelerators can allow more throughput than the host-side link (i.e. PCIe) allows
Example: Word counting in a document
Controller can be programmed to count words and
only return the result
February 27, 2014 34
[Diagram: two configurations of accelerator, flash, and host server]
We're actually building both systems