This document summarizes a presentation on performance improvements to the Lustre parallel file system in version 2.5, achieved through large I/O patches, metadata improvements, and metadata scaling with the Distributed Namespace (DNE) feature. The evaluations show improved throughput with 4 MB RPCs, reduced degradation at large thread counts when using SSDs rather than NL-SAS drives, high random read performance from SSD pools, and significant metadata performance gains in Lustre 2.4, where DNE allows nearly linear scaling. Key requirements for next-generation storage include extreme IOPS, tiered architectures that combine local flash with a parallel file system, and reducing infrastructure needs while maintaining throughput.
LUG 2014
1. Lustre 2.5 Performance Evaluation: Performance Improvements with Large I/O Patches, Metadata Improvements, and Metadata Scaling with DNE
Hitoshi Sato*1, Shuichi Ihara*2, Satoshi Matsuoka*1
*1 Tokyo Institute of Technology
*2 Data Direct Networks Japan
2. Tokyo Tech's TSUBAME2.0 (Nov. 1, 2010): "The Greenest Production Supercomputer in the World"
[Figure: TSUBAME 2.0 new development overview (32 nm / 40 nm parts), with memory bandwidth, network bandwidth, and power figures at successive levels of the machine: >400 GB/s memory BW, 80 Gbps network BW, ~1 kW max; >1.6 TB/s and >12 TB/s memory BW, 35 kW max; >600 TB/s memory BW, 220 Tbps network bisection BW, 1.4 MW max]
3. TSUBAME2 System Overview
Storage: 11 PB (7 PB HDD, 4 PB tape, 200 TB SSD)
– Parallel file system volumes (Lustre): "Global Work Space" #1–#3 on SFA10k #1–#5; /data0, /work0, /work1, /gscr, /data1; 3.6 PB + 1.2 PB
– Home volumes (2.4 PB HDD + GPFS + tape): SFA10k #6 with GPFS #1–#4; home and system application storage served via cNFS/Clustered Samba w/ GPFS and NFS/CIFS/iSCSI by BlueARC
– Local SSDs on the compute nodes
– Storage connectivity: QDR IB (x4) x 20, QDR IB (x4) x 8, 10 GbE x 2, attached over the InfiniBand QDR network
Compute nodes: 17.1 PFlops (SFP), 5.76 PFlops (DFP), 224.69 TFlops (CPU), ~100 TB memory, ~200 TB SSD
– Thin nodes: HP Proliant SL390s G7 x 1408 (32 nodes x 44 racks); CPU: Intel Westmere-EP 2.93 GHz, 6 cores x 2 = 12 cores/node; GPU: NVIDIA Tesla K20X, 3 GPUs/node; memory: 54 GB (96 GB); SSD: 60 GB x 2 = 120 GB (120 GB x 2 = 240 GB)
– Medium nodes: HP Proliant DL580 G7 x 24; CPU: Intel Nehalem-EX 2.0 GHz, 8 cores x 2 = 32 cores/node; GPU: NVIDIA Tesla S1070, NextIO vCORE Express 2070; memory: 128 GB; SSD: 120 GB x 4 = 480 GB
– Fat nodes: HP Proliant DL580 G7 x 10; CPU: Intel Nehalem-EX 2.0 GHz, 8 cores x 2 = 32 cores/node; GPU: NVIDIA Tesla S1070; memory: 256 GB (512 GB); SSD: 120 GB x 4 = 480 GB
Interconnect: full-bisection optical QDR InfiniBand network
– Core switches: Voltaire Grid Director 4700 x 12 (324 QDR IB ports each)
– Edge switches: Voltaire Grid Director 4036 x 179 (36 QDR IB ports each)
– Edge switches w/ 10 GbE: Voltaire Grid Director 4036E x 6 (34 QDR IB ports, 2 x 10 GbE ports each)
4. TSUBAME2 System Overview (continued)
Same system diagram as the previous slide, annotated with the I/O workloads the storage tiers serve:
– Mostly read I/O (data-intensive apps, parallel workflow, parameter survey)
– Home storage for computing nodes, cloud-based campus storage services, and backup
– Fine-grained R/W I/O (checkpoints, temporary files)
5. Towards TSUBAME 3.0
Interim upgrade: TSUBAME 2.0 to 2.5 (Sept. 10th, 2013)
– TSUBAME 2.0 compute nodes: Fermi GPUs, 3 x 1408 = 4224 GPUs
– Upgraded the TSUBAME 2.0 GPUs from NVIDIA Fermi M2050 to Kepler K20X
– SFP/DFP peak: from 4.8 PF / 2.4 PF to 17 PF / 5.7 PF
TSUBAME-KFC
– A TSUBAME 3.0 prototype system with advanced cooling for next-generation supercomputers
– 40 compute nodes are oil-submerged: 160 NVIDIA Kepler K20X, 80 Ivy Bridge Xeons, FDR InfiniBand
– Total 210.61 TFlops; system 5.21 TFlops
Storage design is also a matter!!
6. Existing Storage Problems
Small I/O
• PFS: throughput-oriented design, master-worker configuration
  – Performance bottlenecks on metadata references
  – Low IOPS (1k – 10k IOPS)
I/O contention
• Storage is shared across compute nodes
  – I/O contention from various applications running at once
  – I/O performance degradation
[Figure: Lustre I/O monitoring trace of IOPS over time, showing small I/O ops from a user's application driving PFS operation down until it recovers]
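As an aside, the kind of IOPS-over-time monitoring shown on this slide can be approximated with a small script. The sketch below is illustrative only: it assumes it runs on an OSS, that per-OST request counters are exposed through `lctl get_param obdfilter.*.stats`, and that the stats lines follow the common "<counter> <count> samples [unit] ..." layout; both the parameter path and the format can differ between Lustre versions and configurations.

```python
# Minimal sketch: sample per-OST read/write counters on an OSS and report an
# approximate I/O ops/sec figure. Parameter path and stats format are assumed.
import subprocess
import time

STATS_PARAM = "obdfilter.*.stats"  # assumed parameter pattern, adjust per setup

def total_ops():
    """Sum the sample counts of read/write entries across all OSTs."""
    out = subprocess.run(["lctl", "get_param", "-n", STATS_PARAM],
                         capture_output=True, text=True, check=True).stdout
    ops = 0
    for line in out.splitlines():
        fields = line.split()
        # Expected form: "<counter> <count> samples [unit] ..."
        if len(fields) >= 3 and fields[2] == "samples" and \
           fields[0] in ("read_bytes", "write_bytes"):
            ops += int(fields[1])
    return ops

if __name__ == "__main__":
    interval = 5.0  # seconds between samples
    before = total_ops()
    time.sleep(interval)
    after = total_ops()
    print(f"approx. I/O ops/sec across OSTs: {(after - before) / interval:.1f}")
```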
7. Requirements for Next-Generation I/O Sub-System (1)
• Basic requirements
  – TB/sec sequential bandwidth for traditional HPC applications
  – Extremely high IOPS to cope with applications that generate massive small I/O (example: graph processing, etc.)
  – Some application workflows have different I/O requirements for each workflow component
• Reduction/consolidation of PFS I/O resources is needed while achieving TB/sec performance
  – We cannot simply use a larger number of I/O servers and drives to achieve ~TB/s throughput!
  – Many constraints: space, power, budget, etc.
  – Performance per watt and performance per RU is crucial
8. Requirements for Next-Generation I/O Sub-System (2)
• New approaches
  – TSUBAME 2.0 has pioneered the use of local flash storage as a high-IOPS alternative to an external PFS
  – Tiered and hybrid storage environments, combining (node-)local flash with an external PFS
• Industry status
  – High-performance, high-capacity flash (and other new semiconductor devices) is becoming available at reasonable cost
  – New approaches/interfaces to use high-performance devices (e.g., NVM Express)
9. Key Requirements of the I/O System
• Electrical power and space are still challenging
  – Reduce the number of OSSs/OSTs, but keep high I/O performance and large storage capacity
  – How do we maximize Lustre performance?
• Understand the new types of flash storage devices
  – Any benefits with Lustre? How do we use them?
  – Does an MDT on SSD help?
10. Lustre 2.x Performance Evaluation
• Maximize OSS/OST performance for large sequential I/O
  – Single OSS and OST performance
  – 1 MB vs. 4 MB RPC
  – Not only peak performance, but also sustained performance
• Small I/O to a shared file
  – 4 KB random read access to Lustre
• Metadata performance
  – CPU impact?
  – Do SSDs help?
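To make the evaluation plan above concrete, here is a rough sketch of how such runs might be driven with IOR; it is not the authors' actual tooling. The mount point, host file, and process counts are placeholders, and while the IOR flags used (`-F` file-per-process, `-t`/`-b` transfer and block size, `-z` random offsets, `-e` fsync on close) are standard, they should be checked against the IOR version in use.

```python
# Sketch only: build mpirun + IOR command lines for the tests outlined above.
# Paths, host file, and process counts are hypothetical placeholders.
import subprocess

MOUNT = "/mnt/lustre/ior_test"   # placeholder Lustre directory
HOSTFILE = "clients.txt"         # placeholder MPI host file

def run_ior(nproc, extra_args):
    cmd = ["mpirun", "-np", str(nproc), "--hostfile", HOSTFILE, "ior",
           "-a", "POSIX", "-o", f"{MOUNT}/testfile"] + extra_args
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)

# Large sequential I/O: file-per-process, 4 MB transfers, write then read.
run_ior(224, ["-w", "-r", "-F", "-t", "4m", "-b", "4g", "-e"])

# Small I/O: 4 KB random reads to a single shared file (no -F).
run_ior(512, ["-r", "-z", "-t", "4k", "-b", "1g"])
```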
11. Throughput per OSS Server
[Charts: single-OSS write and read throughput (MB/s, up to ~12,000) vs. number of clients (1, 2, 4, 8, 16), comparing no bonding, AA-bonding, and AA-bonding with CPU binding]
• Single FDR LNET bandwidth: 6.0 GB/s
• Dual IB FDR LNET bandwidth: 12.0 GB/s
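Purely as illustrative arithmetic (not a result from the slides), the per-OSS LNET figures above translate into a rough server count for a ~1 TB/s aggregate target, which is the kind of reduction/consolidation trade-off raised earlier:

```python
# Back-of-the-envelope sketch: how many OSS servers a ~1 TB/s target would
# need if each OSS were limited only by its LNET bandwidth. The target value
# is an illustrative assumption; the 6/12 GB/s figures come from the slide.
import math

target_gbs = 1000.0        # ~1 TB/s aggregate target (assumed)
lnet_options = [("single FDR", 6.0), ("dual FDR", 12.0)]  # GB/s per OSS

for label, per_oss_gbs in lnet_options:
    print(f"{label}: >= {math.ceil(target_gbs / per_oss_gbs)} OSS servers")
```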
12. Performance Comparison (1 MB vs. 4 MB RPC, IOR FPP, asyncIO)
• 4 x OSS, 280 x NL-SAS
• 14 clients and up to 224 processes
[Charts: IOR write and read throughput (MB/s, FPP, xfersize = 4 MB) vs. number of processes (28, 56, 112, 224), comparing 1 MB RPC and 4 MB RPC]
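On the client side, the RPC size corresponds to how many 4 KiB pages go into one bulk RPC. The sketch below shows that relationship; `osc.*.max_pages_per_rpc` is a standard client tunable, but actually reaching 4 MB RPCs in this generation of Lustre also depended on the server-side large-I/O patches this talk refers to, so treat the commands as illustrative rather than a complete recipe.

```python
# Sketch: compute the max_pages_per_rpc value for a target RPC size and print
# the corresponding lctl command (run on a Lustre client, as root).
PAGE_SIZE = 4096  # assumed client page size in bytes

def pages_per_rpc(rpc_bytes):
    return rpc_bytes // PAGE_SIZE

for rpc_mb in (1, 4):
    pages = pages_per_rpc(rpc_mb * 1024 * 1024)
    print(f"{rpc_mb} MB RPC -> lctl set_param osc.*.max_pages_per_rpc={pages}")
# Expected output: 1 MB -> 256 pages, 4 MB -> 1024 pages.
```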
13. Lustre Performance Degradation with Large Number of Threads: OSS Standpoint
• 1 x OSS, 80 x NL-SAS
• 20 clients and up to 960 processes
[Charts: IOR write and read throughput (MB/s, FPP, 1 MB) vs. number of processes (up to ~1,200), comparing RPC = 4 MB and RPC = 1 MB; chart callouts: 4.5x, 30%, 55%]
14. Lustre Performance Degradation with Large Number of Threads: SSDs vs. Nearline Disks
• IOR (FPP, 1 MB) to a single OST
• 20 clients and up to 640 processes
[Charts: write and read performance per OST as a percentage of peak (maximum for each dataset = 100%) vs. number of processes (up to ~700), comparing 7.2K SAS (1 MB RPC), 7.2K SAS (4 MB RPC), and SSD (1 MB RPC)]
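The two charts normalize each configuration to its own peak ("maximum for each dataset = 100%"). A small sketch of that normalization, using invented throughput numbers, shows how the degradation curves are derived:

```python
# Sketch of the normalization used in the charts: each dataset is scaled so
# that its own maximum equals 100%. Throughput values here are invented purely
# to illustrate the calculation.
throughput_by_procs = {   # processes -> MB/s (hypothetical numbers)
    20: 1800, 80: 2000, 320: 1500, 640: 900,
}

peak = max(throughput_by_procs.values())
for procs, mbs in sorted(throughput_by_procs.items()):
    print(f"{procs:4d} procs: {100.0 * mbs / peak:5.1f}% of peak")
```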
15. Lustre 2.4 4K Random Read Performance (with 10 SSDs)
• 2 x OSS, 10 x SSD (2 x RAID5), 16 clients
• Create large random files (FPP and SSF), then run random read access with 4 KB
[Chart: Lustre 4K random read IOPS (up to ~600,000) vs. node/process count (1n32p through 16n512p), comparing FPP(POSIX), FPP(MPIIO), SSF(POSIX), and SSF(MPIIO)]
16. Conclusions: Lustre with SSDs
• SSD pools in Lustre (or SSDs in general)
  – SSDs change the behavior of pools, as "seek" latency is no longer an issue
• Application scenarios
  – Very consistent performance, independent of the I/O size or number of concurrent I/Os
  – Very high random access to a single shared file (millions of IOPS in a large file system with sufficient clients)
• With SSDs, the bottleneck is no longer the device, but the RAID array (RAID stack) and the file system
  – Very high random read IOPS in Lustre is possible, but only if the metadata workload is limited (i.e., random I/O to a single shared file)
• We will investigate more benchmarks of Lustre on SSDs
17. Metadata Performance Improvements
• Very significant improvements (since Lustre 2.3)
  – Increased performance for both unique- and shared-directory metadata
  – Almost linear scaling for most metadata operations with DNE
[Chart: Lustre 1.8.8 vs. 2.4.2, ops/sec for metadata operations (directory creation, directory stat, directory removal, file creation, file stat, file removal) to a shared directory; 1 x MDS, 16 clients, 768 Lustre exports (multiple mount points)]
[Chart: ops/sec for the same metadata operations to unique directories, single MDS vs. dual MDS (DNE); same storage resources, just adding MDS resources]
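With DNE phase 1 (remote directories), metadata load is spread by placing different top-level directories on different MDTs. Below is a hedged sketch of how such a distribution might be scripted; `lfs mkdir -i <mdt_index>` is the standard command for creating a directory on a specific MDT (and typically requires administrator privileges), while the mount point, directory names, and MDT count are placeholders.

```python
# Sketch: round-robin top-level project directories across MDTs using DNE
# remote directories. Mount point, directory names, and MDT count are
# placeholders; run as an administrator on a Lustre client.
import subprocess

LUSTRE_ROOT = "/mnt/lustre/projects"  # placeholder mount point
NUM_MDTS = 2                          # e.g. the single- vs. dual-MDS test above

projects = [f"project{i:02d}" for i in range(8)]
for n, name in enumerate(projects):
    mdt_index = n % NUM_MDTS
    subprocess.run(["lfs", "mkdir", "-i", str(mdt_index),
                    f"{LUSTRE_ROOT}/{name}"], check=True)
    print(f"{name} -> MDT{mdt_index:04d}")
```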
18. Metadata Performance: CPU Impact
• Metadata performance is highly dependent on CPU performance
• Limited variation in CPU frequency or memory speed can have a significant impact on metadata performance
[Charts: file creation ops/sec vs. number of processes (up to ~700) for a unique directory and a shared directory, comparing 3.6 GHz and 2.4 GHz CPUs]
19. Conclusions: Metadata Performance
• Significant improvements for shared and unique metadata performance since Lustre 2.3
  – Improvements across all metadata workloads, especially removals
  – SSDs add to performance, but the impact is limited (and probably mostly due to latency, not the IOPS performance of the device)
• Important considerations for metadata benchmarks
  – Metadata benchmarks are very sensitive to the underlying hardware
  – Client limitations are important when considering metadata performance and scaling
  – Only directory/file stats scale with the number of processes per node; other metadata workloads do not appear to scale with the number of processes on a single node
20. Summary
• Lustre today
  – Significant performance increases in object storage and metadata performance
  – Much better spindle efficiency (which is now reaching the physical drive limit)
  – Client performance is quickly becoming the limitation for small-file and random performance, not the server side!!
• Consider alternative scenarios
  – PFS combined with fast local devices (or even local instances of the PFS)
  – PFS combined with a global acceleration/caching layer
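As a footnote on the spindle-efficiency point above, here is a tiny worked example of the underlying calculation; all numbers are hypothetical and chosen only to illustrate how delivered per-drive throughput is compared against a drive's raw streaming rate.

```python
# Hypothetical spindle-efficiency calculation: delivered file-system throughput
# per drive vs. an assumed nominal streaming rate for a 7.2K NL-SAS drive.
aggregate_mb_s = 25_000   # hypothetical delivered Lustre throughput (MB/s)
num_drives = 280          # drive count from the 4-OSS test configuration
raw_streaming_mb_s = 120  # assumed nominal per-drive sequential rate (MB/s)

per_drive = aggregate_mb_s / num_drives
print(f"delivered per drive: {per_drive:.1f} MB/s "
      f"({100 * per_drive / raw_streaming_mb_s:.0f}% of the assumed raw rate)")
```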