The National Supercomputing Centre (NSCC) operates a 1-petaflop supercomputing cluster with about 1,300 nodes for research and industry users. It provides 13 petabytes of storage, including a 265 TB burst buffer, over a Mellanox 100 Gbps network. The cluster is scheduled with PBS Pro and supports applications in engineering, science, life sciences and more.
4. Introduction: The National Supercomputing Centre (NSCC)
• A state-of-the-art national facility with computing, data and resources to enable users to solve science and technological problems, and to stimulate industry to use computing for problem solving, testing designs and advancing technologies.
• The facility is linked by high-bandwidth networks that connect these resources and provide high-speed access to users anywhere.
5. Introduction: Vision & Objectives
Vision: "Democratising Access to Supercomputing"
1. Making petascale supercomputing accessible to the ordinary researcher
2. Bringing petascale computing, storage and gigabit-speed networking to the ordinary person
Objectives of NSCC:
1. Supporting national R&D initiatives
2. Attracting industrial research collaborations
3. Enhancing Singapore's research capabilities
7. What is HPC?
• The term HPC stands for High Performance Computing or High Performance Computer
• Tightly coupled computers with a high-speed interconnect
• Performance is measured in FLOPS (FLoating point Operations Per Second)
• Architectures include NUMA (Non-Uniform Memory Access)
8. Major Domains where HPC is used
• Engineering analysis: fluid dynamics, materials simulation, crash simulations, finite element analysis
• Scientific analysis: molecular modelling, computational chemistry, high energy physics, quantum chemistry
• Life sciences: genomic sequencing and analysis, protein folding, drug design, metabolic modelling
• Seismic analysis: reservoir simulations and modelling, seismic data processing
9. Major Domains where HPC is used (continued)
• Chip design & semiconductor: transistor simulation, logic simulation, electromagnetic field solvers
• Computational mathematics: Monte-Carlo methods, time stepping and parallel-in-time algorithms, iterative methods
• Media and animation: VFX and visualization, animation
• Weather research: atmospheric modelling, seasonal time-scale research
10. Major Domains where HPC is used (continued)
• And more: big data, information technology, cyber security, banking and finance, data mining
12. Executive Summary
• 1 Petaflop system
– About 1,300 nodes
– Homogeneous and heterogeneous architectures
• 13 Petabytes of storage
– One of the largest and most advanced storage architectures
• Research and industry users
– A*STAR, NUS, NTU, SUTD
– And many more commercial and academic organisations
13. HPC Stack in NSCC
• Network: Mellanox 100 Gbps
• Hardware: Fujitsu x86 servers, NVIDIA Tesla K40 GPUs, DDN storage
• Operating system: RHEL 6.6 and CentOS 6.6
• File systems: Lustre and GPFS
• Scheduler: PBS Pro
• Development tools: Intel Parallel Studio, Allinea tools
• HPC application software, provided through application modules
17. Connection between GIS and NSCC
[Diagram: the Genome Institute of Singapore (GIS) and the National Supercomputing Centre (NSCC), 2 km apart, are connected by an ultra-high-speed 500 Gbps link, with a large-memory node (1 TB) enabled. Sequencing data volumes grew 14x, from 300 GB/week in 2012 to 4,300 GB/week in 2015.]
18. Direct streaming of sequence data from GIS to the remote supercomputer at NSCC
• STEP 1: Sequencers stream directly to NSCC storage (no footprint in GIS)
• STEP 2: Automated pipeline analysis runs once sequencing completes; processed data resides in NSCC
• STEP 3: The data manager indexes and annotates processed data and replicates metadata to GIS, allowing data to be searched and retrieved from GIS
[Diagram: NGSP sequencers at B2 (Illumina + PacBio) and other platforms (POLARIS, genotyping) at GIS stream over a 500 Gbps primary link, spanning 2 km, to the NSCC gateway, compute and tiered storage; 1 Gbps per sequencer, 1 Gbps per machine, with 10 Gbps and 100 Gbps uplinks. A*CRC: A*STAR Computational Resource Centre; GIS: Genome Institute of Singapore.]
19. The Hardware
• ~1 PFlops system: 1,288 nodes (dual socket, 12 cores/CPU, E5-2690 v3); 128 GB DDR4 per node; 10 large-memory nodes (1x 6 TB, 4x 2 TB, 5x 1 TB)
• EDR interconnect: Mellanox EDR fat tree within the cluster; InfiniBand connections to all end-points (login nodes) at three campuses; 40/80/500 Gbps throughput network extended to the three campuses (NUS/NTU/GIS)
• Over 13 PB storage: HSM tiered, 3 tiers; 500 GB/s I/O flash burst buffer with 10x Infinite Memory Engine (IME)
20. Compute nodes
• Standard compute nodes
– 1,160 nodes: Fujitsu Server PRIMERGY CX2550 M1
– 27,840 CPU cores; Intel Xeon CPU E5-2690 v3 @ 2.60 GHz
– 128 GB memory per server
– EDR InfiniBand
– Liquid cooling system
• Large-memory nodes
– 9 nodes configured with high memory: 4x 1 TB, 4x 2 TB and 1x 6 TB
– Fujitsu Server PRIMERGY RX4770 M2
– Intel Xeon CPU E7-4830 v3 @ 2.10 GHz
– EDR InfiniBand
21. Accelerate your computing
• Accelerator nodes
– 128 nodes with NVIDIA GPUs (otherwise identical to the standard compute nodes)
– NVIDIA Tesla K40 (2,880 CUDA cores each); 368,640 GPU cores in total
• Visualization nodes
– 2 Fujitsu Celsius R940 graphic workstations
– Each with 2x NVIDIA Quadro K4200
– NVIDIA Quadro Sync support
22. NSCC Data Centre – Green features
• Warm-water cooling for CPUs (Cool-Central® Liquid Cooling technology)
– The first free-cooling system in Singapore and South-East Asia
– Water is maintained at 40°C: it enters the racks at 40°C and exits at 45°C
– Equipment on a technical floor (the 18th) cools the water back down using only fans
– The system can easily be extended for future expansion
• PUE of 1.4 (the average for Singapore is above 2.5)
23. Parallel file system
• Burst buffer
– 265 TB, 500 GB/s throughput
– Infinite Memory Engine (IME)
• Scratch
– 4 PB scratch storage, 210 GB/s throughput
– SFA12KX EXAScaler storage, Lustre file system
• Home and secure
– 4 PB persistent storage, 100 GB/s throughput
– GRIDScaler storage, IBM Spectrum Scale (formerly GPFS)
• Archive
– 5 PB storage, for archive purposes only
– WOS-based archive system
29. Why PBS Professional (Scheduler)?
A workload management solution that maximizes the efficiency and utilization of high-performance computing (HPC) resources and improves job turnaround.
• Robust workload management: floating licenses; scalability with flexible queues; job arrays; user and administrator interfaces; job suspend/resume; application checkpoint/restart; automatic file staging; accounting logs; access control lists
• Advanced scheduling algorithms: resource-based scheduling; preemptive scheduling; optimized node sorting; enhanced job placement; advance and standing reservations; cycle harvesting across workstations; scheduling across multiple complexes; network topology scheduling; management of both batch and interactive work; backfilling
• Reliability, availability and scalability: server failover; automatic job recovery; system monitoring; integration with MPI solutions; tested to manage 1,000,000+ jobs per day and to accept 30,000 jobs per minute; EAL3+ security; checkpoint support
30. Process Flow of a PBS Job
1. User submits a job
2. PBS server returns a job ID
3. PBS scheduler requests a list of resources from the server *
4. PBS scheduler sorts all the resources and jobs *
5. PBS scheduler informs the PBS server which host(s) the job can run on *
6. PBS server pushes the job script to the execution host(s)
7. PBS MoM executes the job script
8. PBS MoM periodically reports resource usage back to the PBS server *
9. When the job is completed, PBS MoM copies the output and error files
10. Job execution completed; user notification sent
[Diagram: the PBS server and scheduler dispatch a job (pbsworks, requesting ncpus, mem, host) to execution hosts A, B and C over the cluster network.]
Note: * This information is for debugging purposes only. It may change in future releases.
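To make the flow concrete, here is a minimal sketch of a PBS job script and its submission; the queue name matches the queue table later in this deck, and the executable and file names are placeholders:

#!/bin/bash
#PBS -N hello_job                      # job name shown by qstat
#PBS -l select=1:ncpus=24:mpiprocs=24  # request 1 node with 24 cores
#PBS -l walltime=01:00:00              # requested run time
#PBS -q normal                         # default batch queue

cd ${PBS_O_WORKDIR}    # start in the directory the job was submitted from
./a.out > output.log   # placeholder executable

Submitting it with "qsub hello_job.pbs" returns the job ID (step 2 above), and "qstat <job ID>" reports its state while the remaining steps run.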
31. Compute Manager GUI: Job Submission Page
• Applications panel: displays the applications available on the registered PAS server
• Submission Form panel: displays a job submission form for the application selected in the Applications panel
• Directory Structure panel: displays the directory structure of the location specified in the Address box
• Files panel: displays the files and subdirectories of the directory selected in the Directory Structure panel
32. Job Queues & Scheduling Policies

Queue name     | Queue type  | Run time limit | Cores available  | Description
Long           | Batch       | 240 hours      | 1,024            | Jobs expected to run for a long time
Development    | Interactive | 24 hours       | 48               | Coding, profiling and debugging
Normal         | Batch       | 3 days         | 27,000           | Default queue
Large Memory   | Batch       | -              | 360              | Jobs dispatched based on memory requirement
GPU            | GPU batch   | -              | 368,640 (CUDA)   | Specific for GPU jobs
Visualization  | Interactive | 8 hours        | 1                | High-end graphics card
Production     | Batch       | -              | 480              | GIS queue
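As a sketch of how a queue is chosen at submission time (the -q and -I flags are standard PBS; the script names and core counts are illustrative):

qsub -q long batch_job.pbs                    # long-running batch job
qsub -q development -I -l select=1:ncpus=4    # interactive session for debugging
qsub -q gpu gpu_job.pbs                       # job targeting the GPU nodes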
35. Parallel programming: OpenMP
• OpenMP (Open Multi-Processing) is an API for shared-memory parallel programming in C/C++ and Fortran, used mainly in SMP programming
• Not to be confused with Open MPI: OpenMP is an approach, while Open MPI is an implementation of MPI
• Supported by the available compilers (gcc/gfortran/icc/ifort)
• Parallelization in OpenMP is achieved through threads
• Programming with OpenMP is comparatively easy, as it mainly involves pragma directives
• An OpenMP program cannot communicate with other processors over the network; it is confined to a single shared-memory node
• Different stages of the program can use different numbers of threads (the fork-join model)
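As a minimal sketch of the pragma-based style (not from the deck; the file name and thread count are arbitrary), the following writes, compiles and runs a tiny OpenMP program:

# Write a minimal OpenMP C program (illustrative only).
cat > hello_omp.c <<'EOF'
#include <omp.h>
#include <stdio.h>

int main(void) {
    /* The pragma forks a team of threads; each thread runs the block. */
    #pragma omp parallel
    printf("Hello from thread %d of %d\n",
           omp_get_thread_num(), omp_get_num_threads());
    return 0;
}
EOF

gcc -fopenmp hello_omp.c -o hello_omp   # with the Intel compiler: icc -qopenmp
export OMP_NUM_THREADS=4                # threads share one node's memory
./hello_omp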
36. Parallel programming: MPI
• MPI stands for Message Passing Interface
• MPI is a library specification
• An MPI implementation typically provides wrappers around standard compilers such as C and Fortran (bindings also exist for Java and Python)
• Typically used for distributed-memory communication across nodes
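A minimal sketch, assuming an MPI implementation providing the usual mpicc/mpirun wrappers is on the PATH (e.g. via a module):

# Write a minimal MPI C program (illustrative only).
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);                 /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

mpicc hello_mpi.c -o hello_mpi   # mpicc wraps the underlying C compiler
mpirun -np 48 ./hello_mpi        # ranks may be spread across several nodes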
38. Allinea DDT
• DDT is the Distributed Debugging Tool from Allinea
• Graphical interface for debugging serial, OpenMP, MPI and CUDA applications/codes
• You control the pace of code execution and examine the execution flow and variables
• Typical scenario:
– Set a breakpoint at the point in your code where you want execution to stop
– Let your code run until the point is reached
– Check the variables of concern
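A typical DDT launch might look like the following sketch (the express-launch syntax is an assumption and varies across Allinea versions):

mpicc -g -O0 myapp.c -o myapp   # keep debug symbols, disable optimization
ddt mpirun -np 24 ./myapp       # run the MPI job under DDT's GUI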
39. Allinea MAP
• MAP is the application profiling tool from Allinea
• Graphical interface for profiling serial, OpenMP and MPI applications/codes
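A profiling run might look like this sketch (the --profile flag enables MAP's non-interactive mode; the exact syntax is an assumption depending on the installed version):

mpicc -g myapp.c -o myapp              # keep symbols so costs map back to source lines
map --profile mpirun -np 24 ./myapp    # writes a .map file to open in the MAP GUI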
43. GPU
• GPUs (Graphics Processing Units) were initially made to deliver better graphics rendering performance
• With the amount of research put into GPUs, it was found that they also perform very well on floating-point operations
• The term GPU accordingly evolved into GPGPU (General-Purpose GPU)
• The CUDA Toolkit includes a compiler, math libraries, tools and debuggers
44. GPU in NSCC
• GPU configuration
– 128 GPU nodes in total
– Each server has 1 Tesla K40 GPU (12 GB device memory, 2,880 CUDA cores)
– 128 GB host memory per server
• Connecting to a GPU server
– To compile a GPU application: submit an interactive job requesting a GPU resource, then compile with the NVCC compiler
– To submit a GPU job: use qsub from the login nodes, or log in through Compute Manager
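A sketch of that compile-and-run workflow (the ngpus resource name and the cuda module name are assumptions; check the site documentation for the exact syntax):

qsub -I -q gpu -l select=1:ngpus=1   # interactive job on the GPU queue
module load cuda                     # hypothetical module providing nvcc
nvcc hello.cu -o hello_gpu           # compile with the CUDA Toolkit compiler
./hello_gpu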
46. What are Environment Modules?
• Environment Modules help to dynamically load/unload environment variables such as PATH, LD_LIBRARY_PATH, etc.
• Environment Modules are based on modulefiles, which are written in the TCL language
• Environment Modules are shell independent
• Helpful for maintaining different versions of the same software
• Users have the flexibility to create their own modulefiles
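A sketch of everyday module usage (the module name is hypothetical; run "module avail" to see what the system actually provides):

module avail        # list the modulefiles available on the system
module load gcc     # hypothetical name: prepends the compiler to PATH, etc.
module list         # show currently loaded modules
module unload gcc   # undo that module's environment changes
module purge        # unload all loaded modules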
51. Managed Services offered
• Infrastructure services: computational resources; storage management
• Incident resolution: hardware break-fix; software incident resolution
• General service requests: data management; job management; software installation, etc.
• Specialized service requests: code optimization; special queue configuration, etc.
• Training services: introductory classes; code optimization techniques; parallel profiling, etc.
• Helpdesk: portal/e-mail/phone; requests for services via the portal; interactive job submission portal
52. Where is NSCC?
• The NSCC petascale supercomputer is in the Connexis building: 1 Fusionopolis Way, Level 17, Connexis South Tower, Singapore 138632
• 40 Gbps links extend to NUS, NTU and GIS
• Login nodes are placed in the NUS, NTU and GIS data centres
• Accessing NSCC feels just like using your local HPC system
53. Supported login methods
• SSH
– From a Windows PC, use PuTTY or any standard SSH client; the hostname is nscclogin.nus.edu.sg; use your NSCC credentials
– From a Linux machine or a Mac terminal: ssh username@login-astar.nscc.sg
• File transfer
– From Windows, use SCP or any other secure-shell file transfer software
– From Mac/Linux, use the scp command to transfer files
• Compute Manager
– Open any standard web browser, type https://loginweb-astar.nscc.sg in the address bar, and log in with your NSCC credentials
• Outside campus
– Connect to the campus VPN to gain access to the above services
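A sketch of a typical session from Mac/Linux, using the A*STAR login node named on this slide (replace username and the file paths with your own):

ssh username@login-astar.nscc.sg                        # log in to a login node
scp input.dat username@login-astar.nscc.sg:~/project/   # copy a file to NSCC
scp username@login-astar.nscc.sg:~/project/out.log .    # copy results back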
54. NSCC HPC Support (proposed to be available by 15th Mar)
• Corporate info web portal: http://nscc.sg
• NSCC HPC web portal: http://help.nscc.sg
• NSCC support email: help@nscc.sg
• NSCC workshop portal: http://workshop.nscc.sg
59. Web site: http://nscc.sg
Helpdesk: https://help.nscc.sg
Email: help@nscc.sg
Phone: +65 6645 3412
61. User Enrollment
Instructions:
• Open https://help.nscc.sg
• Navigate to User services -> Enrollment
• Click on Login
• Select your organization (NUS/NTU/A*STAR) from the drop-down
• Input your credentials
Ref: https://help.nscc.sg -> User Guides -> User Enrollment guide
62. Log in to the NSCC login nodes
• Download PuTTY from the internet
• Open PuTTY and type the login server name (login.nscc.sg)
• Input your credentials to log in
63. Compute Manager
• Open a web browser (Firefox or IE)
• Go to https://nusweb.nscc.sg / https://ntuweb.nscc.sg / https://loginweb-astar.nscc.sg
• Use your credentials to log in
• Submit a sample job
72. Using scratch space
#!/bin/bash
#PBS -N My_Job
# Name of the job
#PBS -l select=1:ncpus=24:mpiprocs=24
# Number of nodes and CPUs to use
#PBS -W sandbox=private
# Get PBS to run the job in a private sandbox
#PBS -W stagein=file_io@wlm01:/home/adm/sup/fsg1/<my input directory>
# Directory where all the input files are available;
# files in the input directory will be copied to scratch space, creating a directory file_io
#PBS -W stageout=*@wlm01:/home/adm/sup/fsg1/<myoutput directory>
# Output directory path in my home directory;
# once the job finishes, the files from file_io in scratch are copied back to <myoutput directory>
#PBS -q normal

cd ${PBS_O_WORKDIR}
echo "PBS_O_WORKDIR is : $PBS_O_WORKDIR"
echo "PBS JOB DIR is : $PBS_JOBDIR"
# Note that the output of pwd will be in the Lustre scratch space
echo "PWD is : `pwd`"
sleep 30
#mpirun ./a.out < input_file > output_file