In this video from Linaro Connect 2019, Satoshi Matsuoka from Riken presents: A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exascale Performance.
"Fugaku is the flagship next generation national supercomputer being developed by Riken R-CCS and Fujitsu in collaboration. Fugaku will have hyperscale datacenter class resources in a single exascale machine, with more than 150,000 nodes of server-class Fujitsu A64fx many-core Arm CPUs with the new SVE (Scalable Vector Extension) with low precision math for the first time in the world, accelerating both HPC and AI workloads, augmented with HBM2 memory paired with each CPU, exhibiting nearly a Terabyte/s memory bandwidth for both HPC and AI rapid data movements."
Watch the video: https://wp.me/p3RLHQ-kYn
Learn more: https://postk-web.r-ccs.riken.jp/
and
https://connect.linaro.org/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI | inside-BigData.com
Satoshi Matsuoka from RIKEN gave this talk at the HPC User Forum in Santa Fe.
"With the rapid rise of Big Data and AI as a new breed of high-performance workloads on supercomputers, we need to accommodate them at scale, and thus the need for R&D for HW and SW infrastructures where traditional simulation-based HPC and BD/AI would converge, in a BYTES-oriented fashion. Post-K is the flagship next generation national supercomputer being developed by Riken and Fujitsu in collaboration. Post-K will have hyperscale class resources in one exascale machine, with well more than 100,000 nodes of server-class A64fx many-core Arm CPUs, realized through an extensive co-design process involving the entire Japanese HPC community.
Rather than focusing on double precision flops that are of lesser utility, Post-K, especially its Arm A64fx processor and the Tofu-D network, is designed to sustain extreme bandwidth on realistic applications, including those for oil and gas such as seismic wave propagation, CFD, and structural codes, besting its rivals by several factors in measured performance. Post-K is slated to perform 100 times faster on some key applications than its predecessor, the K Computer, and will also likely be the premier big data and AI/ML infrastructure. Currently, we are conducting research to scale deep learning to more than 100,000 nodes on Post-K, where we would obtain near top GPU-class performance on each node."
Watch the video: https://wp.me/p3RLHQ-k6G
Learn more: https://en.wikichip.org/wiki/supercomputers/post-k
and
http://hpcuserforum.com
In this deck, Yuichiro Ajima from Fujitsu presents: The Tofu Interconnect D.
"Through the development of post-K, which will be equipped with this CPU, Fujitsu will contribute to the resolution of social and scientific issues in such computer simulation fields as cutting-edge research, health and longevity, disaster prevention and mitigation, energy, as well as manufacturing, while enhancing industrial competitiveness and contributing to the creation of Society 5.0 by promoting applications in big data and AI fields."
Learn more: https://insidehpc.com/2018/08/fujitsu-unveils-details-post-k-supercomputer-processor-powered-arm/
and
http://www.fujitsu.com/jp/solutions/business-technology/tc/catalog/
HKG18-411 - Introduction to OpenAMP which is an open source solution for hete... | Linaro
Session ID: HKG18-411
Session Name: HKG18-411 - Introduction to OpenAMP which is an open source solution for heterogeneous system orchestration and communication
Speaker: Wendy Liang
Track: IoT, Embedded
★ Session Summary ★
Introduction to OpenAMP which is an open source solution for heterogeneous system orchestration and communication
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/hkg18/hkg18-411/
Presentation: http://connect.linaro.org.s3.amazonaws.com/hkg18/presentations/hkg18-411.pdf
Video: http://connect.linaro.org.s3.amazonaws.com/hkg18/videos/hkg18-411.mp4
---------------------------------------------------
★ Event Details ★
Linaro Connect Hong Kong 2018 (HKG18)
19-23 March 2018
Regal Airport Hotel Hong Kong
---------------------------------------------------
Keyword: IoT, Embedded
'http://www.linaro.org'
'http://connect.linaro.org'
OSv Unikernel — Optimizing Guest OS to Run Stateless and Serverless Apps in t... | ScyllaDB
Unikernels have been demonstrated to deliver excellent performance in terms of throughput and latency, while providing high isolation. However, they have also been shown to underperform in some types of workloads when compared to a generic OS like Linux. In this presentation, we demonstrate that certain types of workloads (web servers, microservices, and other stateless and/or serverless apps) can greatly benefit from OSv's optimized networking stack and other features. We describe a number of experiments where OSv outperforms a Linux guest: most notably we note 1.6 throughput (req/s) and 0.6 latency improvement (at p99 percentile) when running nginx, and 1.7 throughput (req/s) and 0.6 latency improvement (at p99 percentile) when running a simple microservice implemented in Golang.
We also show that OSv's small kernel, low boot time, and low memory consumption allow for very high density when running serverless workloads. The experiment described in this presentation shows we can boot 1,800 OSv microVMs per second on an AWS c5n.metal machine with 72 CPUs (25 boots/sec on a single CPU), with guest boot times recorded as low as 8.98 ms at p50 and 31.49 ms at p99.
Lastly we also demonstrate how to automate the build process of the OSv kernel tailored exactly to the specific app and/or VMM so that only the code and symbols needed are part of the kernel and nothing more. OSv is an open source project and can be found at https://github.com/cloudius-systems/osv.
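The per-CPU boot rate quoted in the abstract above can be checked with simple division. The snippet below is only a minimal sketch of that arithmetic: the function name `boots_per_cpu` is hypothetical (it is not part of OSv or of the presentation), and it assumes boots are spread evenly across all 72 CPUs, which the abstract does not state explicitly.

```python
def boots_per_cpu(total_boots_per_sec: float, cpu_count: int) -> float:
    """Average boot rate per CPU.

    Assumption (not from the source): boots are distributed evenly
    across all CPUs of the host.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return total_boots_per_sec / cpu_count


if __name__ == "__main__":
    # Figures taken from the abstract: 1,800 microVM boots/sec on 72 CPUs.
    print(boots_per_cpu(1800, 72))
```

Under that even-distribution assumption, the result matches the 25 boots/sec per CPU figure given in the abstract.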
Kernel Recipes 2017 - 20 years of Linux Virtual Memory - Andrea Arcangeli | Anne Nicolas
Andrea will provide a short high-level view of the most notable milestones in the evolution of the Linux Virtual Memory over the years. He will then focus on the various Memory Management features such as Transparent Huge Pages (THP), automatic NUMA balancing and userfaultfd/postcopy live migration of Kernel Virtual Machines (KVM). Andrea will cover best practices, providing the audience with an understanding of when and how to leverage these features in their environments.
Andrea Arcangeli, Red Hat
This presentation covers Real Time Operating Systems (RTOS). Starting with the fundamental concepts of an OS, it takes a deep dive into the embedded, real-time, and related aspects of an OS. Appropriate examples are given, with Linux as a case study. Ideal for a beginner building an understanding of RTOS.
For the full video of this presentation, please visit:
http://www.embedded-vision.com/platinum-members/altera/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Bill Jenkins, Senior Product Specialist for High Level Design Tools at Intel, presents the "Accelerating Deep Learning Using Altera FPGAs" tutorial at the May 2016 Embedded Vision Summit.
While large strides have recently been made in the development of high-performance systems for neural networks based on multi-core technology, significant challenges in power, cost, and performance scaling remain. Field-programmable gate arrays (FPGAs) are a natural choice for implementing neural networks because they can combine computing, logic, and memory resources in a single device. Intel's Programmable Solutions Group has developed a scalable convolutional neural network reference design for deep learning systems using the OpenCL programming language built with our SDK for OpenCL. The design performance is being benchmarked using several popular CNN benchmarks: CIFAR-10, ImageNet and KITTI.
Building the CNN with OpenCL kernels allows true scaling of the design from smaller to larger devices and from one device generation to the next. New designs can be sized using different numbers of kernels at each layer. Performance scaling from one generation to the next also benefits from architectural advancements, such as floating-point engines and frequency scaling. Thus, you achieve greater than linear performance and performance per watt scaling with each new series of devices.
A brief introduction to processor organization ‒ the main examined topic is the description of processor's general characteristics and their evident impacts.
This presentation was given at the Faculty of Informatics of Masaryk University in 2018 within the PA174 Design of Digital Systems II course.
Linux has emerged as a number one choice for developing OS-based embedded systems. Its open source development model, customizability, portability, and toolchain availability are some of the reasons for this success. This course gives a practical perspective on customizing, building, and bringing up the Linux kernel on ARM-based target hardware. It brings together the previous modules you have learned by combining Linux administration, hardware knowledge, Linux as an OS, and C/computer programming. After bringing up Linux, you can port any of the existing applications to the target hardware.
In this deck from the HPC User Forum in Austin, Yutaka Ishikawa from Riken AICS presents: Japan's post K Computer.
Watch the video presentation: http://wp.me/p3RLHQ-fJ6
Learn more: http://hpcuserforum.com
Implementing AI: High Performance Architectures: Large scale HPC hardware in ... | KTN
The Implementing AI: High Performance Architectures webinar, hosted by KTN and eFutures, was the fourth event in the Implementing AI webinar series.
The focus of the webinar was the impact of processing AI data on data centres - particularly from the technology perspective. Prof. Simon McIntosh-Smith, Professor in High Performance Computing, University of Bristol, covered Large scale HPC hardware in the age of AI.
The Barcelona Supercomputing Center (BSC) was established in 2005 and hosts MareNostrum, one of the most powerful supercomputers in Spain. We are the pioneering supercomputing center in Spain. Our specialty is high-performance computing (HPC), and our mission is twofold: to offer supercomputing infrastructure and services to Spanish and European scientists, and to generate knowledge and technology to transfer to society. We are a Severo Ochoa Center of Excellence, a first-level member of the European research infrastructure PRACE (Partnership for Advanced Computing in Europe), and we manage the Red Española de Supercomputación (RES, the Spanish Supercomputing Network). As a research center, we have more than 456 experts from 45 countries, organized into four major research areas: computer sciences, life sciences, earth sciences, and computational applications in science and engineering.
Architecting a 35 PB distributed parallel file system for science | Speck&Tech
ABSTRACT: Perlmutter is the newest supercomputer at Berkeley Lab, California, and features a whopping 35 PB all-flash Lustre file system. Let's dive into its architecture, showing some early performance figures and unique performance considerations, using low-level Lustre tests that achieve over 90% of the theoretical bandwidth of the SSDs, to showcase how Perlmutter achieves the performance of a burst buffer and the resilience of a scratch file system. Lastly, some performance considerations unique to an all-flash Lustre file system, along with tips on how better I/O patterns can make the most of such powerful architectures.
BIO: Alberto Chiusole studied Data Science and Scientific Computing in Trieste when he had the opportunity to spend some months at CERN, in Geneva, benchmarking their Ceph file system against a classic Lustre file system from eXact lab, the HPC consulting company in Trieste he was working for at the time. After Trieste, he worked as a Storage and I/O Software Engineer at Berkeley Lab, California, a national scientific laboratory, where he assisted scientists with improving their I/O and data needs. He now works for Seqera Labs as an HPC DevOps Engineer, focusing on infrastructure support.
This document gives a brief introduction to supercomputers, a list of supercomputers in the world, and a short overview of computer clustering (HPC clusters) and the cluster model.
This slide deck gives a detailed view of the hardware architecture of the Summit supercomputer, including its CPUs, GPUs, interconnect networks, and the applications it runs.
Open Source Software on OpenPOWER systems.
With 100% open source system software (including the firmware), OpenPOWER is the most open server architecture in the market. Based on the IBM POWER8 chip, this new family of servers featuring the latest Nvidia NVLink technology runs all the software solutions presented at OPEN'16 with significant cost advantages. This session explains how Docker, EnterpriseDB and many others benefit from this advanced design, and how 200+ technology companies including Google and RackSpace are collaborating in an open development alliance to build the datacenter of the future.
In this deck from the Stanford HPC Conference, Shahin Khan from OrionX describes major market shifts in IT.
"We will discuss the digital infrastructure of the future enterprise and the state of these trends."
"We work with clients on the impact of Digital Transformation (DX) on them, their customers, and their messages. Generally, they want to track, in one place, trends like IoT, 5G, AI, Blockchain, and Quantum Computing. And they want to know what these trends mean, how they affect each other, and when they demand action, and how to formulate and execute an effective plan. If that describes you, we can help."
Watch the video: https://wp.me/p3RLHQ-lPP
Learn more: http://orionx.net
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Preparing to program Aurora at Exascale - Early experiences and future direct... | inside-BigData.com
In this deck from IWOCL / SYCLcon 2020, Hal Finkel from Argonne National Laboratory presents: Preparing to program Aurora at Exascale - Early experiences and future directions.
"Argonne National Laboratory’s Leadership Computing Facility will be home to Aurora, our first exascale supercomputer. Aurora promises to take scientific computing to a whole new level, and scientists and engineers from many different fields will take advantage of Aurora’s unprecedented computational capabilities to push the boundaries of human knowledge. In addition, Aurora’s support for advanced machine-learning and big-data computations will enable scientific workflows incorporating these techniques along with traditional HPC algorithms. Programming the state-of-the-art hardware in Aurora will be accomplished using state-of-the-art programming models. Some of these models, such as OpenMP, are long-established in the HPC ecosystem. Other models, such as Intel’s oneAPI, based on SYCL, are relatively-new models constructed with the benefit of significant experience. Many applications will not use these models directly, but rather, will use C++ abstraction libraries such as Kokkos or RAJA. Python will also be a common entry point to high-performance capabilities. As we look toward the future, features in the C++ standard itself will become increasingly relevant for accessing the extreme parallelism of exascale platforms.
This presentation will summarize the experiences of our team as we prepare for Aurora, exploring how to port applications to Aurora’s architecture and programming models, and distilling the challenges and best practices we’ve developed to date. oneAPI/SYCL and OpenMP are both critical models in these efforts, and while the ecosystem for Aurora has yet to mature, we’ve already had a great deal of success. Importantly, we are not passive recipients of programming models developed by others. Our team works not only with vendor-provided compilers and tools, but also develops improved open-source LLVM-based technologies that feed both open-source and vendor-provided capabilities. In addition, we actively participate in the standardization of OpenMP, SYCL, and C++. To conclude, I’ll share our thoughts on how these models can best develop in the future to support exascale-class systems."
Watch the video: https://wp.me/p3RLHQ-lPT
Learn more: https://www.iwocl.org/iwocl-2020/conference-program/
and
https://www.anl.gov/topic/aurora
In this deck, Greg Wahl from Advantech presents: Transforming Private 5G Networks.
Advantech Networks & Communications Group is driving innovation in next-generation network solutions with their High Performance Servers. We provide business critical hardware to the world's leading telecom and networking equipment manufacturers with both standard and customized products. Our High Performance Servers are highly configurable platforms designed to balance the best in x86 server-class processing performance with maximum I/O and offload density. The systems are cost effective, highly available and optimized to meet next generation networking and media processing needs.
“Advantech’s Networks and Communication Group has been both an innovator and trusted enabling partner in the telecommunications and network security markets for over a decade, designing and manufacturing products for OEMs that accelerate their network platform evolution and time to market,” said Advantech Vice President of Networks & Communications Group, Ween Niu. “In the new IP Infrastructure era, we will be expanding our expertise in Software Defined Networking (SDN) and Network Function Virtualization (NFV), two of the essential conduits to 5G infrastructure agility, making networks easier to install, secure, automate and manage in a cloud-based infrastructure.”
In addition to innovation in air interface technologies and architecture extensions, 5G will also need a new generation of network computing platforms to run the emerging software defined infrastructure, one that provides greater topology flexibility, essential to deliver on the promises of high availability, high coverage, low latency and high bandwidth connections. This will open up new parallel industry opportunities through dedicated 5G network slices reserved for specific industries dedicated to video traffic, augmented reality, IoT, connected cars etc. 5G unlocks many new doors and one of the keys to its enablement lies in the elasticity and flexibility of the underlying infrastructure.
Advantech’s corporate vision is to enable an intelligent planet. The company is a global leader in the fields of IoT intelligent systems and embedded platforms. To embrace the trends of IoT, big data, and artificial intelligence, Advantech promotes IoT hardware and software solutions with the Edge Intelligence WISE-PaaS core to assist business partners and clients in connecting their industrial chains. Advantech is also working with business partners to co-create business ecosystems that accelerate the goal of industrial intelligence.
Watch the video: https://wp.me/p3RLHQ-lPQ
* Company website: https://www.advantech.com/
* Solution page: https://www2.advantech.com/nc/newsletter/NCG/SKY/benefits.html
The Incorporation of Machine Learning into Scientific Simulations at Lawrence... | inside-BigData.com
In this deck from the Stanford HPC Conference, Katie Lewis from Lawrence Livermore National Laboratory presents: The Incorporation of Machine Learning into Scientific Simulations at Lawrence Livermore National Laboratory.
"Scientific simulations have driven computing at Lawrence Livermore National Laboratory (LLNL) for decades. During that time, we have seen significant changes in hardware, tools, and algorithms. Today, data science, including machine learning, is one of the fastest growing areas of computing, and LLNL is investing in hardware, applications, and algorithms in this space. While the use of simulations to focus and understand experiments is well accepted in our community, machine learning brings new challenges that need to be addressed. I will explore applications for machine learning in scientific simulations that are showing promising results and further investigation that is needed to better understand its usefulness."
Watch the video: https://youtu.be/NVwmvCWpZ6Y
Learn more: https://computing.llnl.gov/research-area/machine-learning
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod... | inside-BigData.com
In this deck from the Stanford HPC Conference, DK Panda from Ohio State University presents: How to Achieve High-Performance, Scalable and Distributed DNN Training on Modern HPC Systems?
"This talk will start with an overview of challenges being faced by the AI community to achieve high-performance, scalable and distributed DNN training on Modern HPC systems with both scale-up and scale-out strategies. After that, the talk will focus on a range of solutions being carried out in my group to address these challenges. The solutions will include: 1) MPI-driven Deep Learning, 2) Co-designing Deep Learning Stacks with High-Performance MPI, 3) Out-of-core DNN training, and 4) Hybrid (Data and Model) parallelism. Case studies to accelerate DNN training with popular frameworks like TensorFlow, PyTorch, MXNet and Caffe on modern HPC systems will be presented."
Watch the video: https://youtu.be/LeUNoKZVuwQ
Learn more: http://web.cse.ohio-state.edu/~panda.2/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ... | inside-BigData.com
In this deck from the Stanford HPC Conference, Nick Nystrom and Paola Buitrago provide an update from the Pittsburgh Supercomputing Center.
Nick Nystrom is Chief Scientist at the Pittsburgh Supercomputing Center (PSC). Nick is architect and PI for Bridges, PSC's flagship system that successfully pioneered the convergence of HPC, AI, and Big Data. He is also PI for the NIH Human Biomolecular Atlas Program’s HIVE Infrastructure Component and co-PI for projects that bring emerging AI technologies to research (Open Compass), apply machine learning to biomedical data for breast and lung cancer (Big Data for Better Health), and identify causal relationships in biomedical big data (the Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence). His current research interests include hardware and software architecture, applications of machine learning to multimodal data (particularly for the life sciences) and to enhance simulation, and graph analytics.
Watch the video: https://youtu.be/LWEU1L1o7yY
Learn more: https://www.psc.edu/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Stanford HPC Conference, Ryan Quick from Providentia Worldwide describes how DNNs can be used to improve EDA simulation runs.
"Systems Intelligence relies on a variety of methods for providing insight into the core mechanisms for driving automated behavioral changes in self-healing command and control platforms. This talk reports on initial efforts with leveraging Semiconductor Electronic Design Automation (EDA) telemetry data from cross-domain sources including power, network, storage, nodes, and applications in neural networks as a driving method for insight into SI automation systems."
Watch the video: https://youtu.be/2WbR8tq-XbM
Learn more: http://www.providentiaworldwide.com/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
In this deck from the Stanford HPC Conference, Nicole Xu from Stanford University describes how she transformed a common jellyfish into a bionic creature that is part animal and part machine.
"Animal locomotion and bioinspiration have the potential to expand the performance capabilities of robots, but current implementations are limited. Mechanical soft robots leverage engineered materials and are highly controllable, but these biomimetic robots consume more power than corresponding animal counterparts. Biological soft robots from a bottom-up approach offer advantages such as speed and controllability but are limited to survival in cell media. Instead, biohybrid robots that comprise live animals and self-contained microelectronic systems leverage the animals’ own metabolism to reduce power constraints and body as a natural scaffold with damage tolerance. We demonstrate that by integrating onboard microelectronics into live jellyfish, we can enhance propulsion up to threefold, using only 10 mW of external power input to the microelectronics and at only a twofold increase in cost of transport to the animal. This robotic system uses 10 to 1000 times less external power per mass than existing swimming robots in literature and can be used in future applications for ocean monitoring to track environmental changes."
Watch the video: https://youtu.be/HrmJFyvInj8
Learn more: https://sanfrancisco.cbslocal.com/2020/02/05/stanford-research-project-common-jellyfish-bionic-sea-creatures/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Stanford HPC Conference, Peter Dueben from the European Centre for Medium-Range Weather Forecasts (ECMWF) presents: Machine Learning for Weather Forecasts.
"I will present recent studies that use deep learning to learn the equations of motion of the atmosphere, to emulate model components of weather forecast models and to enhance usability of weather forecasts. I will then talk about the main challenges for the application of deep learning in cutting-edge weather forecasts and suggest approaches to improve usability in the future."
Peter is contributing to the development and optimization of weather and climate models for modern supercomputers. He is focusing on a better understanding of model error and model uncertainty, on the use of reduced numerical precision that is optimised for a given level of model error, on global cloud-resolving simulations with ECMWF's forecast model, and the use of machine learning, and in particular deep learning, to improve the workflow and predictions. Peter graduated in Physics and wrote his PhD thesis at the Max Planck Institute for Meteorology in Germany. He worked as a Postdoc with Tim Palmer at the University of Oxford and took up a position as University Research Fellow of the Royal Society at the European Centre for Medium-Range Weather Forecasts (ECMWF) in 2017.
Watch the video: https://youtu.be/ks3fkRj8Iqc
Learn more: https://www.ecmwf.int/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck, Gilad Shainer from the HPC AI Advisory Council describes how this organization fosters innovation in the high performance computing community.
"The HPC-AI Advisory Council’s mission is to bridge the gap between high-performance computing (HPC) and Artificial Intelligence (AI) use and its potential, bring the beneficial capabilities of HPC and AI to new users for better research, education, innovation and product manufacturing, bring users the expertise needed to operate HPC and AI systems, provide application designers with the tools needed to enable parallel computing, and to strengthen the qualification and integration of HPC and AI system products."
Watch the video: https://wp.me/p3RLHQ-lNz
Learn more: http://hpcadvisorycouncil.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Today RIKEN in Japan announced that the Fugaku supercomputer will be made available for research projects aimed at combating COVID-19.
"Fugaku is currently being installed and is scheduled to be available to the public in 2021. However, faced with the devastating disaster unfolding before our eyes, RIKEN and MEXT decided to make a portion of the computational resources of Fugaku available for COVID-19-related projects ahead of schedule while continuing the installation process.
Fugaku is being developed not only for the progress in science, but also to help build the society dubbed as the “Society 5.0” by the Japanese government, where all people will live safe and comfortable lives. The current initiative to fight against the novel coronavirus is driven by the philosophy behind the development of Fugaku."
Initial Projects
Exploring new drug candidates for COVID-19 by "Fugaku"
Yasushi Okuno, RIKEN / Kyoto University
Prediction of conformational dynamics of proteins on the surface of SARS-CoV-2 using Fugaku
Yuji Sugita, RIKEN
Simulation analysis of pandemic phenomena
Nobuyasu Ito, RIKEN
Fragment molecular orbital calculations for COVID-19 proteins
Yuji Mochizuki, Rikkyo University
In this deck from the Performance Optimisation and Productivity group, Lubomir Riha from IT4Innovations presents: Energy Efficient Computing using Dynamic Tuning.
"We now live in a world of power-constrained architectures and systems and power consumption represents a significant cost factor in the overall HPC system economy. For these reasons, in recent years researchers, supercomputing centers and major vendors have developed new tools and methodologies to measure and optimize the energy consumption of large-scale high performance system installations. Due to the link between energy consumption, power consumption and execution time of an application executed by the final user, it is important for these tools and the methodology used to consider all these aspects, empowering the final user and the system administrator with the capability of finding the best configuration given different high level objectives.
This webinar focused on tools designed to improve the energy-efficiency of HPC applications using a methodology of dynamic tuning of HPC applications, developed under the H2020 READEX project. The READEX methodology has been designed for exploiting the dynamic behaviour of software. At design time, different runtime situations (RTS) are detected and optimized system configurations are determined. RTSs with the same configuration are grouped into scenarios, forming the tuning model. At runtime, the tuning model is used to switch system configurations dynamically.
The MERIC tool, which implements the READEX methodology, is presented. It supports manual or binary instrumentation of the analysed applications to simplify the analysis. This instrumentation is used to identify and annotate the significant regions in the HPC application. Automatic binary instrumentation annotates regions with significant runtime. Manual instrumentation, which can be combined with automatic, allows the code developer to annotate regions of particular interest."
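The READEX workflow described above (find the best configuration per runtime situation at design time, group situations with the same configuration into scenarios, then switch configurations at runtime) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the region names, the configuration tuples (core frequency in GHz, thread count), and the functions `build_tuning_model` and `run` are all hypothetical and are not part of MERIC or READEX.

```python
from collections import defaultdict

# Hypothetical design-time result: runtime situation (region) -> best configuration.
# A configuration is (core frequency in GHz, number of threads); values are made up.
best_config = {
    "assemble_matrix": (2.4, 24),
    "solve": (1.8, 24),
    "exchange_halo": (1.2, 4),
    "write_checkpoint": (1.2, 4),
}


def build_tuning_model(best_config):
    """Group runtime situations sharing a configuration into scenarios."""
    scenarios = defaultdict(list)
    for rts, config in best_config.items():
        scenarios[config].append(rts)
    # Map each runtime situation to (scenario id, configuration) for runtime lookup.
    model = {}
    for scenario_id, (config, members) in enumerate(sorted(scenarios.items())):
        for rts in members:
            model[rts] = (scenario_id, config)
    return model


def run(regions, model, default=(2.4, 24)):
    """Replay a sequence of regions, recording each configuration switch."""
    current = None
    switches = []
    for region in regions:
        _, config = model.get(region, (None, default))
        if config != current:  # switch only when the scenario's configuration changes
            switches.append((region, config))
            current = config
    return switches


model = build_tuning_model(best_config)
switches = run(
    ["assemble_matrix", "solve", "exchange_halo", "write_checkpoint", "solve"],
    model,
)
for region, config in switches:
    print(f"entering {region}: switch to {config}")
```

Because `exchange_halo` and `write_checkpoint` fall into the same scenario, no switch is recorded between them; grouping by configuration is what keeps the number of runtime switches down.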
Watch the video: https://wp.me/p3RLHQ-lJP
Learn more: https://pop-coe.eu/blog/14th-pop-webinar-energy-efficient-computing-using-dynamic-tuning
and
https://code.it4i.cz/vys0053/meric
Sign up for our insideHPC Newsletter: http://insidehpc.com/newslett
In this deck from GTC Digital, William Beaudin from DDN presents: HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD.
Enabling high performance computing through the use of GPUs requires an incredible amount of IO to sustain application performance. We'll cover architectures that enable extremely scalable applications through the use of NVIDIA’s SuperPOD and DDN’s A3I systems.
The NVIDIA DGX SuperPOD is a first-of-its-kind artificial intelligence (AI) supercomputing infrastructure. DDN A³I with the EXA5 parallel file system is a turnkey, AI data storage infrastructure for rapid deployment, featuring faster performance, effortless scale, and simplified operations through deeper integration. The combined solution delivers groundbreaking performance, deploys in weeks as a fully integrated system, and is designed to solve the world's most challenging AI problems.
Watch the video: https://wp.me/p3RLHQ-lIV
Learn more: https://www.ddn.com/download/nvidia-superpod-ddn-a3i-ai400-appliance-with-the-exa5-filesystem/
and
https://www.nvidia.com/en-us/gtc/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck, Paul Isaacs from Linaro presents: State of ARM-based HPC. This talk provides an overview of applications and infrastructure services successfully ported to Aarch64 and benefiting from scale.
"With its debut on the TOP500, the 125,000-core Astra supercomputer at New Mexico’s Sandia Labs uses Cavium ThunderX2 chips to mark Arm’s entry into the petascale world. In Japan, the Fujitsu A64FX Arm-based CPU in the pending Fugaku supercomputer has been optimized to achieve high-level, real-world application performance, anticipating up to one hundred times the application execution performance of the K computer. K was the first computer to top 10 petaflops in 2011."
Watch the video: https://wp.me/p3RLHQ-lIT
Learn more: https://www.linaro.org/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
Today Xilinx announced Versal Premium, the third series in the Versal ACAP portfolio. The Versal Premium series features highly integrated, networked and power-optimized cores and the industry’s highest bandwidth and compute density on an adaptable platform. Versal Premium is designed for the highest bandwidth networks operating in thermally and spatially constrained environments, as well as for cloud providers who need scalable, adaptable application acceleration.
Versal is the industry’s first adaptive compute acceleration platform (ACAP), a revolutionary new category of heterogeneous compute devices with capabilities that far exceed those of conventional silicon architectures. Developed on TSMC’s 7-nanometer process technology, Versal Premium combines software programmability with dynamically configurable hardware acceleration and pre-engineered connectivity and security features to enable a faster time-to-market. The Versal Premium series delivers up to 3X higher throughput compared to current generation FPGAs, with built-in Ethernet, Interlaken, and cryptographic engines that enable fast and secure networks. The series doubles the compute density of currently deployed mainstream FPGAs and provides the adaptability to keep pace with increasingly diverse and evolving cloud and networking workloads.
Learn more: https://insidehpc.com/2020/03/xilinx-announces-versal-premium-acap-for-network-and-cloud-acceleration/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
In this video from the Rice Oil & Gas Conference, Chin Fang from Zettar presents: Moving Massive Amounts of Data across Any Distance Efficiently.
The objective of this talk is to present two on-going projects aiming at improving and ensuring highly efficient bulk transferring or streaming of massive amounts of data over digital connections across any distance. It examines the current state of the art, a few very common misconceptions, the differences among the three major types of data movement solutions, a current initiative attempting to improve the data movement efficiency from the ground up, and another multi-stage project that shows how to conduct long distance large scale data movement at speed and scale internationally. Both projects have real world motivations, e.g. the ambitious data transfer requirements of Linac Coherent Light Source II (LCLS-II) [1], a premier preparation project of the U.S. DOE Exascale Computing Initiative (ECI) [2]. Their immediate goals are described and explained, together with the solution used for each. Findings and early results are reported. Possible future work is outlined.
Watch the video: https://wp.me/p3RLHQ-lBX
Learn more: https://www.zettar.com/
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Rice Oil & Gas Conference, Bradley McCredie from AMD presents: Scaling TCO in a Post Moore's Law Era.
"While foundries bravely drive forward to overcome the technical and economic challenges posed by scaling to 5nm and beyond, Moore’s law alone can provide only a fraction of the performance / watt and performance / dollar gains needed to satisfy the demands of today’s high performance computing and artificial intelligence applications. To close the gap, multiple strategies are required. First, new levels of innovation and design efficiency will supplement technology gains to continue to deliver meaningful improvements in SoC performance. Second, heterogeneous compute architectures will create x-factor increases of performance efficiency for the most critical applications. Finally, open software frameworks, APIs, and toolsets will enable broad ecosystems of application level innovation."
Watch the video:
Learn more: http://amd.com
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
In this deck from the ECSS Symposium, Abe Stern from NVIDIA presents: CUDA-Python and RAPIDS for blazing fast scientific computing.
"We will introduce Numba and RAPIDS for GPU programming in Python. Numba allows us to write just-in-time compiled CUDA code in Python, giving us easy access to the power of GPUs from a powerful high-level language. RAPIDS is a suite of tools with a Python interface for machine learning and dataframe operations. Together, Numba and RAPIDS represent a potent set of tools for rapid prototyping, development, and analysis for scientific computing. We will cover the basics of each library and go over simple examples to get users started. Finally, we will briefly highlight several other relevant libraries for GPU programming."
Watch the video: https://wp.me/p3RLHQ-lvu
Learn more: https://developer.nvidia.com/rapids
and
https://www.xsede.org/for-users/ecss/ecss-symposium
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from FOSDEM 2020, Colin Sauze from Aberystwyth University describes the development of a RaspberryPi cluster for teaching an introduction to HPC.
"The motivation for this was to overcome four key problems faced by new HPC users:
* The availability of a real HPC system and the effect running training courses can have on the real system, conversely the availability of spare resources on the real system can cause problems for the training course.
* A fear of using a large and expensive HPC system for the first time and worries that doing something wrong might damage the system.
* That HPC systems are very abstract systems sitting in data centres that users never see, it is difficult for them to understand exactly what it is they are using.
* That new users fail to understand resource limitations, in part because of the vast resources in modern HPC systems a lot of mistakes can be made before running out of resources. A more resource constrained system makes it easier to understand this.
The talk will also discuss some of the technical challenges in deploying an HPC environment to a Raspberry Pi and attempts to keep that environment as close to a "real" HPC as possible. The issue of trying to automate the installation process will also be covered."
Learn more: https://github.com/colinsauze/pi_cluster
and
https://fosdem.org/2020/schedule/events/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from ATPESC 2019, Ken Raffenetti from Argonne presents an overview of HPC interconnects.
"The Argonne Training Program on Extreme-Scale Computing (ATPESC) provides intensive, two-week training on the key skills, approaches, and tools to design, implement, and execute computational science and engineering applications on current high-end computing systems and the leadership-class computing systems of the future."
Watch the video: https://wp.me/p3RLHQ-luc
Learn more: https://extremecomputingtraining.anl.gov/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Welcome to the first live UiPath Community Day Dubai! Join us for this unique occasion to meet our local and global UiPath Community and leaders. You will get a full view of the MEA region's automation landscape and the AI Powered automation technology capabilities of UiPath. Also, hosted by our local partners Marc Ellis, you will enjoy a half-day packed with industry insights and automation peers networking.
📕 Curious about our agenda? Wait no more!
10:00 Welcome note - UiPath Community in Dubai
Lovely Sinha, UiPath Community Chapter Leader, UiPath MVPx3, Hyper-automation Consultant, First Abu Dhabi Bank
10:20 A UiPath cross-region MEA overview
Ashraf El Zarka, VP and Managing Director MEA, UiPath
10:35 Customer Success Journey
Deepthi Deepak, Head of Intelligent Automation CoE, First Abu Dhabi Bank
11:15 The UiPath approach to GenAI with our three principles: improve accuracy, supercharge productivity, and automate more
Boris Krumrey, Global VP, Automation Innovation, UiPath
12:15 Discover how Marc Ellis leverages tech-driven solutions in recruitment and managed services
Brendan Lingam, Director of Sales and Business Development, Marc Ellis
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact / sustainability of software testing is discussed in the talk. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Securing your Kubernetes cluster_ a step-by-step guide to success !KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exascale Performance
1.
Satoshi Matsuoka
Director, RIKEN Center for Computational Science
& Professor, Tokyo Institute of Technology
20190925 Linaro Connect @ San Diego, USA
A64fx and Fugaku - A Game Changing, HPC / AI Optimized Arm CPU to enable Exascale Performance
2. Riken R-CCS: Leadership HPC Research Center
“Science of Computing by Computing for Computing”
3. Tokyo
Kobe
423 km (263 miles) west of Tokyo
Research Building
Computer building
Substation Supply
Chillers
K Computer Mae Station
Computer room 50 m x 60 m = 3,000 m²
Electric power up to 15 MW
Water cooling system
Gas-turbine co-generation 5 MW x 2
RIKEN R-CCS
R-CCS with K Computer
4. Specifications
Massively parallel, general purpose supercomputer
No. of nodes: 88,128
Peak speed: 11.28 Petaflops
Memory: 1.27 PB
Network: 6-dim mesh-torus (Tofu)
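As a quick sanity check on the figures above, the per-node peak speed and memory follow from dividing the system totals by the node count. A minimal sketch in Python; decimal prefixes (peta = 1e15, giga = 1e9) are an assumption, since the slide does not state which convention it uses.

```python
# Per-node figures for the K computer, derived from the system totals on the slide.
# Assumption: decimal prefixes (peta = 1e15, giga = 1e9).
NODES = 88_128
PEAK_FLOPS = 11.28e15    # 11.28 Petaflops
MEMORY_BYTES = 1.27e15   # 1.27 PB

peak_per_node_gflops = PEAK_FLOPS / NODES / 1e9
memory_per_node_gb = MEMORY_BYTES / NODES / 1e9

print(f"peak per node:   {peak_per_node_gflops:.1f} GFLOPS")
print(f"memory per node: {memory_per_node_gb:.1f} GB")
```

The result is roughly 128 GFLOPS and a little over 14 GB per node.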
Top 500 ranking
LINPACK measures the speed and efficiency of linear equation calculations.
Real applications require more complex computations.
No.1 in Jun. & Nov. 2011
No.20 in Jun 2019
HPCG ranking
Measures the speed and efficiency of solving linear equations using HPCG
Correlates better with actual applications
No. 1 in Nov. 2017, No. 3 since Jun 2018
Graph 500 ranking
“Big Data” supercomputer ranking
Measures the ability of data-intensive loads
No. 1 for 9 consecutive editions since 2015
K computer (decommissioned Aug. 16, 2019)
ACM Gordon Bell Prize
“Best HPC Application of the Year”
Winner 2011 & 2012, several finalists
First supercomputer in the world to retire as #1 in major rankings (Graph 500)
6. Broad Base --- Applicability & Capacity
Broad Applications: Simulation, Data Science, AI, …
Broad User Base: Academia, Industry, Cloud Startups, …
High Peak --- Acceleration of Large Scale Application (Capability)
Mt. Fuji representing
the ideal of supercomputing
The Next-Gen “Fugaku” Supercomputer
7. Arm64fx & Fugaku ( 富岳 ) are:
Fujitsu-Riken design A64fx ARM v8.2 (SVE), 48/52 core CPU
HPC Optimized: Extremely high package high memory BW (1TByte/s), on-die Tofu-D network BW (~400Gbps), high SVE FLOPS (~3Teraflops), various AI support (FP16, INT8, etc.)
Gen purpose CPU – Linux, Windows (Word), other SCs/Clouds
Extremely power efficient – > 10x power/perf efficiency for CFD benchmark over current mainstream x86 CPU
Largest and fastest supercomputer to be ever built circa 2020
> 150,000 nodes, superseding LLNL Sequoia
> 150 PetaByte/s memory BW
Tofu-D 6D Torus NW, 60 Petabps injection BW (10x global IDC traffic)
25~30PB NVMe L1 storage
~10,000 endpoint 100Gbps I/O network into Lustre
The first ‘exascale’ machine (not exa64bitflops but in apps perf.)
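The system-level bandwidth figures on this slide are consistent with the per-node figures multiplied by the node count. A minimal sketch in Python of that arithmetic, assuming exactly 150,000 nodes (the slide says "> 150,000") and decimal prefixes; the variable names are illustrative only.

```python
# Aggregate Fugaku figures from the per-node numbers quoted on the slide.
# Assumptions: exactly 150,000 nodes and decimal prefixes throughout.
NODES = 150_000
MEM_BW_PER_NODE = 1e12      # 1 TByte/s memory bandwidth per CPU
TOFU_BW_PER_NODE = 400e9    # ~400 Gbps Tofu-D network bandwidth per CPU
CORES_PER_NODE = (48, 52)   # "48/52 core CPU" as stated on the slide

total_mem_bw_pb_s = NODES * MEM_BW_PER_NODE / 1e15
total_injection_pbps = NODES * TOFU_BW_PER_NODE / 1e15
total_cores = tuple(NODES * c for c in CORES_PER_NODE)

print(f"memory bandwidth:    {total_mem_bw_pb_s:.0f} PB/s")
print(f"injection bandwidth: {total_injection_pbps:.0f} Pbps")
print(f"cores:               {total_cores[0]:,} to {total_cores[1]:,}")
```

With these inputs the totals come out at 150 PB/s and 60 Pbps, matching the "> 150 PetaByte/s" and "60 Petabps" lines above.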
9. Co-Design Activities in Fugaku
Extremely tight collaborations between the Co-Design apps centers, Riken, and Fujitsu, etc.
Chose 9 representative apps as “target application” scenario
Achieve up to x100 speedup c.f. K-Computer
Also ease-of-programming, broad SW ecosystem, very low power, …
Select representatives from 100s of applications signifying various computational characteristics
Design systems with parameters that consider various application characteristics
Science by Computing
Science of Computing
A64fx for the Post-K supercomputer
10. Research Subjects of the Post-K Computer
The post K computer will expand the fields pioneered by the K computer, and also challenge new areas.
11. Protein simulation before K
Simulation of a protein in isolation
Folding simulation of Villin, a small protein
with 36 amino acids
Protein simulation with K
[Figure labels: DNA, GroEL, ribosome, proteins, tRNA, ATP, water, ions, metabolites; scale bars 400 nm and 100 nm]
Genesis MD: proteins in a cell environment
all atom simulation of a cell interior
cytoplasm of Mycoplasma genitalium
12. NICAM: Global Climate Simulation
Global cloud resolving model with 0.87 km-mesh which allows
resolution of cumulus clouds
Month-long forecasts of Madden-Julian oscillations in the tropics are realized.
Global cloud
resolving model
Miyamoto et al. (2013), Geophys. Res. Lett., 40, 4922–4926, doi:10.1002/grl.50944.
13.
Heart Simulator
Heartbeat, blood ejection, coronary
circulation are simulated consistently.
Applications explored
congenital heart diseases
Screening for drug-induced irregular
heartbeat risk
Multi-scale simulator of heart starting from molecules and building up cells, tissues, and heart
electrocardiogram
(ECG)
ultrasonic waves
UT-Heart, Inc., Fujitsu Limited
14. “Big Data Assimilation” NICAM+LETKF
Mutual feedback
High-precision Simulations
High-precision observations
Future-generation technologies available 10 years in advance
15. Fugaku: The Game Changer
1. Heritage of the K-Computer, HP in simulation via extensive Co
• High performance: up to x100 performance of K in real applications
• Retain BYTES/FLOP of K (0.4~0.5) for real application performance
• Simultaneous high performance and ease-of-programming
ARM: Massive ecosystem from embedded to HPC
Global leadership not just in the machine & apps, but as cutting edge IT
ogy not just limited to Fugaku, but into societal IT infrastructures e.g. C
25. “Fugaku” Chronology
(Disclaimer: below includes speculative schedules and subject to change)
May 2018 A0 Chip came out, almost bug free
1Q2019 B0 Chip on hand, bug free, exceeded perf. target
Mar 2019 “Fugaku” manufacturing budget approval by the Diet, actual manufacturing contract signed (now w/Society 5.0 AI mission also)
Aug 2019 End of K-Computer operations
4Q2019 “Fugaku” installation starts
1H2020 “Fugaku” preproduction operation starts
1~2Q2021 “Fugaku” production operation starts (hopefully)
And of course we move on…
26. Overview of Fugaku System & Storage
3-level hierarchical storage
1st Layer: GFS Cache + Temp FS (25~30 PB NVMe)
2nd Layer: Lustre-based GFS (a few hundred PB HDD)
3rd Layer: Off-site Cloud Storage
Full Machine Spec
>150,000 nodes
~8 million High Perf. Arm v8.2 Cores
> 150PB/s memory BW
Tofu-D 10x Global IDC traffic @ 60Pbps
~10,000 I/O fabric endpoints
> 400 racks
~40 MegaWatts Machine+IDC
PUE ~ 1.1 High Pressure DLC
NRE pays off: ~= 15~30 million state-of-the-art competing CPU cores for HPC workloads (both dense and sparse problems)
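The aggregate figures in the full machine spec can be broken down per node with simple division. The sketch below is only back-of-the-envelope arithmetic: it treats the slide's rounded values (">150,000", "~8 million", "> 150PB/s", "~40 MegaWatts") as exact inputs, so the results are rough estimates rather than specifications. The variable names are illustrative.

```python
# Aggregate figures quoted on the slide, taken at face value (assumption:
# the rounded lower bounds are used as if they were exact).
nodes = 150_000            # ">150,000 nodes"
cores_total = 8_000_000    # "~8 million" Arm v8.2 cores
mem_bw_total_pb_s = 150    # "> 150PB/s memory BW"
power_mw = 40              # "~40 MegaWatts Machine+IDC"

# Per-node estimates.
cores_per_node = cores_total / nodes
mem_bw_per_node_tb_s = mem_bw_total_pb_s * 1000 / nodes   # 1 PB = 1000 TB
# The power figure includes the datacenter (IDC), so this is an upper
# bound on what a single node draws.
power_per_node_w = power_mw * 1_000_000 / nodes

print(f"cores per node        ~ {cores_per_node:.1f}")
print(f"memory BW per node    ~ {mem_bw_per_node_tb_s:.2f} TB/s")
print(f"power per node (max)  ~ {power_per_node_w:.0f} W")
```

The memory bandwidth comes out at about 1 TB/s per node, which agrees with the "nearly a Terabyte/s" per-CPU HBM2 figure quoted in the talk abstract.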
Fugaku Performance Estimate on 9 Co-Design Target Apps
As of 2019/05/14
Category | Priority Issue Area | Performance Speedup over K | Application | Brief description
Health and longevity | 1. Innovative computing infrastructure for drug discovery | 125x+ | GENESIS | MD for proteins
Health and longevity | 2. Personalized and preventive medicine using big data | 8x+ | Genomon | Genome processing (Genome alignment)
Disaster prevention and Environment | 3. Integrated simulation systems induced by earthquake and tsunami | 45x+ | GAMERA | Earthquake simulator (FEM in unstructured & structured grid)
Disaster prevention and Environment | 4. Meteorological and global environmental prediction using big data | 120x+ | NICAM+LETKF | Weather prediction system using Big data (structured grid stencil & ensemble Kalman filter)
Energy issue | 5. New technologies for energy creation, conversion / storage, and use | 40x+ | NTChem | Molecular electronic simulation (structure calculation)
Energy issue | 6. Accelerated development of innovative clean energy systems | 35x+ | Adventure | Computational Mechanics System for Large Scale Analysis and Design (unstructured grid)
Industrial competitiveness enhancement | 7. Creation of new functional devices and high-performance materials | 30x+ | RSDFT | Ab-initio simulation (density functional theory)
Industrial competitiveness enhancement | 8. Development of innovative design and production processes | 25x+ | FFB | Large Eddy Simulation (unstructured grid)
Basic science | 9. Elucidation of the fundamental laws and evolution of the universe | 25x+ | LQCD | Lattice QCD simulation (structured grid Monte Carlo)
Performance target goal
Peak performance to be achieved: 100 times faster than K for some applications (tuning included)
30 to 40 MW power consumption
Geometric Mean of Performance Speedup of the 9 Target Applications over the K-Computer: > 37x+
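The quoted geometric mean can be checked directly from the nine per-application speedups listed for the target apps. The sketch below is a small verification, assuming each "Nx+" entry is taken at its stated lower bound; the dictionary and function names are illustrative.

```python
import math

# Lower-bound speedups over the K-Computer, as listed for the 9 target apps.
speedups = {
    "GENESIS": 125,
    "Genomon": 8,
    "GAMERA": 45,
    "NICAM+LETKF": 120,
    "NTChem": 40,
    "Adventure": 35,
    "RSDFT": 30,
    "FFB": 25,
    "LQCD": 25,
}

def geometric_mean(values):
    """n-th root of the product, computed via logarithms for stability."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))

gm = geometric_mean(speedups.values())
am = sum(speedups.values()) / len(speedups)

print(f"geometric mean  = {gm:.1f}x")
print(f"arithmetic mean = {am:.1f}x")
```

The geometric mean lands between 37x and 38x, consistent with the "> 37x+" on the slide. It is lower than the arithmetic mean because it is less dominated by the two largest entries (125x and 120x), which is why it is the usual choice for averaging speedup ratios.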
29. Fugaku Programming Environment
Programming Languages and Compilers provided by Fujitsu
Fortran2008 & Fortran2018 subset
C11 & GNU and Clang extensions
C++14 & C++17 subset and GNU and Clang extensions
OpenMP 4.5 & OpenMP 5.0 subset
Java
Parallel Programming Language & Domain Specific Library provided by RIKEN
XcalableMP
FDPS (Framework for Developing Particle Simulator)
Process/Thread Library provided by RIKEN
PiP (Process in Process)
Script Languages provided by Linux distributor
E.g., Python+NumPy, SciPy
Communication Libraries
MPI 3.1 & MPI 4.0 subset
Open MPI base (Fujitsu), MPICH (RIKEN)
Low-level Communication Libraries
uTofu (Fujitsu), LLC (RIKEN)
File I/O Libraries provided by RIKEN
Lustre
pnetCDF, DTF, FTAR
Math Libraries
BLAS, LAPACK, ScaLAPACK, SSL II (Fujitsu)
EigenEXA, Batched BLAS (RIKEN)
Programming Tools provided by Fujitsu
Profiler, Debugger, GUI
NEW: Containers (Singularity) and other Cloud APIs
NEW: AI software stacks (w/ARM)
NEW: DoE Spack Package Manager
GCC and LLVM will also be available
30. OSS Application Porting @ Arm HPC Users Group
http://arm-hpc.gitlab.io/
Application | Lang. | GCC | LLVM | Arm | Fujitsu
LAMMPS | C++ | Modified | Modified | Modified | Modified
GROMACS | C | Modified | Modified | Modified | Modified
GAMESS* | Fortran | Modified | Modified | Modified | Modified
OpenFOAM | C++ | Modified | Modified | Modified | Modified
NAMD | C++ | Modified | Modified | Modified | Modified
WRF | Fortran | Modified | Modified | Modified | Modified
Quantum ESPRESSO | Fortran | OK as is | OK as is | OK as is | Modified
NWChem | Fortran | OK as is | Modified | Modified | Modified
ABINIT | Fortran | Modified | Modified | Modified | Modified
CP2K | Fortran | OK as is | Issues found | Issues found | Modified
NEST* | C++ | OK as is | Modified | Modified | Modified
BLAST* | C++ | OK as is | Modified | Modified | Modified
31. Fugaku Cloud Strategy
Industry use of Fugaku via intermediary cloud SaaS vendors, Fugaku as IaaS
A64fx and other Fugaku Technology being incorporated into the Cloud
[Diagram labels: Industry Users 1-3; HPC SaaS Providers 1-3; Various Cloud Service API for HPC; KVM/Singularity, Kubernetes; Other IaaS; Commercial Cloud; Extreme Performance Advantage; Cloud Vendors 1-3]
Cloud Workload Becoming HPC (including AI)
↓
Significant Performance Advantage
↓
Millions of Units shipped to Cloud
33. Pursuing Convergence of HPC & AI (1)
Acceleration of Simulation (first principles methods) with AI (empirical method): AI for HPC
Interpolation & Extrapolation of long trajectory MD
Reducing parameter space on Pareto optimization of results
Adjusting convergence parameters for iterative methods etc.
AI replacing simulation when exact physical models are unclear, or excessively costly to compute
Acceleration of AI with HPC: HPC for AI
HPC Processing of training data (data cleansing)
Acceleration of (Parallel) Training: Deeper networks, bigger training sets, complicated networks, high dimensional data…
Acceleration of Inference: above + real time streaming data
Various modern training algorithms: Reinforcement learning, GAN, Dilated Convolution, etc.
34. Deep Learning Meets HPC
6 orders of magnitude compute increase in 5 years
[Slide Courtesy Rick Stevens @ ANL]
Exascale Needs for Deep Learning
• Automated Model Discovery
• Hyper Parameter Optimization
• Uncertainty Quantification
• Flexible Ensembles
• Cross-Study Model Transfer
• Data Augmentation
• Synthetic Data Generation
• Reinforcement Learning
[Chart axis label: Exaop/s-day]
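To put "6 orders of magnitude compute increase in 5 years" in more familiar terms, the growth can be converted into a doubling time and a yearly growth factor. The sketch below only does that conversion under the assumption of steady exponential growth; it uses the two numbers from the slide and nothing else.

```python
import math

# Figures from the slide: a 10^6-fold increase over 5 years.
orders_of_magnitude = 6
years = 5

growth = 10 ** orders_of_magnitude

# Assuming steady exponential growth over the whole period:
doublings = math.log2(growth)                 # how many times compute doubled
doubling_time_months = years * 12 / doublings
annual_factor = growth ** (1 / years)         # growth multiplier per year

print(f"doublings           : {doublings:.1f}")
print(f"doubling time       : {doubling_time_months:.1f} months")
print(f"growth per year     : {annual_factor:.1f}x")
```

Under that assumption, compute used for deep learning doubles roughly every three months, far faster than the roughly two-year doubling usually associated with Moore's law.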
35. Large Scale simulation and AI coming together
[Ichimura et al., Univ. of Tokyo, IEEE/ACM SC17 Best Poster, 2018 Gordon Bell Finalist]
130 billion degrees-of-freedom earthquake simulation of entire Tokyo on K-Computer (2018 ACM Gordon Bell Prize Finalist, SC16, SC17 Best Poster)
[Diagram labels: Earthquake; Soft Soil <100m; Candidate Underground Structure 1; Candidate Underground Structure 2; Too Many Instances]
AI Trained by Simulation to generate candidate soft soil structure
36. 4 Layers of Parallelism in DNN Training
• Hyper Parameter Search
  • Searching optimal network configs & parameters
  • Parallel search, massive parallelism required
• Data Parallelism
  • Copy the network to compute nodes, feed different batch data, average => network reduction bound
  • TOFU: Extremely strong reduction, x6 EDR Infiniband
• Model Parallelism (domain decomposition)
  • Split and parallelize the layer calculations in propagation
  • Low latency required (bad for GPU) -> strong latency tolerant cores + low latency TOFU network
• Intra-Chip ILP, Vector and other low level Parallelism
  • Parallelize the convolution operations etc.
  • SVE FP16+INT8 vectorization support + extremely high memory bandwidth w/HBM2
[Diagram labels: Intra-Node; Inter-Node]
Massive amount of total parallelism, only possible
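The lowest layer above relies on low-precision arithmetic (FP16 and INT8) to get more work out of each vector instruction and each byte of memory traffic. The sketch below illustrates the general idea with a symmetric INT8 quantized dot product in plain Python. It is a generic textbook scheme chosen for illustration, not the A64fx or SVE implementation; `quantize_int8` and `int8_dot` are hypothetical helper names.

```python
def quantize_int8(values):
    """Symmetric per-vector quantization: map floats to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def int8_dot(a, b):
    """Dot product computed on INT8 codes, rescaled to a float at the end."""
    qa, scale_a = quantize_int8(a)
    qb, scale_b = quantize_int8(b)
    # Products of 8-bit codes are summed in a wider integer accumulator,
    # as hardware dot-product instructions typically do.
    acc = sum(x * y for x, y in zip(qa, qb))
    return acc * scale_a * scale_b

a = [0.12, -0.5, 0.33, 0.9, -0.07, 0.41]
b = [1.5, 0.25, -0.75, 0.6, 2.0, -1.1]

exact = sum(x * y for x, y in zip(a, b))
approx = int8_dot(a, b)
print(f"float result: {exact:.4f}")
print(f"int8 result : {approx:.4f}")
print(f"abs error   : {abs(exact - approx):.4f}")
```

Each INT8 value occupies a quarter of the space of a 32-bit float, so a fixed-width vector register or a fixed memory bandwidth moves four times as many operands, at the cost of a small quantization error like the one printed here.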
37. Massive Scale Deep Learning on Post-K
Post-K Processor
◆ High perf FP16 & Int8
◆ High mem BW for convolution
◆ Built-in scalable Tofu network
Unprecedented DL scalability
High Performance DNN Convolution
High Performance and Ultra-Scalable Network for massive scaling model & data parallelism
[Diagram labels: CPU for the Post-K supercomputer (x4); TOFU Network w/ high injection BW for fast reduction]
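The "fast reduction" role of the network comes from data-parallel training: every node holds a copy of the model, computes a gradient on its own shard of the batch, and the gradients are then averaged across all nodes before each update. The sketch below simulates that pattern in a single Python process on a toy linear model. `allreduce_mean` is a hypothetical stand-in for a network allreduce (for example `MPI_Allreduce` followed by a division); it says nothing about how Tofu implements the operation.

```python
def local_gradient(w, shard):
    """Gradient of mean squared error for y = w[0]*x + w[1] on one data shard."""
    g0 = g1 = 0.0
    for x, y in shard:
        err = w[0] * x + w[1] - y
        g0 += 2 * err * x
        g1 += 2 * err
    n = len(shard)
    return [g0 / n, g1 / n]

def allreduce_mean(vectors):
    """Stand-in for a network allreduce: every worker receives the element-wise mean."""
    n = len(vectors)
    mean = [sum(column) / n for column in zip(*vectors)]
    return [list(mean) for _ in vectors]

# Toy data generated from y = 3x + 1, split into equal shards across 4 workers.
data = [(float(x), 3.0 * x + 1.0) for x in range(8)]
num_workers = 4
shards = [data[i::num_workers] for i in range(num_workers)]

w = [0.0, 0.0]          # every worker holds an identical copy of the model
learning_rate = 0.01
for step in range(3000):
    grads = [local_gradient(w, shard) for shard in shards]   # done in parallel on a real machine
    reduced = allreduce_mean(grads)                          # the communication step
    w = [wi - learning_rate * gi for wi, gi in zip(w, reduced[0])]

print(f"learned slope = {w[0]:.4f}, intercept = {w[1]:.4f}")
```

With equal-sized shards, the averaged gradient is identical to the gradient over the full batch, so the workers stay in lockstep. The reduction happens once per step and involves every node, which is why the slides describe data parallelism as "network reduction bound".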