In this deck from the RISC-V Workshop in Barcelona, Mateo Valero, Director of the Barcelona Supercomputing Center, explains how the RISC-V architecture can play a leading role in new supercomputer architectures.
"RISC-V is an open, free ISA enabling a new era of processor innovation through open standard collaboration. Born in academia and research, RISC-V ISA delivers a new level of free, extensible software and hardware freedom on architecture, paving the way for the next 50 years of computing design and innovation."
Watch the video interview:
Learn more: https://tmt.knect365.com/risc-v-workshop-barcelona/
The European Commission recently announced the creation of the European Processor Initiative (EPI), a European consortium to co-design, develop and bring to market a European low-power microprocessor. EPI will start in 2018 and will develop the first European High Performance Computing (HPC) Systems on Chip (SoC) and accelerators. Both elements will be implemented and validated in a prototype system that will become the basis for a full Exascale machine based on European technology.
2017 Atlanta Regional User Seminar - Residential Battery Storage Systems. Des... - OPAL-RT TECHNOLOGIES
Sonnen is a leading manufacturer of residential battery storage systems in Europe and the US. They use OPAL-RT hardware-in-the-loop systems to test the dynamic operation of bi-directional inverters, optimize battery charging and discharging algorithms using real weather and demand data, validate and test new software releases, and develop algorithms to monitor battery health by measuring impedance. The OPAL-RT systems allow accelerated testing without external hardware.
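The impedance-based health monitoring mentioned above can be sketched in a few lines. This is a minimal illustration, not Sonnen's or OPAL-RT's actual algorithm; the voltages, currents, and threshold are synthetic values chosen for the example.

```python
# Minimal sketch of impedance-based battery health estimation.
# Not Sonnen's or OPAL-RT's actual algorithm; all values are synthetic.

def estimate_impedance(v_before, v_after, i_before, i_after):
    """DC internal resistance from a current step: Z = dV / dI (ohms)."""
    di = i_after - i_before
    if di == 0:
        raise ValueError("a current step is required")
    return abs((v_after - v_before) / di)

def health_flag(z_measured, z_nominal, threshold=1.5):
    """Flag a cell whose impedance has grown beyond `threshold` x nominal."""
    return "degraded" if z_measured > threshold * z_nominal else "healthy"

# Synthetic example: a 3.7 V cell with 50 mOhm nominal impedance
# sags to 3.60 V when the load steps from 0 A to 2 A.
z = estimate_impedance(v_before=3.70, v_after=3.60, i_before=0.0, i_after=2.0)
print(z)                                # 0.05 (ohms)
print(health_flag(z, z_nominal=0.05))   # healthy
```

In practice the same ratio would be tracked over charge/discharge cycles, since a rising internal impedance is a common indicator of cell aging.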
The new MareNostrum supercomputer and the future European processor - AMETIC
Presentation by Mateo Valero, of BSC-CNS, at the 33rd Encuentro de la Economía Digital y las Telecomunicaciones, organized by AMETIC and Santander Empresas in collaboration with UIMP.
OPAL-RT RT14: Power Hardware-In-the-Loop (PHIL) with EtherCAT Protocol - OPAL-RT TECHNOLOGIES
This document summarizes presentations at the 7th International Conference on Real-Time Simulation Technologies, held in Montreal from June 9-12, 2014. It includes presentations from OPAL-RT and TRIPHASE on their EtherCAT amplifier solution for power hardware-in-the-loop simulations using OPAL-RT's OP5600 product, a hands-on demonstration of power hardware-in-the-loop with EtherCAT networking, and a discussion of future partnerships between OPAL-RT and TRIPHASE.
OPAL-RT RT14 Conference: Power System Monitoring and Operator Training - OPAL-RT TECHNOLOGIES
The document summarizes a presentation given at the 7th International Conference on Real-Time Simulation Technologies in Montreal from June 9-12, 2014. The presentation was given by Frank Carrera and Vahid Jalili-Marandi from Electric Power Group and discussed their synchrophasor solutions, ePHASORsim and RTDMS. ePHASORsim is a real-time transient stability simulator, and RTDMS is a synchrophasor-based software system for real-time monitoring, visualization, and analysis of power systems. The presentation demonstrated how ePHASORsim can be used to simulate phasor data, which is then sent to RTDMS for real-time visualization and analysis during operator training.
Preliminary Test Results: High Performance Optically Pumped Cesium Beam Clock - ADVA
Patrick Berthoud’s presentation, delivered at WSTS 2016 in San Jose, reveals design specifications and the results of initial testing of Oscilloquartz's new high-performance optically pumped cesium beam clock.
Polar Use Case - ExtremeEarth Open Workshop - ExtremeEarth
This document provides an overview of an ExtremeEarth project that aims to apply deep learning techniques to classify sea ice in polar regions using satellite imagery. The project has received funding from the European Union. It discusses challenges in classifying sea ice from SAR imagery compared to optical imagery. It outlines user requirements for sea ice products, including high resolution (300m or better) and frequent updates (near real-time). The document describes workflows using the Polar Thematic Exploitation Platform (Polar TEP) for large-scale sea ice mapping using Copernicus satellite data and machine learning algorithms. It also discusses exploitation of results, including the impact of Polar TEP and efforts to facilitate the polar machine learning community.
1. The document describes the hardware implementation of a QPSK modulator for satellite communication.
2. It discusses the design and simulation of the QPSK modulator in Matlab, and the implementation of the modulator on a Virtex-4 FPGA using VHDL.
3. The results show that the QPSK modulator was successfully implemented on the FPGA board and matched the simulated waveform and spectrum.
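The bit-to-symbol mapping at the heart of the QPSK modulator described above can be sketched in software. This is a baseband Python illustration of the general technique, not the paper's Matlab model or VHDL design; a hardware implementation would realize the same Gray-coded mapping with a lookup table and an NCO for the carrier.

```python
# Minimal baseband QPSK modulator sketch (Gray-coded constellation).
# Illustrates the general bit-to-symbol mapping, not the paper's exact design.
import math

# Each 2-bit pair maps to a unit-energy complex symbol; adjacent
# constellation points differ by one bit (Gray coding).
QPSK_MAP = {
    (0, 0): complex( 1,  1) / math.sqrt(2),
    (0, 1): complex(-1,  1) / math.sqrt(2),
    (1, 1): complex(-1, -1) / math.sqrt(2),
    (1, 0): complex( 1, -1) / math.sqrt(2),
}

def qpsk_modulate(bits):
    """Map an even-length bit sequence to a list of QPSK symbols."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    return [QPSK_MAP[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

symbols = qpsk_modulate([0, 0, 1, 1, 0, 1])
print(symbols)  # three unit-magnitude complex symbols
```

On an FPGA, the real and imaginary parts of each symbol would drive the I and Q branches, each multiplied by the cosine and sine carrier before summing.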
The Barcelona Supercomputing Center (BSC) was established in 2005 and hosts MareNostrum, one of the most powerful supercomputers in Spain. We are the pioneering supercomputing center in Spain. Our specialty is high performance computing (HPC), and our mission is twofold: to offer supercomputing infrastructure and services to Spanish and European scientists, and to generate knowledge and technology and transfer them to society. We are a Severo Ochoa Center of Excellence, a first-level member of the European research infrastructure PRACE (Partnership for Advanced Computing in Europe), and we manage the Spanish Supercomputing Network (RES). As a research center, we have more than 456 experts from 45 countries, organized in four major research areas: Computer Sciences, Life Sciences, Earth Sciences, and computational applications in science and engineering.
The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be on to something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.
The TOP500 list is compiled by Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; Jack Dongarra of the University of Tennessee, Knoxville; and Martin Meuer of Prometeus, Germany.
Is it possible to build the Airbus of Supercomputing in Europe? - AMETIC
Presentation by Mateo Valero, Director of the Barcelona Supercomputing Center, at the 30th edition of the Encuentros de Telecomunicaciones y Economía Digital.
In this deck from the HPC User Forum in Detroit, Bob Sorensen from Hyperion Research presents: Exascale Update. As a research firm, Hyperion is tracking the development of Exascale supercomputers worldwide.
"The four geographies actively developing Exascale machines are: USA, China, Europe, and Japan. While it is important to emphasize that this is not a race, the first machine to achieve Exascale in terms of sustained LINPACK should be the A21 Aurora system at Argonne in 2021. It will be followed soon after by machines from all the other active projects."
Watch the video: https://wp.me/p3RLHQ-j1U
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses the emergence of computation for interdisciplinary large data analysis. It notes that exponential increases in computational power and data are driving changes in science and engineering. Computational modeling is becoming a third pillar of science alongside theory and experimentation. However, continued increases in clock speeds are no longer feasible due to power constraints, necessitating the use of multi-core processors and parallelism. This is driving changes in software design to expose parallelism.
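The shift the summary describes, from rising clock speeds to software that exposes parallelism, can be shown in miniature with a data-parallel map spread across cores using only the standard library. The `work` function is a hypothetical stand-in for a compute kernel.

```python
# Exposing data parallelism across cores with the standard library.
# `work` is a hypothetical stand-in for a real compute kernel.
from concurrent.futures import ProcessPoolExecutor

def work(x):
    """A stand-in compute kernel."""
    return x * x

if __name__ == "__main__":
    # One worker process per core by default; map preserves input order.
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(work, range(8)))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The design change the document points at is exactly this: the loop body must be an independent, side-effect-free unit of work before any runtime can spread it across cores.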
Check out the latest in OpenACC this month including the PGI 18.1 release, GTC 2018 activity, paper highlights, upcoming events and a call for paper submissions.
Stay up-to-date with the OpenACC Monthly Highlights. July's edition covers the OpenACC Summit 2021, upcoming GPU Hackathons and Bootcamps, a PEARC21 panel review, recent research, new resources and more!
Fugaku is designed to be the centerpiece of Japan's Society 5.0 vision. It is being constructed to be the world's first exascale supercomputer, with a target speed of over 100 times faster than the previous K supercomputer for some applications. Fugaku will have over 150,000 nodes, more than 150 petabytes per second of aggregate memory bandwidth, and a peak performance of over 400 petaflops for double precision calculations. It aims to accelerate high performance computing, big data, and AI workloads for important societal domains like healthcare, disaster prevention, energy, and manufacturing.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights covers the first remote GPU Hackathons, a complete schedule of upcoming events, using OpenACC for a biophysics problem, NVIDIA HPC SDK, GCC 10, new resources and more!
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AI - inside-BigData.com
Satoshi Matsuoka from RIKEN gave this talk at the HPC User Forum in Santa Fe.
"With the rapid rise of Big Data and AI as a new breed of high-performance workloads on supercomputers, we need to accommodate them at scale, and thus need R&D for hardware and software infrastructures where traditional simulation-based HPC and BD/AI converge, in a BYTES-oriented fashion. Post-K is the flagship next-generation national supercomputer being developed by Riken and Fujitsu in collaboration. Post-K will have hyperscale-class resources in one exascale machine, with well more than 100,000 nodes of server-class A64fx many-core Arm CPUs, realized through an extensive co-design process involving the entire Japanese HPC community.
Rather than focus on double precision flops, which are of lesser utility, Post-K, especially its A64fx processor and the Tofu-D network, is designed to sustain extreme bandwidth on realistic applications, including those for oil and gas such as seismic wave propagation and CFD, as well as structural codes, besting its rivals by several factors in measured performance. Post-K is slated to perform 100 times faster on some key applications than its predecessor, the K-Computer, and will also likely be the premier big data and AI/ML infrastructure. Currently, we are conducting research to scale deep learning to more than 100,000 nodes on Post-K, where we would obtain near top GPU-class performance on each node."
Watch the video: https://wp.me/p3RLHQ-k6G
Learn more: https://en.wikichip.org/wiki/supercomputers/post-k
and
http://hpcuserforum.com
Stay up-to-date with the OpenACC Monthly Highlights. February's edition covers the updated OpenACC 3.2 specification, upcoming GPU Hackathons and Bootcamps, OpenACC's BOF at SC21, recent research, new resources and more!
CINECA for HPC and e-infrastructures - Cineca
Sanzio Bassini, Head of the HPC Department of Cineca. Cineca is the technological partner of the Ministry of Education and takes part in the Italian commitment to developing e-infrastructure in Italy and in Europe for HPC and HPC technologies: scientific data repository and management, cloud computing for industry and public administration, and the development of computing-intensive and data-intensive methods for science and engineering.
Cineca offers: open access to the integrated Tier-0 and Tier-1 national HPC infrastructure; education and training activities under the umbrella of the PRACE Advanced Training Centre action; and an integrated help desk and scale-up process for HPC user support.
Exploring the Performance Impact of Virtualization on an HPC Cloud - Ryousei Takano
The document evaluates the performance impact of virtualization on high-performance computing (HPC) clouds. Experiments were conducted on the AIST Super Green Cloud, a 155-node HPC cluster. Benchmark results show that while PCI passthrough mitigates I/O overhead, virtualization still incurs performance penalties for MPI collectives as node counts increase. Application benchmarks demonstrate overhead is limited to around 5%. The study concludes HPC clouds are promising due to utilization improvements from virtualization, but further optimization of virtual machine placement and pass-through technologies could help reduce overhead.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights covers working on applications for the new Frontier supercomputer, using OpenACC for weather forecasting, upcoming GPU Hackathons and Bootcamps, and new resources!
Architecture Aware Algorithms and Software for Peta and Exascale - inside-BigData.com
Jack Dongarra from the University of Tennessee presented these slides at Ken Kennedy Institute of Information Technology on Feb 13, 2014.
Listen to the podcast review of this talk: http://insidehpc.com/2014/02/13/week-hpc-jack-dongarra-talks-algorithms-exascale/
Opportunities of ML-based data analytics in ABCI - Ryousei Takano
This document discusses opportunities for using machine learning-based data analytics on the ABCI supercomputer system. It summarizes:
1) An introduction to the ABCI system and how it is being used for AI research.
2) How sensor data from the ABCI system and job logs could be analyzed using machine learning to optimize data center operation and improve resource utilization and scheduling.
3) Two potential use cases - using workload prediction to enable more efficient cooling system control, and applying machine learning to better predict job execution times to improve scheduling.
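The second use case above, predicting job execution times from logs to improve scheduling, can be sketched with a trivial baseline model. This is an illustration of the idea only, not the ABCI system's actual method; the field names and the per-(user, application) mean are assumptions standing in for a real learned model.

```python
# Sketch of job-runtime prediction from historical logs.
# A per-(user, app) mean stands in for a real ML model;
# field names are illustrative, not ABCI's actual schema.
from collections import defaultdict

class RuntimePredictor:
    def __init__(self):
        # (user, app) -> list of observed runtimes in seconds
        self.history = defaultdict(list)

    def record(self, user, app, runtime_s):
        """Ingest one completed job from the scheduler logs."""
        self.history[(user, app)].append(runtime_s)

    def predict(self, user, app, default_s=3600.0):
        """Mean of past runtimes for this (user, app), else a default."""
        runs = self.history.get((user, app))
        return sum(runs) / len(runs) if runs else default_s

p = RuntimePredictor()
p.record("alice", "train_resnet", 1200)
p.record("alice", "train_resnet", 1400)
print(p.predict("alice", "train_resnet"))  # 1300.0
print(p.predict("bob", "cfd_solver"))      # 3600.0 (no history yet)
```

A scheduler could use such estimates to backfill short jobs into gaps, which is the utilization improvement the document describes.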
Scaling Green Instrumentation to more than 10 Million Cores - inside-BigData.com
In this deck from SC18, Satoshi Matsuoka from RIKEN presents: Scaling Green Instrumentation to more than 10 Million Cores.
"We now have some sites that are extremely well instrumented for measuring power and energy. Lawrence Berkeley National Laboratory's supercomputing center is a prime example, with sensors, data collection and analysis capabilities that span the facility and computing equipment. We have made major gains in improving the energy efficiency of the facility as well as computing hardware, but there are still large gains to be had with software, particularly application software. Just tuning code for performance isn't enough; the same time to solution can have very different power profiles. We are at the point where measurement capabilities are allowing us to see cross-cutting issues, such as the cost of spin waits. These new measurement capabilities should provide a wealth of information to see the tall poles in the tent. This panel will explore how we can identify these emerging tall poles."
Learn more: http://sc18.supercomputing.org/proceedings/panel/panel_pages/pan101.html
and
https://wp.me/p3RLHQ-jHZ
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact - inside-BigData.com
In this deck from the 2014 HPC User Forum in Seattle, John A. Turner from Oak Ridge National Laboratory presents: Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact.
Supercomputing needs in Spanish companies - Cein
This document discusses enterprise needs for supercomputing in Spain. It finds that while usage is currently low, major sectors like energy, finance, aeronautics and automobiles have needs for applications like modeling, simulation, CFD and electromagnetism. However, enterprises face barriers to supercomputing like proprietary software licenses, confidentiality concerns, and issues moving large amounts of data. The document proposes addressing these barriers through open source software and making supercomputing resources more accessible to private companies engaged in research and development.
The document discusses the top 5 technologies that all organizations must understand: digital transformation, quantum computing, IoT, 5G, and AI/HPC. It provides an overview of each technology including opportunities and threats to organizations. The document emphasizes that understanding these emerging technologies is mandatory as the information revolution changes many aspects of life and business.
Preparing to program Aurora at Exascale - Early experiences and future direct... - inside-BigData.com
In this deck from IWOCL / SYCLcon 2020, Hal Finkel from Argonne National Laboratory presents: Preparing to program Aurora at Exascale - Early experiences and future directions.
"Argonne National Laboratory’s Leadership Computing Facility will be home to Aurora, our first exascale supercomputer. Aurora promises to take scientific computing to a whole new level, and scientists and engineers from many different fields will take advantage of Aurora’s unprecedented computational capabilities to push the boundaries of human knowledge. In addition, Aurora’s support for advanced machine-learning and big-data computations will enable scientific workflows incorporating these techniques along with traditional HPC algorithms. Programming the state-of-the-art hardware in Aurora will be accomplished using state-of-the-art programming models. Some of these models, such as OpenMP, are long-established in the HPC ecosystem. Other models, such as Intel’s oneAPI, based on SYCL, are relatively-new models constructed with the benefit of significant experience. Many applications will not use these models directly, but rather, will use C++ abstraction libraries such as Kokkos or RAJA. Python will also be a common entry point to high-performance capabilities. As we look toward the future, features in the C++ standard itself will become increasingly relevant for accessing the extreme parallelism of exascale platforms.
This presentation will summarize the experiences of our team as we prepare for Aurora, exploring how to port applications to Aurora’s architecture and programming models, and distilling the challenges and best practices we’ve developed to date. oneAPI/SYCL and OpenMP are both critical models in these efforts, and while the ecosystem for Aurora has yet to mature, we’ve already had a great deal of success. Importantly, we are not passive recipients of programming models developed by others. Our team works not only with vendor-provided compilers and tools, but also develops improved open-source LLVM-based technologies that feed both open-source and vendor-provided capabilities. In addition, we actively participate in the standardization of OpenMP, SYCL, and C++. To conclude, I’ll share our thoughts on how these models can best develop in the future to support exascale-class systems."
Watch the video: https://wp.me/p3RLHQ-lPT
Learn more: https://www.iwocl.org/iwocl-2020/conference-program/
and
https://www.anl.gov/topic/aurora
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
El Barcelona Supercomputing Center (BSC) fue establecido en 2005 y alberga el MareNostrum, uno de los superordenadores más potentes de España. Somos el centro pionero de la supercomputación en España. Nuestra especialidad es la computación de altas prestaciones - también conocida como HPC o High Performance Computing- y nuestra misión es doble: ofrecer infraestructuras y servicio de supercomputación a los científicos españoles y europeos, y generar conocimiento y tecnología para transferirlos a la sociedad. Somos Centro de Excelencia Severo Ochoa, miembros de primer nivel de la infraestructura de investigación europea PRACE (Partnership for Advanced Computing in Europe), y gestionamos la Red Española de Supercomputación (RES). Como centro de investigación, contamos con más de 456 expertos de 45 países, organizados en cuatro grandes áreas de investigación: Ciencias de la computación, Ciencias de la vida, Ciencias de la tierra y aplicaciones computacionales en ciencia e ingeniería.
The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be on to something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.
The TOP500 list is compiled by Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; Jack Dongarra of the University of Tennessee, Knoxville; and Martin Meuer of Prometeus, Germany.
¿Es posible construir el Airbus de la Supercomputación en Europa?AMETIC
Presentación a cargo de Mateo Valero, Director del Barcelona Supercomputing Center, en el marco de la 30ª edición de los Encuentros de Telecomunicaciones y Economía Digital.
In this deck from the HPC User Forum in Detroit, Bob Sorensen from Hyperion Research presents: Exascale Update. As a research firm, Hyperion is tracking the development of Exascale supercomputers worldwide.
"The four geographies actively developing Exascale machines are: USA, China, Europe, and Japan. While it is important to emphasize that this is not a race, the first machine to achieve Exascale in terms of sustained LINPACK should be the A21 Aurora system at Argonne in 2021. It will be followed soon after by machines from all the other active projects."
Watch the video: https://wp.me/p3RLHQ-j1U
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses the emergence of computation for interdisciplinary large data analysis. It notes that exponential increases in computational power and data are driving changes in science and engineering. Computational modeling is becoming a third pillar of science alongside theory and experimentation. However, continued increases in clock speeds are no longer feasible due to power constraints, necessitating the use of multi-core processors and parallelism. This is driving changes in software design to expose parallelism.
Check out the latest in OpenACC this month including the PGI 18.1 release, GTC 2018 activity, paper highlights, upcoming events and a call for paper submissions.
Stay up-to-date with the OpenACC Monthly Highlights. July's edition covers the OpenACC Summit 2021, upcoming GPU Hackathons and Bootcamps, PEARC21 panel review , recent research, new resources and more!
Fugaku is designed to be the centerpiece of Japan's Society 5.0 vision. It is being constructed to be the world's first exascale supercomputer, with a target speed of over 100 times faster than the previous K supercomputer for some applications. Fugaku will have over 150,000 nodes, 150 petaflops of memory bandwidth, and a peak performance of over 400 petaflops for double precision calculations. It aims to accelerate high performance computing, big data, and AI workloads for important societal domains like healthcare, disaster prevention, energy, and manufacturing.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights covers the first remote GPU Hackathons, a complete schedule of upcoming events, using OpenACC for a biophysics problem, NVIDIA HPC SDK, GCC 10, new resources and more!
Arm A64fx and Post-K: Game-Changing CPU & Supercomputer for HPC, Big Data, & AIinside-BigData.com
Satoshi Matsuoka from RIKEN gave this talk at the HPC User Forum in Santa Fe.
"With rapid rise and increase of Big Data and AI as a new breed of high-performance workloads on supercomputers, we need to accommodate them at scale, and thus the need for R&D for HW and SW Infrastructures where traditional simulation-based HPC and BD/AI would converge, in a BYTES-oriented fashion. Post-K is the flagship next generation national supercomputer being developed by Riken and Fujitsu in collaboration. Post-K will have hyperscale class resource in one exascale machine, with well more than 100,000 nodes of sever-class A64fx many-core Arm CPUs, realized through extensive co-design process involving the entire Japanese HPC community.
Rather than to focus on double precision flops that are of lesser utility, rather Post-K, especially its Arm64fx processor and the Tofu-D network is designed to sustain extreme bandwidth on realistic applications including those for oil and gas, such as seismic wave propagation, CFD, as well as structural codes, besting its rivals by several factors in measured performance. Post-K is slated to perform 100 times faster on some key applications c.f. its predecessor, the K-Computer, but also will likely to be the premier big data and AI/ML infrastructure. Currently, we are conducting research to scale deep learning to more than 100,000 nodes on Post-K, where we would obtain near top GPU-class performance on each node."
Watch the video: https://wp.me/p3RLHQ-k6G
Learn more: https://en.wikichip.org/wiki/supercomputers/post-k
and
http://hpcuserforum.com
Stay up-to-date with the OpenACC Monthly Highlights. February's edition covers the updated specification OpenACC 3.2, upcoming GPU Hackathons and Bootcamps, OpenACC's BOF at SC21 , recent research, new resources and more!
CINECA for HCP and e-infrastructures infrastructuresCineca
Sanzio Bassini. Head of the HPC Department of Cineca. Cineca is the technological partner of the Ministry of Education, and takes part in the Italian commitment for the development of e-infrastrcuture in Italy and in Europe for HCP and HCP technologies; scientific data repository and management, cloud computing for industries and Public administration, for the development of computing intensive and data intensive methods for science and engineering
Cineca offers a unique offer for: open access of integrated tier0 and tier1 HCP national infrastructure; of education and training activities under the umbrella of PRACE Training
advanced center action; integrated help desk and scale up process for HCP users support
Exploring the Performance Impact of Virtualization on an HPC CloudRyousei Takano
The document evaluates the performance impact of virtualization on high-performance computing (HPC) clouds. Experiments were conducted on the AIST Super Green Cloud, a 155-node HPC cluster. Benchmark results show that while PCI passthrough mitigates I/O overhead, virtualization still incurs performance penalties for MPI collectives as node counts increase. Application benchmarks demonstrate overhead is limited to around 5%. The study concludes HPC clouds are promising due to utilization improvements from virtualization, but further optimization of virtual machine placement and pass-through technologies could help reduce overhead.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights covers working on applications for the new Frontier supercomputer, using OpenACC for weather forecasting, upcoming GPU Hackathons and Bootcamps, and new resources!
Achitecture Aware Algorithms and Software for Peta and Exascaleinside-BigData.com
Jack Dongarra from the University of Tennessee presented these slides at Ken Kennedy Institute of Information Technology on Feb 13, 2014.
Listen to the podcast review of this talk: http://insidehpc.com/2014/02/13/week-hpc-jack-dongarra-talks-algorithms-exascale/
Opportunities of ML-based data analytics in ABCIRyousei Takano
This document discusses opportunities for using machine learning-based data analytics on the ABCI supercomputer system. It summarizes:
1) An introduction to the ABCI system and how it is being used for AI research.
2) How sensor data from the ABCI system and job logs could be analyzed using machine learning to optimize data center operation and improve resource utilization and scheduling.
3) Two potential use cases - using workload prediction to enable more efficient cooling system control, and applying machine learning to better predict job execution times to improve scheduling.
Scaling Green Instrumentation to more than 10 Million Cores - inside-BigData.com
In this deck from SC18, Satoshi Matsuoka from RIKEN presents: Scaling Green Instrumentation to more than 10 Million Cores.
"We now have some sites that are extremely well instrumented for measuring power and energy. Lawrence Berkeley National Laboratory’s supercomputing center is a prime example, with sensors, data collection and analysis capabilities that span the facility and computing equipment. We have made major gains in improving the energy efficiency of the facility as well as computing hardware, but there are still large gains to be had with software, particularly application software. Just tuning code for performance isn’t enough; the same time to solution can have very different power profiles. We are at the point where measurement capabilities are allowing us to see cross-cutting issues, such as the cost of spin waits. These new measurement capabilities should provide a wealth of information to see the tall poles in the tent. This panel will explore how we can identify these emerging tall poles."
Learn more: http://sc18.supercomputing.org/proceedings/panel/panel_pages/pan101.html
and
https://wp.me/p3RLHQ-jHZ
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact - inside-BigData.com
In this deck from the 2014 HPC User Forum in Seattle, John A. Turner from Oak Ridge National Laboratory presents: Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact.
Necesidades de supercomputación en las empresas españolas - Cein
This document discusses enterprise needs for supercomputing in Spain. It finds that while usage is currently low, major sectors like energy, finance, aeronautics and automobiles have needs for applications like modeling, simulation, CFD and electromagnetism. However, enterprises face barriers to supercomputing like proprietary software licenses, confidentiality concerns, and issues moving large amounts of data. The document proposes addressing these barriers through open source software and making supercomputing resources more accessible to private companies engaged in research and development.
Similar to European Processor Initiative & RISC-V (20)
The document discusses the top 5 technologies that all organizations must understand: digital transformation, quantum computing, IoT, 5G, and AI/HPC. It provides an overview of each technology including opportunities and threats to organizations. The document emphasizes that understanding these emerging technologies is mandatory as the information revolution changes many aspects of life and business.
Preparing to program Aurora at Exascale - Early experiences and future directions - inside-BigData.com
In this deck from IWOCL / SYCLcon 2020, Hal Finkel from Argonne National Laboratory presents: Preparing to program Aurora at Exascale - Early experiences and future directions.
"Argonne National Laboratory’s Leadership Computing Facility will be home to Aurora, our first exascale supercomputer. Aurora promises to take scientific computing to a whole new level, and scientists and engineers from many different fields will take advantage of Aurora’s unprecedented computational capabilities to push the boundaries of human knowledge. In addition, Aurora’s support for advanced machine-learning and big-data computations will enable scientific workflows incorporating these techniques along with traditional HPC algorithms. Programming the state-of-the-art hardware in Aurora will be accomplished using state-of-the-art programming models. Some of these models, such as OpenMP, are long-established in the HPC ecosystem. Other models, such as Intel’s oneAPI, based on SYCL, are relatively-new models constructed with the benefit of significant experience. Many applications will not use these models directly, but rather, will use C++ abstraction libraries such as Kokkos or RAJA. Python will also be a common entry point to high-performance capabilities. As we look toward the future, features in the C++ standard itself will become increasingly relevant for accessing the extreme parallelism of exascale platforms.
This presentation will summarize the experiences of our team as we prepare for Aurora, exploring how to port applications to Aurora’s architecture and programming models, and distilling the challenges and best practices we’ve developed to date. oneAPI/SYCL and OpenMP are both critical models in these efforts, and while the ecosystem for Aurora has yet to mature, we’ve already had a great deal of success. Importantly, we are not passive recipients of programming models developed by others. Our team works not only with vendor-provided compilers and tools, but also develops improved open-source LLVM-based technologies that feed both open-source and vendor-provided capabilities. In addition, we actively participate in the standardization of OpenMP, SYCL, and C++. To conclude, I’ll share our thoughts on how these models can best develop in the future to support exascale-class systems."
Watch the video: https://wp.me/p3RLHQ-lPT
Learn more: https://www.iwocl.org/iwocl-2020/conference-program/
and
https://www.anl.gov/topic/aurora
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck, Greg Wahl from Advantech presents: Transforming Private 5G Networks.
Advantech Networks & Communications Group is driving innovation in next-generation network solutions with their High Performance Servers. We provide business critical hardware to the world's leading telecom and networking equipment manufacturers with both standard and customized products. Our High Performance Servers are highly configurable platforms designed to balance the best in x86 server-class processing performance with maximum I/O and offload density. The systems are cost effective, highly available and optimized to meet next generation networking and media processing needs.
“Advantech’s Networks and Communication Group has been both an innovator and trusted enabling partner in the telecommunications and network security markets for over a decade, designing and manufacturing products for OEMs that accelerate their network platform evolution and time to market,” said Advantech Vice President of Networks & Communications Group, Ween Niu. “In the new IP Infrastructure era, we will be expanding our expertise in Software Defined Networking (SDN) and Network Function Virtualization (NFV), two of the essential conduits to 5G infrastructure agility, making networks easier to install, secure, automate and manage in a cloud-based infrastructure.”
In addition to innovation in air interface technologies and architecture extensions, 5G will also need a new generation of network computing platforms to run the emerging software defined infrastructure, one that provides greater topology flexibility, essential to deliver on the promises of high availability, high coverage, low latency and high bandwidth connections. This will open up new parallel industry opportunities through dedicated 5G network slices reserved for specific industries dedicated to video traffic, augmented reality, IoT, connected cars etc. 5G unlocks many new doors and one of the keys to its enablement lies in the elasticity and flexibility of the underlying infrastructure.
Advantech’s corporate vision is to enable an intelligent planet. The company is a global leader in the fields of IoT intelligent systems and embedded platforms. To embrace the trends of IoT, big data, and artificial intelligence, Advantech promotes IoT hardware and software solutions with the Edge Intelligence WISE-PaaS core to assist business partners and clients in connecting their industrial chains. Advantech is also working with business partners to co-create business ecosystems that accelerate the goal of industrial intelligence.
Watch the video: https://wp.me/p3RLHQ-lPQ
* Company website: https://www.advantech.com/
* Solution page: https://www2.advantech.com/nc/newsletter/NCG/SKY/benefits.html
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The Incorporation of Machine Learning into Scientific Simulations at Lawrence Livermore National Laboratory - inside-BigData.com
In this deck from the Stanford HPC Conference, Katie Lewis from Lawrence Livermore National Laboratory presents: The Incorporation of Machine Learning into Scientific Simulations at Lawrence Livermore National Laboratory.
"Scientific simulations have driven computing at Lawrence Livermore National Laboratory (LLNL) for decades. During that time, we have seen significant changes in hardware, tools, and algorithms. Today, data science, including machine learning, is one of the fastest growing areas of computing, and LLNL is investing in hardware, applications, and algorithms in this space. While the use of simulations to focus and understand experiments is well accepted in our community, machine learning brings new challenges that need to be addressed. I will explore applications for machine learning in scientific simulations that are showing promising results and further investigation that is needed to better understand its usefulness."
Watch the video: https://youtu.be/NVwmvCWpZ6Y
Learn more: https://computing.llnl.gov/research-area/machine-learning
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
How to Achieve High-Performance, Scalable and Distributed DNN Training on Modern HPC Systems? - inside-BigData.com
In this deck from the Stanford HPC Conference, DK Panda from Ohio State University presents: How to Achieve High-Performance, Scalable and Distributed DNN Training on Modern HPC Systems?
"This talk will start with an overview of challenges being faced by the AI community to achieve high-performance, scalable and distributed DNN training on Modern HPC systems with both scale-up and scale-out strategies. After that, the talk will focus on a range of solutions being carried out in my group to address these challenges. The solutions will include: 1) MPI-driven Deep Learning, 2) Co-designing Deep Learning Stacks with High-Performance MPI, 3) Out-of-core DNN training, and 4) Hybrid (Data and Model) parallelism. Case studies to accelerate DNN training with popular frameworks like TensorFlow, PyTorch, MXNet and Caffe on modern HPC systems will be presented."
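The scale-out, data-parallel side of such training can be illustrated with a minimal sketch (plain Python standing in for MPI, not code from the talk): each worker computes gradients on its own data shard, and an allreduce-style average keeps the model replicas synchronized.

```python
# Conceptual sketch of data-parallel training's allreduce step.
# Each "worker" holds a shard of (x, y) samples for the model y = w*x.

def local_gradients(shard, w):
    # gradient of mean squared error on this worker's shard
    return sum(2 * x * (w * x - y) for x, y in shard) / len(shard)

def allreduce_mean(values):
    # stand-in for MPI_Allreduce(..., MPI_SUM) divided by world size
    return sum(values) / len(values)

shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]  # data: y = 2x
w = 0.0
for _ in range(50):  # simple synchronous SGD across both workers
    grads = [local_gradients(s, w) for s in shards]
    w -= 0.05 * allreduce_mean(grads)
print(round(w, 3))  # converges toward 2.0
```

In a real MPI-driven stack, `allreduce_mean` is replaced by an optimized collective over the interconnect, which is exactly where the scalability challenges discussed in the talk arise.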
Watch the video: https://youtu.be/LeUNoKZVuwQ
Learn more: http://web.cse.ohio-state.edu/~panda.2/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ... - inside-BigData.com
In this deck from the Stanford HPC Conference, Nick Nystrom and Paola Buitrago provide an update from the Pittsburgh Supercomputing Center.
Nick Nystrom is Chief Scientist at the Pittsburgh Supercomputing Center (PSC). Nick is architect and PI for Bridges, PSC's flagship system that successfully pioneered the convergence of HPC, AI, and Big Data. He is also PI for the NIH Human Biomolecular Atlas Program’s HIVE Infrastructure Component and co-PI for projects that bring emerging AI technologies to research (Open Compass), apply machine learning to biomedical data for breast and lung cancer (Big Data for Better Health), and identify causal relationships in biomedical big data (the Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence). His current research interests include hardware and software architecture, applications of machine learning to multimodal data (particularly for the life sciences) and to enhance simulation, and graph analytics.
Watch the video: https://youtu.be/LWEU1L1o7yY
Learn more: https://www.psc.edu/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses using systems intelligence and artificial intelligence/neural networks to enhance semiconductor electronic design automation (EDA) workflows. Telemetry data collected from EDA jobs and infrastructure is analyzed using complex event processing, machine learning models, and messaging substrates to provide insights that could optimize EDA pipelines and infrastructure. The approach aims to allow both internal and external augmentation of EDA processes and environments through unsupervised and incremental learning.
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring - inside-BigData.com
In this deck from the Stanford HPC Conference, Nicole Xu from Stanford University describes how she transformed a common jellyfish into a bionic creature that is part animal and part machine.
"Animal locomotion and bioinspiration have the potential to expand the performance capabilities of robots, but current implementations are limited. Mechanical soft robots leverage engineered materials and are highly controllable, but these biomimetic robots consume more power than corresponding animal counterparts. Biological soft robots from a bottom-up approach offer advantages such as speed and controllability but are limited to survival in cell media. Instead, biohybrid robots that comprise live animals and self-contained microelectronic systems leverage the animals’ own metabolism to reduce power constraints and the animal body as a natural scaffold with damage tolerance. We demonstrate that by integrating onboard microelectronics into live jellyfish, we can enhance propulsion up to threefold, using only 10 mW of external power input to the microelectronics and at only a twofold increase in cost of transport to the animal. This robotic system uses 10 to 1000 times less external power per mass than existing swimming robots in the literature and can be used in future applications for ocean monitoring to track environmental changes."
Watch the video: https://youtu.be/HrmJFyvInj8
Learn more: https://sanfrancisco.cbslocal.com/2020/02/05/stanford-research-project-common-jellyfish-bionic-sea-creatures/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Stanford HPC Conference, Peter Dueben from the European Centre for Medium-Range Weather Forecasts (ECMWF) presents: Machine Learning for Weather Forecasts.
"I will present recent studies that use deep learning to learn the equations of motion of the atmosphere, to emulate model components of weather forecast models and to enhance usability of weather forecasts. I will then talk about the main challenges for the application of deep learning in cutting-edge weather forecasts and suggest approaches to improve usability in the future."
Peter is contributing to the development and optimization of weather and climate models for modern supercomputers. He is focusing on a better understanding of model error and model uncertainty, on the use of reduced numerical precision that is optimised for a given level of model error, on global cloud-resolving simulations with ECMWF's forecast model, and on the use of machine learning, in particular deep learning, to improve the workflow and predictions. Peter graduated in Physics and wrote his PhD thesis at the Max Planck Institute for Meteorology in Germany. He worked as a Postdoc with Tim Palmer at the University of Oxford and took up a position as University Research Fellow of the Royal Society at the European Centre for Medium-Range Weather Forecasts (ECMWF) in 2017.
Watch the video: https://youtu.be/ks3fkRj8Iqc
Learn more: https://www.ecmwf.int/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck, Gilad Shainer from the HPC AI Advisory Council describes how this organization fosters innovation in the high performance computing community.
"The HPC-AI Advisory Council’s mission is to bridge the gap between high-performance computing (HPC) and Artificial Intelligence (AI) use and its potential, bring the beneficial capabilities of HPC and AI to new users for better research, education, innovation and product manufacturing, bring users the expertise needed to operate HPC and AI systems, provide application designers with the tools needed to enable parallel computing, and to strengthen the qualification and integration of HPC and AI system products."
Watch the video: https://wp.me/p3RLHQ-lNz
Learn more: http://hpcadvisorycouncil.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Today RIKEN in Japan announced that the Fugaku supercomputer will be made available for research projects aimed to combat COVID-19.
"Fugaku is currently being installed and is scheduled to be available to the public in 2021. However, faced with the devastating disaster unfolding before our eyes, RIKEN and MEXT decided to make a portion of the computational resources of Fugaku available for COVID-19-related projects ahead of schedule while continuing the installation process.
Fugaku is being developed not only for the progress in science, but also to help build the society dubbed as the “Society 5.0” by the Japanese government, where all people will live safe and comfortable lives. The current initiative to fight against the novel coronavirus is driven by the philosophy behind the development of Fugaku."
Initial Projects
Exploring new drug candidates for COVID-19 by "Fugaku"
Yasushi Okuno, RIKEN / Kyoto University
Prediction of conformational dynamics of proteins on the surface of SARS-CoV-2 using Fugaku
Yuji Sugita, RIKEN
Simulation analysis of pandemic phenomena
Nobuyasu Ito, RIKEN
Fragment molecular orbital calculations for COVID-19 proteins
Yuji Mochizuki, Rikkyo University
In this deck from the Performance Optimisation and Productivity group, Lubomir Riha from IT4Innovations presents: Energy Efficient Computing using Dynamic Tuning.
"We now live in a world of power-constrained architectures and systems, and power consumption represents a significant cost factor in the overall HPC system economy. For these reasons, in recent years researchers, supercomputing centers and major vendors have developed new tools and methodologies to measure and optimize the energy consumption of large-scale high-performance system installations. Due to the link between energy consumption, power consumption and the execution time of an application executed by the final user, it is important for these tools and methodologies to consider all these aspects, empowering the final user and the system administrator with the capability of finding the best configuration given different high-level objectives.
This webinar focused on tools designed to improve the energy-efficiency of HPC applications using a methodology of dynamic tuning of HPC applications, developed under the H2020 READEX project. The READEX methodology has been designed for exploiting the dynamic behaviour of software. At design time, different runtime situations (RTS) are detected and optimized system configurations are determined. RTSs with the same configuration are grouped into scenarios, forming the tuning model. At runtime, the tuning model is used to switch system configurations dynamically.
The MERIC tool, which implements the READEX methodology, is presented. It supports manual or binary instrumentation of the analysed applications to simplify the analysis. This instrumentation is used to identify and annotate the significant regions in the HPC application. Automatic binary instrumentation annotates regions with significant runtime; manual instrumentation, which can be combined with automatic, allows code developers to annotate regions of particular interest."
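The manual-instrumentation idea can be sketched in Python (a hypothetical illustration of region annotation in general, not the MERIC API): a decorator marks a significant region and records its runtime, so a tuner could later choose a system configuration per region.

```python
import time
from functools import wraps

# per-region runtime samples, keyed by region name
REGION_STATS = {}

def region(name):
    """Mark a function as a significant region and record how long
    each call takes (a stand-in for real instrumentation, which would
    also sample energy counters and switch system configurations)."""
    def decorate(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                REGION_STATS.setdefault(name, []).append(
                    time.perf_counter() - start)
        return wrapper
    return decorate

@region("solver")
def solve(n):
    return sum(i * i for i in range(n))

solve(100_000)
print(sorted(REGION_STATS))  # → ['solver']
```

A dynamic tuner would then map each annotated region to a scenario (e.g. a CPU frequency setting) and apply it whenever the region is entered.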
Watch the video: https://wp.me/p3RLHQ-lJP
Learn more: https://pop-coe.eu/blog/14th-pop-webinar-energy-efficient-computing-using-dynamic-tuning
and
https://code.it4i.cz/vys0053/meric
Sign up for our insideHPC Newsletter: http://insidehpc.com/newslett
The document discusses how DDN A3I storage solutions and Nvidia's SuperPOD platform can enable HPC at scale. It provides details on DDN's A3I appliances that are optimized for AI and deep learning workloads and validated for Nvidia's DGX-2 SuperPOD reference architecture. The solutions are said to deliver the fastest performance, effortless scaling, reliability and flexibility for data-intensive workloads.
In this deck, Paul Isaacs from Linaro presents: State of ARM-based HPC. This talk provides an overview of applications and infrastructure services successfully ported to Aarch64 and benefiting from scale.
"With its debut on the TOP500, the 125,000-core Astra supercomputer at New Mexico’s Sandia Labs uses Cavium ThunderX2 chips to mark Arm’s entry into the petascale world. In Japan, the Fujitsu A64FX Arm-based CPU in the pending Fugaku supercomputer has been optimized to achieve high-level, real-world application performance, anticipating up to one hundred times the application execution performance of the K computer. K was the first computer to top 10 petaflops in 2011."
Watch the video: https://wp.me/p3RLHQ-lIT
Learn more: https://www.linaro.org/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Versal Premium ACAP for Network and Cloud Acceleration - inside-BigData.com
Today Xilinx announced Versal Premium, the third series in the Versal ACAP portfolio. The Versal Premium series features highly integrated, networked and power-optimized cores and the industry’s highest bandwidth and compute density on an adaptable platform. Versal Premium is designed for the highest bandwidth networks operating in thermally and spatially constrained environments, as well as for cloud providers who need scalable, adaptable application acceleration.
Versal is the industry’s first adaptive compute acceleration platform (ACAP), a revolutionary new category of heterogeneous compute devices with capabilities that far exceed those of conventional silicon architectures. Developed on TSMC’s 7-nanometer process technology, Versal Premium combines software programmability with dynamically configurable hardware acceleration and pre-engineered connectivity and security features to enable a faster time-to-market. The Versal Premium series delivers up to 3X higher throughput compared to current generation FPGAs, with built-in Ethernet, Interlaken, and cryptographic engines that enable fast and secure networks. The series doubles the compute density of currently deployed mainstream FPGAs and provides the adaptability to keep pace with increasingly diverse and evolving cloud and networking workloads.
Learn more: https://insidehpc.com/2020/03/xilinx-announces-versal-premium-acap-for-network-and-cloud-acceleration/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently - inside-BigData.com
In this video from the Rice Oil & Gas Conference, Chin Fang from Zettar presents: Moving Massive Amounts of Data across Any Distance Efficiently.
The objective of this talk is to present two on-going projects aiming at improving and ensuring highly efficient bulk transferring or streaming of massive amounts of data over digital connections across any distance. It examines the current state of the art, a few very common misconceptions, the differences among the three major types of data movement solutions, a current initiative attempting to improve data movement efficiency from the ground up, and another multi-stage project that shows how to conduct long-distance, large-scale data movement at speed and scale internationally. Both projects have real-world motivations, e.g. the ambitious data transfer requirements of the Linac Coherent Light Source II (LCLS-II) [1], a premier preparation project of the U.S. DOE Exascale Computing Initiative (ECI) [2]. Their immediate goals are described and explained, together with the solution used for each. Findings and early results are reported. Possible future work is outlined.
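The general pattern behind such tools, splitting a dataset into chunks and moving them over many concurrent streams to fill a long fat pipe, can be sketched as follows (a stdlib simulation with checksums standing in for real network streams; this is not the projects' actual code):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def chunks(data, size):
    """Split a byte string into fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def transfer(chunk):
    # placeholder for sending one chunk over its own stream;
    # here we just checksum it, as a transfer tool would for integrity
    return hashlib.sha256(chunk).hexdigest()

data = bytes(range(256)) * 4096          # 1 MiB of sample data
parts = chunks(data, 64 * 1024)          # 64 KiB chunks
with ThreadPoolExecutor(max_workers=8) as pool:
    digests = list(pool.map(transfer, parts))
print(len(parts))  # → 16
```

Real bulk movers add pipelining, retransmission, and end-to-end integrity checks on top of this basic chunk-and-parallelize structure.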
Watch the video: https://wp.me/p3RLHQ-lBX
Learn more: https://www.zettar.com/
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Rice Oil & Gas Conference, Bradley McCredie from AMD presents: Scaling TCO in a Post Moore's Law Era.
"While foundries bravely drive forward to overcome the technical and economic challenges posed by scaling to 5nm and beyond, Moore’s law alone can provide only a fraction of the performance / watt and performance / dollar gains needed to satisfy the demands of today’s high performance computing and artificial intelligence applications. To close the gap, multiple strategies are required. First, new levels of innovation and design efficiency will supplement technology gains to continue to deliver meaningful improvements in SoC performance. Second, heterogeneous compute architectures will create x-factor increases of performance efficiency for the most critical applications. Finally, open software frameworks, APIs, and toolsets will enable broad ecosystems of application level innovation."
Watch the video:
Learn more: http://amd.com
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
CUDA-Python and RAPIDS for blazing fast scientific computing - inside-BigData.com
In this deck from the ECSS Symposium, Abe Stern from NVIDIA presents: CUDA-Python and RAPIDS for blazing fast scientific computing.
"We will introduce Numba and RAPIDS for GPU programming in Python. Numba allows us to write just-in-time compiled CUDA code in Python, giving us easy access to the power of GPUs from a powerful high-level language. RAPIDS is a suite of tools with a Python interface for machine learning and dataframe operations. Together, Numba and RAPIDS represent a potent set of tools for rapid prototyping, development, and analysis for scientific computing. We will cover the basics of each library and go over simple examples to get users started. Finally, we will briefly highlight several other relevant libraries for GPU programming."
Watch the video: https://wp.me/p3RLHQ-lvu
Learn more: https://developer.nvidia.com/rapids
and
https://www.xsede.org/for-users/ecss/ecss-symposium
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from FOSDEM 2020, Colin Sauze from Aberystwyth University describes the development of a RaspberryPi cluster for teaching an introduction to HPC.
"The motivation for this was to overcome four key problems faced by new HPC users:
* The availability of a real HPC system and the effect that running training courses can have on it; conversely, the availability of spare resources on the real system can cause problems for the training course.
* A fear of using a large and expensive HPC system for the first time and worries that doing something wrong might damage the system.
* That HPC systems are very abstract systems sitting in data centres that users never see, making it difficult for them to understand exactly what it is they are using.
* That new users fail to understand resource limitations, in part because the vast resources of modern HPC systems allow many mistakes to be made before running out of resources. A more resource-constrained system makes this easier to understand.
The talk will also discuss some of the technical challenges in deploying an HPC environment to a Raspberry Pi and attempts to keep that environment as close to a "real" HPC as possible. The issue of trying to automate the installation process will also be covered."
Learn more: https://github.com/colinsauze/pi_cluster
and
https://fosdem.org/2020/schedule/events/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from ATPESC 2019, Ken Raffenetti from Argonne presents an overview of HPC interconnects.
"The Argonne Training Program on Extreme-Scale Computing (ATPESC) provides intensive, two-week training on the key skills, approaches, and tools to design, implement, and execute computational science and engineering applications on current high-end computing systems and the leadership-class computing systems of the future."
Watch the video: https://wp.me/p3RLHQ-luc
Learn more: https://extremecomputingtraining.anl.gov/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk - Fwdays
In this talk we will discuss DDoS protection tools and best practices, network architectures, and what AWS has to offer. We will also look into one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022, and see what techniques helped keep web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on the Ukraine experience.
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
Fueling AI with Great Data with Airbyte Webinar - Zilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
Introduction of Cybersecurity with OSS at Code Europe 2024 - Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an... - Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
Discover top-tier mobile app development services, offering innovative solutions for iOS and Android. Enhance your business with custom, user-friendly mobile applications.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers - akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Generating privacy-protected synthetic data using Secludy and Milvus - Zilliz
During this demo, the founders of Secludy will demonstrate how their system utilizes Milvus to store and manipulate embeddings for generating privacy-protected synthetic data. Their approach not only maintains the confidentiality of the original data but also enhances the utility and scalability of LLMs under privacy constraints. Attendees, including machine learning engineers, data scientists, and data managers, will witness first-hand how Secludy's integration with Milvus empowers organizations to harness the power of LLMs securely and efficiently.
"Choosing proper type of scaling", Olena SyrotaFwdays
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
In the realm of cybersecurity, offensive security practices act as a critical shield. By simulating real-world attacks in a controlled environment, these techniques expose vulnerabilities before malicious actors can exploit them. This proactive approach allows manufacturers to identify and fix weaknesses, significantly enhancing system security.
This presentation delves into the development of a system designed to mimic Galileo's Open Service signal using software-defined radio (SDR) technology. We'll begin with a foundational overview of both Global Navigation Satellite Systems (GNSS) and the intricacies of digital signal processing.
The presentation culminates in a live demonstration. We'll showcase the manipulation of Galileo's Open Service pilot signal, simulating an attack on various software and hardware systems. This practical demonstration serves to highlight the potential consequences of unaddressed vulnerabilities, emphasizing the importance of offensive security practices in safeguarding critical infrastructure.
2. Barcelona Supercomputing Center
Centro Nacional de Supercomputación
BSC-CNS is a consortium that includes:
• Spanish Government 60%
• Catalan Government 30%
• Univ. Politècnica de Catalunya (UPC) 10%
BSC-CNS objectives:
• Supercomputing services to Spanish and EU researchers
• R&D in Computer, Life, Earth and Engineering Sciences
• PhD programme, technology transfer, public engagement
4. Context: The international Exascale challenge
• Sustained real-life application performance, not just Linpack…
• Exascale will not just allow present solutions to run faster, but will enable new solutions not affordable with today's HPC technology
• From simulation to high predictability for precision medicine, energy, climate change, autonomous vehicles…
• The international context (US, China, Japan and EU…)
• The European HPC programme
• The European Processor Initiative
• BSC role
6. According to HPL and according to the HPCG benchmark (Top 10)

According to HPL:
Rank  Name               Country      Rmax (TFlop/s)  Rpeak (TFlop/s)  % Efficiency
 1    Sunway TaihuLight  China              93,015         125,436        74.15%
 2    Tianhe-2           China              33,863          54,902        61.68%
 3    Piz Daint          Switzerland        19,590          25,326        77.35%
 4    Gyoukou            Japan              19,135          28,192        67.87%
 5    Titan              US                 17,590          27,113        64.87%
 6    Sequoia            US                 17,173          20,133        85.30%
 7    Trinity            US                 14,137          43,902        32.20%
 8    Cori               US                 14,015          27,881        50.27%
 9    Oakforest-PACS     Japan              13,555          24,913        54.41%
10    K Computer         Japan              10,510          11,280        93.17%

According to the HPCG benchmark:
Rank  Name               Rmax (TFlop/s)  Rpeak (TFlop/s)  % Efficiency
 1    K Computer                603          11,280           5.34%
 2    Tianhe-2                  580          54,902           1.06%
 3    Trinity                   546          43,902           1.24%
 4    Piz Daint                 486          25,326           1.92%
 5    Sunway TaihuLight         481         125,436           0.38%
 6    Oakforest-PACS            385          24,913           1.54%
 7    Cori                      355          27,881           1.27%
 8    Sequoia                   330          20,133           1.64%
 9    Titan                     322          27,113           1.19%
10    Mira                      167          10,066           1.66%
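The % Efficiency column of both tables is the same arithmetic: sustained performance over theoretical peak. A minimal C sketch of that calculation, using values taken from the tables above (the function name is my own, not from any benchmark code):

```c
/* Efficiency(%) = 100 * Rmax / Rpeak, with Rmax and Rpeak in the
   same unit (TFlop/s in the tables above). */
double efficiency_pct(double rmax, double rpeak) {
    return 100.0 * rmax / rpeak;
}

/* Examples from the tables:
   efficiency_pct(93015.0, 125436.0) -> ~74.15  (Sunway TaihuLight, HPL)
   efficiency_pct(481.0,   125436.0) -> ~0.38   (Sunway TaihuLight, HPCG)
   The same machine sustains ~74% of peak on dense Linpack, but well
   under 1% on the memory-bound HPCG kernel. */
```

This is the point of showing both lists: HPL rewards dense compute, HPCG stresses memory bandwidth, and the rankings reorder dramatically.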
7. According to HPL and according to the HPCG benchmark (Top 10, plus MareNostrum)

According to HPL:
Rank  Name               Country      Rmax (TFlop/s)  Rpeak (TFlop/s)  % Efficiency
 1    Sunway TaihuLight  China              93,015         125,436        74.15%
 2    Tianhe-2           China              33,863          54,902        61.68%
 3    Piz Daint          Switzerland        19,590          25,326        77.35%
 4    Gyoukou            Japan              19,135          28,192        67.87%
 5    Titan              US                 17,590          27,113        64.87%
 6    Sequoia            US                 17,173          20,133        85.30%
 7    Trinity            US                 14,137          43,902        32.20%
 8    Cori               US                 14,015          27,881        50.27%
 9    Oakforest-PACS     Japan              13,555          24,913        54.41%
10    K Computer         Japan              10,510          11,280        93.17%
16    MareNostrum        Spain               6,471          10,296        62.85%

According to the HPCG benchmark:
Rank  Name               Rmax (TFlop/s)  Rpeak (TFlop/s)  % Efficiency
 1    K Computer                603          11,280           5.34%
 2    Tianhe-2                  580          54,902           1.06%
 3    Trinity                   546          43,902           1.24%
 4    Piz Daint                 486          25,326           1.92%
 5    Sunway TaihuLight         481         125,436           0.38%
 6    Oakforest-PACS            385          24,913           1.54%
 7    Cori                      355          27,881           1.27%
 8    Sequoia                   330          20,133           1.64%
 9    Titan                     322          27,113           1.19%
10    Mira                      167          10,066           1.66%
15    MareNostrum               122          10,296           1.18%
8. Graph500 (Top 10)
Rank  Prev  Machine                                  Country  Cores        GTEPS
 1     1    K computer                               Japan        663,552  38,621
 2     2    Sunway TaihuLight                        China     10,599,680  23,755
 3     3    DOE/NNSA/LLNL Sequoia                    USA        1,572,864  23,751
 4     4    DOE/SC/Argonne National Laboratory Mira  USA          786,432  14,982
 5     5    JUQUEEN                                  Germany      262,144   5,848
 6    new   ALCF Mira - 8192 partition               USA          131,072   4,212
 7     6    ALCF Mira - 8192 partition               USA          131,072   3,556
 8     7    Fermi                                    Italy        131,072   2,567
 9    new   ALCF Mira - 4096 partition               USA           65,536   2,348
10     8    Tianhe-2 (MilkyWay-2)                    China        196,608   2,061
9. Green500 (Top 10, plus MareNostrum)
Rank  TOP500  System                                                              Cores       Rmax (TFlop/s)  Power (kW)  GFlops/W
 1     259    Shoubu system B - PEZY Computing, RIKEN, Japan                         794,400        842.0          50       16.84
 2     307    Suiren2 - PEZY Computing, KEK, Japan                                   762,624        788.2          47       16.77
 3     276    Sakura - PEZY Computing, PEZY Computing K.K., Japan                    794,400        824.7          50       16.49
 4     149    DGX SaturnV Volta - NVIDIA Tesla V100, NVIDIA Corporation, US           22,440      1,070.0          97       11.03
 5       4    Gyoukou - PEZY-SC2 700 MHz, Japan                                   19,860,000     19,135.8       1,350       14.17
 6      13    TSUBAME3.0 - NVIDIA Tesla P100 SXM2, Japan                             135,828      8,125.0         792       10.26
 7     195    AIST AI Cloud - NVIDIA Tesla P100 SXM2, Japan                           23,400        961.0          76       12.64
 8     419    RAIDEN GPU subsystem - NVIDIA Tesla P100, Japan                         11,712        635.1          60       10.59
 9     115    Wilkes-2 - NVIDIA Tesla P100, University of Cambridge, UK               21,240      1,193.0         114       10.46
10       3    Piz Daint - NVIDIA Tesla P100, Switzerland                             361,760     19,590.0       2,272        8.62
33      16    MareNostrum - Lenovo SD530, Barcelona Supercomputing Center, Spain     153,216      6,470.8       1,632        3.97
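The Green500 metric in the last column is sustained performance per watt. Since Rmax is listed in TFlop/s and power in kW, the two factors of 1000 cancel and the plain ratio is already in GFlops/W; a minimal C sketch (function name my own) with values from the table above:

```c
/* GFlops/W = (Rmax * 1000 GFlop/s per TFlop/s) / (power * 1000 W per kW)
            = rmax_tflops / power_kw                                      */
double gflops_per_watt(double rmax_tflops, double power_kw) {
    return rmax_tflops / power_kw;
}

/* From the table:
   gflops_per_watt(842.0,   50.0)   -> 16.84   (Shoubu system B)
   gflops_per_watt(19590.0, 2272.0) -> ~8.62   (Piz Daint)
   gflops_per_watt(6470.8,  1632.0) -> ~3.97   (MareNostrum; the listed
                                                power figures are rounded) */
```

Note how small immersion-cooled PEZY systems top this list, while a far larger machine like Piz Daint still ranks highly in absolute terms at half the efficiency.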
12. MareNostrum4
Total peak performance: 13.7 Pflops
• General Purpose Cluster: 11.15 Pflops (1.07.2017)
• CTE1-P9+Volta: 1.57 Pflops (1.03.2018)
• CTE2-Arm V8: 0.5 Pflops (????)
• CTE3-KNH?: 0.5 Pflops (????)
MareNostrum 1: 2004 – 42.3 Tflops, 1st in Europe / 4th in the world (new technologies)
MareNostrum 2: 2006 – 94.2 Tflops, 1st in Europe / 5th in the world (new technologies)
MareNostrum 3: 2012 – 1.1 Pflops, 12th in Europe / 36th in the world
MareNostrum 4: 2017 – 11.1 Pflops, 2nd in Europe / 13th in the world (new technologies)
13. Worldwide HPC roadmaps
• China: from Tianhe-2 to Tianhe-2A, with domestic technology.
• Japan: from the K computer to Post-K, with domestic technology.
• Europe: from the PPP for HPC and the IPCEI on HPC to future PRACE systems… with domestic technology?
14. US launched RFP for Exascale (April 2018)
• To develop at least two new exascale supercomputers for the DOE at a cost of up to $1.8 billion
• The deployment timeline begins in the third quarter of 2021 with ORNL's exascale supercomputer, followed by a third-quarter 2022 system installation at LLNL. The ANL addition or upgrade, if it happens, will also take place in the third quarter of 2022.
• The new systems can't exceed 40 MW, with the preferred power draw in the 20 to 30 MW range (counting storage, cooling and any other auxiliary equipment)
• The other critical requirement is that the ORNL and ANL systems be architecturally diverse from one another
• Proposals are due in May; the bidders will be selected before the end of the second quarter
• Each system is expected to cost between $400 and $600 million
15. Worldwide HPC roadmaps
• China: from Tianhe-2 to Tianhe-2A, with domestic technology.
• Japan: from the K computer to Post-K, with domestic technology.
• Europe: from the PPP for HPC and the IPCEI on HPC to future PRACE systems… with domestic technology?
16. EU HPC Ecosystem
• Specifications of exascale prototypes
• Technological options for future systems
• Identify applications for co-design of exascale systems
• Innovative methods and algorithms for extreme parallelism of traditional & emerging applications
• Collaboration of HPC Supercomputing Centres and application CoEs
• Provision of HPC capabilities and expertise
Centers of Excellence in HPC applications
18. A big challenge, and a huge opportunity for Europe
Extend current mobile chips with the needed HPC features:
– Explore the use of vector architectures in mobile accelerators (ARM-based vector processor, 15+ Teraflops chip, 150 watts)… a unique opportunity for Europe
– One design for all market segments: mobile, data centers, supercomputers
[GFLOPS/W roadmap, 2011–2017, for integrated ARM + GPU: 256 nodes at 250 GFLOPS / 1.7 kW, built with the best of the market; 120 TFLOPS / 80 kW, built with the best that is coming; 200 PFLOPS / ~10 MW, what is the best that we could do?]
19. Mont-Blanc HPC Stack for ARM
[Stack diagram, top to bottom: industrial applications, applications, system software, hardware]
20. World Top 20 machines (status November 2017)
Europe has only 4 machines in the world top 20:
■ Italy (CINECA) – Nr 14
■ UK (Meteorological Office) – Nr 15
■ Spain (BSC, Barcelona) – Nr 16
■ Germany (HLRS, Stuttgart) – Nr 19
The EU is not among the HPC world leaders.
21. BSC and the European Commission
"The transformational impact of excellent science in research and innovation" – final plenary panel at the ICT - Innovate, Connect, Transform conference, 22 October 2015, Lisbon, Portugal.
22. The European Commission and HPC
Paris, 27 October 2015 – European Commission President Jean-Claude Juncker: "Our ambition is for Europe to become one of the top 3 world leaders in high-performance computing by 2020"
Digital Day, Rome, 23 March 2017 – Vice-President Andrus Ansip: "I encourage even more EU countries to engage in this ambitious endeavour"
• Ministers from seven Member States (France, Germany, Italy, Luxembourg, Netherlands, Portugal and Spain) sign a declaration to support the next generation of computing and data infrastructures
23. The EuroHPC Declaration
Declaration signed in Rome, March 23rd, 2017 by France, Germany, Italy, Luxembourg, Netherlands, Portugal and Spain. Six more countries have since signed: Belgium, Bulgaria, Croatia, Greece, Slovenia and Switzerland.
The signatories agree to work towards the establishment of a cooperation framework – EuroHPC – for acquiring and deploying an integrated exascale supercomputing infrastructure that will be available across the EU for scientific communities as well as public and private partners.
25. HPC timeline in H2020 LEIT/FET (indicative)
[Timeline diagram, 2015–2025:]
• FET 2014-2017: HW/SW building blocks and co-design
• LEIT-ICT 2017-2020: Framework Programme Agreement (FPA) on the low-power microprocessor for HPC
• LEIT-ICT 2018: Extreme Scale Demonstrators – technology & applications development, integration in co-design, procurement
• LEIT-ICT 2018: HPC / Big Data enabled large-scale test-beds and applications
• FET 2019: extreme scale computing & data for key applications
• FET 2020: extreme scale HPC systems and applications
• HPC Ecosystem development: pre-exascale, then exascale systems; widening access and services
26. EPI: 23 partners, from research to industry
From a consortium to an EU high-tech fabless semiconductor company, under the EU FPA:
• EPI Common Platform
• Fabless company: the industrial hand of EPI, incorporated by a couple of EPI members and external investors
• 1st EPI production
27. Three streams
General purpose and Common Platform
• ARM SVE or other candidates…
• BULL: system integrator, chip integrator
Accelerator
• RISC-V
• EU design: BSC, CEA, Chalmers, ETHZ, EXTOLL, E4, FORTH, Fraunhofer, IST, UNIBO, UNIZG, Semidynamics
Automotive
• Infineon, BMW…
28. EPI roadmap (2018–2025)
[Roadmap diagram; key markets: HPC and cars]
• Gen 1 (SGA 1&2): core technology (CPU? Accel.); HPC chip & system; accelerator chip & system; automotive CPU proof of concept; pre-exascale HPC system
• Gen 2 (SGA 3): HPC chip & system; accelerator chip & system; automotive CPU product; exascale HPC system
• Gen 3 (SGA 4)
29. RISC-V accelerator vision @ EPI
• High-throughput devices
• Long vectors (à la Cray? à la Cyber 205? ...)
• Decoupled front-end / back-end engines
• Optimized memory throughput ([Command vector, 98])
• Explicit locality management (long register file)
• The ISA is important
• Decouple/hide hardware details again; reuse SW technologies (compilers, OS, …)
• Specific instructions?
• "Limited" number of control flows
• Hierarchical acceleration
• Nesting
• Low power: ~ low voltage x ~ low frequency
• MPI+OpenMP
• Task-based, throughput-oriented programming approach
• Malleability in applications + dynamic resource (cores, power, BW) management
• Intelligent runtimes & runtime-aware architectures
• Architectural support for the runtime
• Accelerator for ML
• Specialized "non-Von Neumann" compute and data-motion engines (neural/stencil)
• Tuned numerical precision
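The "long vectors" and ISA points above come together in vector-length-agnostic code: the binary never hardwires a vector length, but asks the hardware how many elements it may process each iteration (the job of a `vsetvli`-style instruction in the RISC-V vector extension). A portable C sketch that only emulates this contract, with `MAX_VL` standing in for the hardware-granted length (both the constant and the helper names are illustrative, not part of any RISC-V API):

```c
/* Stand-in for the vector length a `vsetvli`-style instruction would
   grant; on real hardware this is chosen by the implementation,
   not baked into the binary. */
#define MAX_VL 8

/* Number of elements to process this iteration: the hardware
   maximum, or whatever remains of the loop. */
static int set_vl(int remaining) {
    return remaining < MAX_VL ? remaining : MAX_VL;
}

/* daxpy (y += a*x), strip-mined the vector-length-agnostic way:
   the same source runs unchanged on short- and long-vector machines. */
void daxpy(int n, double a, const double *x, double *y) {
    for (int i = 0; i < n; ) {
        int vl = set_vl(n - i);        /* "vsetvli": ask for a length */
        for (int j = 0; j < vl; j++)   /* one vector operation         */
            y[i + j] += a * x[i + j];
        i += vl;                       /* advance by the granted length */
    }
}
```

A machine with Cray-style long vectors and one with short 128-bit vectors both run this loop correctly; only `vl` differs per iteration, which is what makes a long-vector accelerator attractive as a single software target across market segments.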
30. BSC and EPI
• EPI is an H2020 EU-funded initiative restricted to the 23 original partners, selected according to EU rules
• EPI plans to consider additional participants in the future, provided resources become available
• Within EPI, BSC leads the Accelerator activities and contributes to the rest of the technical programme, including the Common Platform
• BSC will promote the EPI agenda within its vast academic network
• BSC is open to additional collaboration, outside and within EPI, with anyone in the world interested in producing RISC-V IP in Europe, and especially in Barcelona
• Collaboration with the global HPC vendors will remain a key element of BSC strategy
• Everybody interested in RISC-V is welcome! Just come and talk to us…
32. BSC is Hiring
BSC is looking for talented and motivated professionals with expertise in the design and verification of IPs to be integrated into top-level HPC SoC designs. The immediate responsibilities of this group will be related to the European Processor Initiative.
Experienced professionals (engineers and/or PhD holders) wanted for:
• RTL/Microarchitecture
• Verification
• FPGA Design
Find out more: https://www.bsc.es/join-us/job-opportunities/103csrre
Or contact: rrhh@bsc.es