In this deck from the 2014 HPC User Forum in Seattle, Michael Brown from Intel presents: Advancing Science in Alternative Energy and Bioengineering with Many-Core Processors.
Watch the video presentation: http://wp.me/p3RLHQ-d1C
TwilioCon 2013 API Panel with Capital One, ESPN, Accenture, Mashery - Delyn Simons
Seems like most companies are exploring what it means to build an API and open that to the developer community. But what does it mean to have an API? How should you be thinking about your API strategically for your business? Hear from companies who have successfully launched APIs for their business and how they approach their API from strategy to execution.
Shifting market forces are disrupting entire industries, forcing product managers to think about APIs and services more strategically to remain nimble and relevant. Thinking about the products you manage in a platform way allows you to extend the reach of your connected products -- in ways you may never have been able to predict.
This document summarizes key information from a seminar on getting medical devices approved by the FDA. It discusses common reasons why companies fail FDA approval, including inadequate software documentation. The document outlines FDA guidance and standards for software documentation, design, validation, and human factors. It also discusses FDA concerns about cybersecurity for networked devices and provides an overview of FDA guidance on managing cybersecurity risks.
This presentation is intended for the customer facing risk managers, sales staff, and IT staff of a medical device manufacturer and their medical doctors and IT hospital and clinical counterparts.
It is intended to give an overview and highlight process considerations for incident management and reporting of cybersecurity issues.
It is based on the technical paper published by Pam Gilmore and Valdez Ladd in the ISSA Journal in 2014.
A presentation delivered June 4, 2009 describing the impact of the economic crisis on venture and angel investing and common sense steps for fundraising for Medical Device Startups. No one is an expert now.
Karthikeyan Gopal has extensive experience in product design, development, risk management and project management for medical devices. He has worked on projects involving infusion pumps, consumables, combination products and more. His skills include systems engineering, design engineering, validation engineering, risk management, CAD, requirements management, and experience with various medical device standards and regulations. He has held roles at companies such as HCL Technologies, Hospira, American Medical Systems, and Baxter.
Terrat | Aug-15 | Where are smart villages? - Smart Villages
The East Africa Masterclass at Terrat focused on the village-level experience of off-grid energy. We invited local leaders and rural energy providers from Ethiopia, Kenya, Rwanda, Uganda, Malawi and Tanzania.
We were keen for village headmen and headwomen to share their villages' experiences of energy provision and to tell us about the outcomes and impacts of productive energy use in relation to standards of living, education, health and employment in the village.
The workshop heard from the off-grid energy providers about their achievements and challenges in bringing off-grid energy to villages and how they have worked with village leaders and the village community.
Pilbara Pulse 2014 - Bureau of Resource and Energy Economics - John Barber - KDCCI
Bureau of Resource and Energy Economics Resource Program Manager John Barber presented "Outlook for Australian Resources and Energy Commodities" at the Karratha and Districts Chamber of Commerce and Industry's fourth annual Pilbara Pulse Economic Summit.
The document discusses including wood stoves in home energy audits and the challenges involved. It notes that most wood and pellet stoves do not report reliable efficiency numbers and cannot be tested while a blower door test is being done. It also states that pellet stoves and boilers are exempt from EPA regulation so they cannot be checked for certification. The document advocates for standards to properly assess wood stoves during energy audits.
9 Testimonio Colegio de Postgraduados Puebla - Pedro Morlet
The European Union has agreed on a package of sanctions against Russia over its invasion of Ukraine. The sanctions include restrictions on imports of key Russian products such as steel and timber, as well as measures against Russian banks and officials. EU leaders hope the sanctions will increase the economic pressure on Russia and deter it from continuing its aggression against Ukraine.
Presentation by Dr. Chris Skinner, Director Product Platforms, Owens Corning, at CAMX on October 16, 2014.
Future market options for alternative energy – wind, geothermal, solar, ocean/tidal, flywheel technology, battery technology, and biofuels – are a growing area of interest for composites and advanced materials businesses. Knowing how to determine which source provides the most promise for composites applications, navigating the regulatory issues, and determining what design, materials, and manufacturing issues should be kept top of mind are discussed during this session.
This document discusses using data mining techniques for energy resource management. It outlines objectives like classification, regression, forecasting and anomaly detection. Techniques covered include cluster analysis, classification trees, neural networks, genetic algorithms and Bayesian models. Applications involve meeting strategic objectives, making business/engineering decisions and energy budgets. Challenges with large data include integration, wasted time and disconnects. The document proposes solutions like Extract-Transform-Load, Hadoop and cloud computing. The methodology uses a geographic information system, forecasting engines and an application programming interface to transfer and analyze data.
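One of the objectives that deck lists, anomaly detection, can be sketched with a simple z-score filter over meter readings. The data, threshold, and function name below are illustrative choices, not taken from the deck:

```python
import statistics

def flag_anomalies(readings, threshold=2.0):
    """Return indices of readings more than `threshold` standard
    deviations from the mean (2-3 sigma is a common cutoff)."""
    mean = statistics.fmean(readings)
    stdev = statistics.pstdev(readings)
    if stdev == 0:
        return []
    return [i for i, r in enumerate(readings)
            if abs(r - mean) / stdev > threshold]

# hypothetical hourly consumption with one spike
hourly_kwh = [42.0, 41.5, 43.2, 42.8, 41.9, 120.0, 42.4, 43.0]
print(flag_anomalies(hourly_kwh))  # [5] -- the 120.0 spike
```

Real deployments would use the forecasting and clustering techniques the deck covers rather than a single global threshold, but the shape of the problem is the same: model the expected signal, then flag what deviates.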
SaveCal (SaveCal.com) Save Cal Affordable Financial Programs for Energy Efficiency - Savecal.com
SaveCal (SaveCal.com) Save Cal Affordable Financial Programs for Energy Efficiency - Savecal.com was founded in 1997 and we are recognized leaders in the field of commercial and residential construction. The company was started on the principle of hard work at a fair price and completing jobs correctly. We are well known for quality workmanship in all types of construction, including electrical, plumbing, and landscaping. Completing jobs correctly is not our only principle; we are also committed to being environmentally friendly. Our goal is to bring recognition to environmental issues that harm healthy living in California. Our experienced team is always here to provide you with the most up-to-date information on how to conserve energy, save money and, most importantly, save California.
Livio Nichilo of Internat Energy Solutions presents integrated renewable energy strategies for buildings. Presented to the Toronto Certified Sustainable Building Advisor program.
"Business Plan 2007-2011, the Gulf of Mexico, and Renewable Fuels"Petrobras
Petrobras' business plan for 2007-2011 outlines $87.1 billion in investments across its operations. 56% or $49.3 billion will go towards expanding production in exploration and production (E&P). Key goals include optimizing development of existing reserves and maintaining a minimum reserve to production ratio of 15 years. The plan also details investments in refining and downstream operations, including a $2.5 billion new refinery in Pernambuco, Brazil and $1 billion for upgrading a refinery in Texas, USA. Financial targets include accrued economic profit of $83.4 billion by 2015 and maintaining a debt to capitalization ratio below 30%.
Seth Kaplan presentation at EUCI conference on Renewable Energy Markets in Ne... - Seth Kaplan
The document discusses the history and future of renewable energy markets, arguing that [1] environmental concerns, especially around climate change, have been and will continue to be the key driver, [2] current policy instability stems partly from ambivalence but climate change requires bold action, and [3] the future will be defined by a struggle between preserving fossil fuel industries and building out clean energy technologies sufficiently to stabilize the climate.
I recently gave this presentation at the World Energy Engineering Congress in Washington, DC (October 2013), and it was well received. It shows a "big picture" approach and gives plenty of real-world examples.
Auctioning RE projects: Lessons learned from auction design for renewable electricity - Leonardo ENERGY
Dr. David Jacobs gave a presentation on lessons learned from auction design for renewable electricity projects. He discussed key questions for policymakers to consider when designing renewable energy auctions, including what is being auctioned and when, who can participate, how bids will be evaluated, what price determination mechanism will be used, what payments winners will receive, and how to ensure projects are actually built. Auctions can promote cost efficiency but require significant administration; feed-in tariffs are simpler to administer but lack price competition. An optimal approach may combine elements of both auctions and feed-in tariffs.
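The price-determination question Jacobs raises can be made concrete with a toy capacity auction. The bids and the uniform-price rule below are hypothetical, chosen only to show how winner selection and the clearing price interact:

```python
def clear_auction(bids, capacity_mw):
    """Select the cheapest bids until the auctioned capacity is filled.

    bids: list of (bidder, price_per_mwh, offered_mw) tuples.
    Returns the winning allocations and the uniform clearing price
    (the highest accepted bid); under pay-as-bid, each winner would
    instead be paid its own bid price.
    """
    winners, awarded = [], 0
    for bidder, price, mw in sorted(bids, key=lambda b: b[1]):
        if awarded >= capacity_mw:
            break
        take = min(mw, capacity_mw - awarded)  # last winner may be partially awarded
        winners.append((bidder, price, take))
        awarded += take
    clearing_price = winners[-1][1] if winners else None
    return winners, clearing_price

# hypothetical bids for a 100 MW tender
bids = [("A", 45.0, 50), ("B", 38.0, 40), ("C", 52.0, 60), ("D", 41.0, 30)]
winners, price = clear_auction(bids, 100)
print(winners)  # B takes 40 MW, D 30 MW, A a partial 30 MW
print(price)    # 45.0 -- the marginal accepted bid sets the uniform price
```

The design questions in the talk (qualification rules, penalties for non-delivery, ceiling prices) all layer on top of this basic clearing step.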
The Low Carbon Finland 2050 project by VTT Technical Research Centre of Finland aims to assess the technological opportunities and challenges involved in reducing Finland’s greenhouse gas emissions. A target for reduction is set as at least 80% from the 1990 level by 2050 as part of an international effort, which requires strong RD&D in clean energy technologies. Key findings of the project are presented in this publication, which aims to stimulate enlightening and multidisciplinary discussions on low-carbon futures for Finland.
The project gathered together VTT’s technology experts in clean energy production, smart energy infrastructures, transport, buildings, and industrial systems as well as experts in energy system modelling and foresight. VTT’s leading edge “Low Carbon and Smart Energy” enables new solutions with a demonstration that is the first of its kind in Finland, and the introduction of new energy technology onto national and global markets.
Improving the performance of OpenSubdiv* on Intel Architecture - Intel® Software
This document discusses improving the performance of OpenSubdiv on Intel architecture processors. It provides copyright information and legal disclaimers related to using Intel products. The document contains information about optimizing OpenSubdiv to take advantage of features in Intel processors that can improve performance.
DreamWorks Animation achieved a 4x speedup in skin deformation for character animation by utilizing SIMD (Single Instruction, Multiple Data) instructions. Anson Chu, Alex Wells, and Martin Watt of DreamWorks Animation presented on how they optimized their skinning deformation algorithm using SIMD at an Intel conference in August 2015. The presentation covered legal disclaimers and information regarding the use of Intel products and benchmarks.
DreamWorks Animation was working to optimize the performance of its 3D character animation system. The system relies heavily on hierarchical transformations represented by matrices to define the positions and orientations of character skeleton joints. Evaluating these hierarchical transformations was found to be on the critical path and not well suited to parallelization. DreamWorks investigated representing transformations differently using "X-Form blocks" to allow for deferred evaluation, achieving a 1.6x speedup in the motion system.
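The hierarchical evaluation being optimized can be pictured as a fold of each joint's local transform onto its parent's world transform. This toy version (plain Python, with an invented three-joint chain; DreamWorks' actual system and its "X-Form blocks" are not public) shows why each joint depends on its parent's result, which is what puts the traversal on the critical path:

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]

def world_transforms(parents, local_xforms):
    """Compose each joint's local transform with its parent's world
    transform. Parents must precede children in the arrays, the usual
    skeleton ordering; -1 marks the root."""
    world = []
    for i, local in enumerate(local_xforms):
        world.append(local if parents[i] < 0
                     else mat_mul(world[parents[i]], local))
    return world

# a 3-joint chain (root -> elbow -> wrist), each offset 1 unit in x
parents = [-1, 0, 1]
locals_ = [translation(0, 0, 0), translation(1, 0, 0), translation(1, 0, 0)]
print(world_transforms(parents, locals_)[2][0][3])  # wrist world x = 2
```

Because `world[parents[i]]` must exist before joint `i` can be evaluated, the chain serializes; deferring evaluation (as the X-Form approach reportedly does) is one way to break that dependency.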
Elijah Charles from Intel presented this deck at the 2016 HPC Advisory Council Switzerland Conference.
"The Exascale computing challenge is the current Holy Grail for high performance computing. It envisages building HPC systems capable of 10^18 floating point operations under a power input in the range of 20-40 MW. To achieve this feat, several barriers need to be overcome. These barriers or “walls” are not completely independent of each other, but present a lens through which HPC system design can be viewed as a whole, and its composing sub-systems optimized to overcome the persistent bottlenecks."
Watch the video presentation: http://wp.me/p3RLHQ-f7X
See more talks in the Switzerland HPC Conference Video Gallery: http://insidehpc.com/2016-swiss-hpc-conference/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses Intel's HPC portfolio and roadmap update. It provides an overview of the new Intel Xeon E5-2600 v2 processor family, highlighting its efficiency, performance, and security features. The Xeon E5-2600 v2 is expected to deliver up to 30% more performance using the same or less power compared to the previous generation. It offers up to 12 cores, 30MB of cache, and support for the latest I/O and memory technologies to provide powerful and efficient processing for modern data centers.
Ceph Day Shanghai - VSM (Virtual Storage Manager) - Simplify Ceph Management ... - Ceph Community
VSM (Virtual Storage Manager) is an open source tool developed by Intel to simplify Ceph storage cluster management and foster greater adoption of Ceph. It allows administrators to deploy, manage, monitor, and integrate Ceph clusters with cloud orchestrators like OpenStack in an easy to use manner through features like automated deployment, visual cluster management, health and performance monitoring, and OpenStack integration. Written in Python, VSM is released under the Apache 2.0 license and supports both Ceph and OpenStack on various Linux distributions.
Transforming Business with Advanced Analytics - Intel IT Center
The document discusses Intel's new Xeon E7 v2 processor family, which provides significantly better performance and lower total cost of ownership for in-memory analytics workloads compared to IBM Power processors. Key capabilities of the Xeon E7 v2 include up to 2x higher performance, 3x greater memory capacity, 4x higher I/O bandwidth, and reliability features. It established 20 new world records across various analytics benchmarks for 2-socket, 4-socket, and 8-socket configurations. The processor is aimed at transforming businesses through advanced analytics in industries like retail, healthcare, and manufacturing.
Yocto Project Open Source Build System and Collaboration Initiative - Marcelo Sanz
The Yocto Project creates a custom embedded Linux distribution for a device by building recipes that define how to obtain, patch, compile and package software, rather than using an existing Linux distribution. It provides a common build environment and tools to configure, build and test embedded systems across different processor architectures.
The document discusses Intel's Network Builders program, which aims to accelerate software-defined infrastructure adoption through open standards and platforms. It does this by investing in strong ecosystems, committing to open source, and leveraging Intel's technology leadership. The program enables partners through technical resources, matchmaking opportunities, and marketing support. It also works with network operators on proofs of concept and trials. The goal is to move the industry from early SDN/NFV trials to commercial deployments through this ecosystem collaboration.
Intel: Mobility and the Transformation of the Workplace - Expolink
The document discusses the transition to the third technological platform of ICT and the impact on workplaces and mobility. It notes that millions of applications, services, information sources, content and user experiences will be available to billions of users across thousands of applications on hundreds of platforms. Charts show growth in tablet PC shipments, CPU shipments and Intel's revenue and investment returns. The strategic directions of integration and common development are discussed.
2014-vol18-iss-2-intel-technology-journal - Ryan M. Cohen
This document provides a summary of the history of Android on Intel processors from 2009 to present day. It discusses the unofficial Android-x86 project in 2009 that first ported Android to x86 chips. In 2010, Intel announced the Moorestown platform to target Android devices. In 2011, Intel and Google partnered to optimize future Android versions for Intel processors. Since then several Android phones and tablets with Intel chips have been released, including the first Intel-powered Android phone, the Lava Xolo X900, in 2012.
Lynn Comp - Intel Big Data & Cloud Summit 2013 (2) - IntelAPAC
This document discusses architecting cloud infrastructure for the future. It addresses requirements through workload optimized technologies, composable resources, and software defined infrastructure. Workload optimized technologies match different workloads with optimized server, storage, and network technologies. Composable resources allow flexible, efficient data centers through modular compute, memory, storage, and fabric resources that can be pooled and shared. Software defined infrastructure involves re-architecting the network through software defined networking and re-architecting storage through software defined storage.
This document discusses the DPDK roadmap. Version 2.1 is scheduled for release at the end of July 2015 and will include features like dynamic memory management, cuckoo hashing, packet framework enhancements, and driver support for additional NICs. Version 2.2 is scheduled for November 2015 and aims to support bifurcated drivers assuming their availability in the Linux kernel. The document also provides information on getting involved with the DPDK community through the dpdk.org website.
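Cuckoo hashing, one of the 2.1 features mentioned, bounds lookups to at most two probes by giving every key exactly two candidate slots; inserts displace ("kick") existing entries when both slots are taken. The class below is a generic illustrative sketch of the idea, not DPDK's implementation (DPDK's `rte_hash` is written in C and heavily optimized):

```python
class CuckooHash:
    """Minimal two-table cuckoo hash: each key has exactly two
    possible slots, so get() probes at most two locations."""

    def __init__(self, size=16):
        self.size = size
        self.tables = [[None] * size, [None] * size]

    def _slot(self, key, which):
        # two independent hash functions, derived by salting with the table id
        return hash((which, key)) % self.size

    def insert(self, key, value, max_kicks=32):
        entry = (key, value)
        for _ in range(max_kicks):
            for which in (0, 1):
                i = self._slot(entry[0], which)
                if self.tables[which][i] is None:
                    self.tables[which][i] = entry
                    return True
                # evict the occupant and try to re-place it elsewhere
                self.tables[which][i], entry = entry, self.tables[which][i]
        return False  # kick chain too long: caller should resize/rehash

    def get(self, key):
        for which in (0, 1):
            e = self.tables[which][self._slot(key, which)]
            if e is not None and e[0] == key:
                return e[1]
        return None
```

The bounded probe count is what makes the scheme attractive for packet-processing lookup tables, where worst-case latency matters more than average case.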
Reinforcement Learning Coach is an open source research framework for training and evaluating reinforcement learning (RL) agents by harnessing the power of multi-core CPU processing to achieve state-of-the-art results.
AI & Computer Vision (OpenVINO) - CPBR12 - Jomar Silva
This document discusses Intel's compiler optimizations. It states that Intel's compilers may optimize Intel microprocessors differently than non-Intel microprocessors. Some optimizations, such as those using the SSE2, SSE3, and SSSE3 instruction sets, are designed for Intel microprocessors. Intel does not guarantee the availability, functionality, or effectiveness of optimizations on non-Intel microprocessors. The document advises checking product guides for specific instruction set coverage and carries a notice revision date of August 4, 2011.
Similar to Advancing Science in Alternative Energy and Bioengineering with Many-Core Processors
The document discusses the top 5 technologies that all organizations must understand: digital transformation, quantum computing, IoT, 5G, and AI/HPC. It provides an overview of each technology including opportunities and threats to organizations. The document emphasizes that understanding these emerging technologies is mandatory as the information revolution changes many aspects of life and business.
Preparing to program Aurora at Exascale - Early experiences and future direct... (inside-BigData.com)
In this deck from IWOCL / SYCLcon 2020, Hal Finkel from Argonne National Laboratory presents: Preparing to program Aurora at Exascale - Early experiences and future directions.
"Argonne National Laboratory’s Leadership Computing Facility will be home to Aurora, our first exascale supercomputer. Aurora promises to take scientific computing to a whole new level, and scientists and engineers from many different fields will take advantage of Aurora’s unprecedented computational capabilities to push the boundaries of human knowledge. In addition, Aurora’s support for advanced machine-learning and big-data computations will enable scientific workflows incorporating these techniques along with traditional HPC algorithms. Programming the state-of-the-art hardware in Aurora will be accomplished using state-of-the-art programming models. Some of these models, such as OpenMP, are long-established in the HPC ecosystem. Other models, such as Intel’s oneAPI, based on SYCL, are relatively-new models constructed with the benefit of significant experience. Many applications will not use these models directly, but rather, will use C++ abstraction libraries such as Kokkos or RAJA. Python will also be a common entry point to high-performance capabilities. As we look toward the future, features in the C++ standard itself will become increasingly relevant for accessing the extreme parallelism of exascale platforms.
This presentation will summarize the experiences of our team as we prepare for Aurora, exploring how to port applications to Aurora’s architecture and programming models, and distilling the challenges and best practices we’ve developed to date. oneAPI/SYCL and OpenMP are both critical models in these efforts, and while the ecosystem for Aurora has yet to mature, we’ve already had a great deal of success. Importantly, we are not passive recipients of programming models developed by others. Our team works not only with vendor-provided compilers and tools, but also develops improved open-source LLVM-based technologies that feed both open-source and vendor-provided capabilities. In addition, we actively participate in the standardization of OpenMP, SYCL, and C++. To conclude, I’ll share our thoughts on how these models can best develop in the future to support exascale-class systems."
Watch the video: https://wp.me/p3RLHQ-lPT
Learn more: https://www.iwocl.org/iwocl-2020/conference-program/
and
https://www.anl.gov/topic/aurora
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck, Greg Wahl from Advantech presents: Transforming Private 5G Networks.
Advantech Networks & Communications Group is driving innovation in next-generation network solutions with their High Performance Servers. We provide business critical hardware to the world's leading telecom and networking equipment manufacturers with both standard and customized products. Our High Performance Servers are highly configurable platforms designed to balance the best in x86 server-class processing performance with maximum I/O and offload density. The systems are cost effective, highly available and optimized to meet next generation networking and media processing needs.
“Advantech’s Networks and Communication Group has been both an innovator and trusted enabling partner in the telecommunications and network security markets for over a decade, designing and manufacturing products for OEMs that accelerate their network platform evolution and time to market,” said Advantech Vice President of Networks & Communications Group, Ween Niu. “In the new IP Infrastructure era, we will be expanding our expertise in Software Defined Networking (SDN) and Network Function Virtualization (NFV), two of the essential conduits to 5G infrastructure agility, making networks easier to install, secure, automate and manage in a cloud-based infrastructure.”
In addition to innovation in air interface technologies and architecture extensions, 5G will also need a new generation of network computing platforms to run the emerging software defined infrastructure, one that provides greater topology flexibility, essential to deliver on the promises of high availability, high coverage, low latency and high bandwidth connections. This will open up new parallel industry opportunities through dedicated 5G network slices reserved for specific industries dedicated to video traffic, augmented reality, IoT, connected cars etc. 5G unlocks many new doors and one of the keys to its enablement lies in the elasticity and flexibility of the underlying infrastructure.
Advantech’s corporate vision is to enable an intelligent planet. The company is a global leader in the fields of IoT intelligent systems and embedded platforms. To embrace the trends of IoT, big data, and artificial intelligence, Advantech promotes IoT hardware and software solutions with the Edge Intelligence WISE-PaaS core to assist business partners and clients in connecting their industrial chains. Advantech is also working with business partners to co-create business ecosystems that accelerate the goal of industrial intelligence.
Watch the video: https://wp.me/p3RLHQ-lPQ
* Company website: https://www.advantech.com/
* Solution page: https://www2.advantech.com/nc/newsletter/NCG/SKY/benefits.html
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The Incorporation of Machine Learning into Scientific Simulations at Lawrence... (inside-BigData.com)
In this deck from the Stanford HPC Conference, Katie Lewis from Lawrence Livermore National Laboratory presents: The Incorporation of Machine Learning into Scientific Simulations at Lawrence Livermore National Laboratory.
"Scientific simulations have driven computing at Lawrence Livermore National Laboratory (LLNL) for decades. During that time, we have seen significant changes in hardware, tools, and algorithms. Today, data science, including machine learning, is one of the fastest growing areas of computing, and LLNL is investing in hardware, applications, and algorithms in this space. While the use of simulations to focus and understand experiments is well accepted in our community, machine learning brings new challenges that need to be addressed. I will explore applications for machine learning in scientific simulations that are showing promising results and further investigation that is needed to better understand its usefulness."
Watch the video: https://youtu.be/NVwmvCWpZ6Y
Learn more: https://computing.llnl.gov/research-area/machine-learning
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod... (inside-BigData.com)
In this deck from the Stanford HPC Conference, DK Panda from Ohio State University presents: How to Achieve High-Performance, Scalable and Distributed DNN Training on Modern HPC Systems?
"This talk will start with an overview of challenges being faced by the AI community to achieve high-performance, scalable and distributed DNN training on Modern HPC systems with both scale-up and scale-out strategies. After that, the talk will focus on a range of solutions being carried out in my group to address these challenges. The solutions will include: 1) MPI-driven Deep Learning, 2) Co-designing Deep Learning Stacks with High-Performance MPI, 3) Out-of- core DNN training, and 4) Hybrid (Data and Model) parallelism. Case studies to accelerate DNN training with popular frameworks like TensorFlow, PyTorch, MXNet and Caffe on modern HPC systems will be presented."
Watch the video: https://youtu.be/LeUNoKZVuwQ
Learn more: http://web.cse.ohio-state.edu/~panda.2/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ... (inside-BigData.com)
In this deck from the Stanford HPC Conference, Nick Nystrom and Paola Buitrago provide an update from the Pittsburgh Supercomputing Center.
Nick Nystrom is Chief Scientist at the Pittsburgh Supercomputing Center (PSC). Nick is architect and PI for Bridges, PSC's flagship system that successfully pioneered the convergence of HPC, AI, and Big Data. He is also PI for the NIH Human Biomolecular Atlas Program’s HIVE Infrastructure Component and co-PI for projects that bring emerging AI technologies to research (Open Compass), apply machine learning to biomedical data for breast and lung cancer (Big Data for Better Health), and identify causal relationships in biomedical big data (the Center for Causal Discovery, an NIH Big Data to Knowledge Center of Excellence). His current research interests include hardware and software architecture, applications of machine learning to multimodal data (particularly for the life sciences) and to enhance simulation, and graph analytics.
Watch the video: https://youtu.be/LWEU1L1o7yY
Learn more: https://www.psc.edu/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses using systems intelligence and artificial intelligence/neural networks to enhance semiconductor electronic design automation (EDA) workflows. Telemetry data collected from EDA jobs and infrastructure is analyzed using complex event processing, machine learning models, and messaging substrates to provide insights that can optimize EDA pipelines and infrastructure. The approach aims to allow both internal and external augmentation of EDA processes and environments through unsupervised and incremental learning.
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring (inside-BigData.com)
In this deck from the Stanford HPC Conference, Nicole Xu from Stanford University describes how she transformed a common jellyfish into a bionic creature that is part animal and part machine.
"Animal locomotion and bioinspiration have the potential to expand the performance capabilities of robots, but current implementations are limited. Mechanical soft robots leverage engineered materials and are highly controllable, but these biomimetic robots consume more power than corresponding animal counterparts. Biological soft robots from a bottom-up approach offer advantages such as speed and controllability but are limited to survival in cell media. Instead, biohybrid robots that comprise live animals and self- contained microelectronic systems leverage the animals’ own metabolism to reduce power constraints and body as an natural scaffold with damage tolerance. We demonstrate that by integrating onboard microelectronics into live jellyfish, we can enhance propulsion up to threefold, using only 10 mW of external power input to the microelectronics and at only a twofold increase in cost of transport to the animal. This robotic system uses 10 to 1000 times less external power per mass than existing swimming robots in literature and can be used in future applications for ocean monitoring to track environmental changes."
Watch the video: https://youtu.be/HrmJFyvInj8
Learn more: https://sanfrancisco.cbslocal.com/2020/02/05/stanford-research-project-common-jellyfish-bionic-sea-creatures/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Stanford HPC Conference, Peter Dueben from the European Centre for Medium-Range Weather Forecasts (ECMWF) presents: Machine Learning for Weather Forecasts.
"I will present recent studies that use deep learning to learn the equations of motion of the atmosphere, to emulate model components of weather forecast models and to enhance usability of weather forecasts. I will than talk about the main challenges for the application of deep learning in cutting-edge weather forecasts and suggest approaches to improve usability in the future."
Peter is contributing to the development and optimization of weather and climate models for modern supercomputers. He is focusing on a better understanding of model error and model uncertainty, on the use of reduced numerical precision that is optimised for a given level of model error, on global cloud-resolving simulations with ECMWF's forecast model, and on the use of machine learning, in particular deep learning, to improve the workflow and predictions. Peter graduated in Physics and wrote his PhD thesis at the Max Planck Institute for Meteorology in Germany. He worked as a Postdoc with Tim Palmer at the University of Oxford and took up a position as University Research Fellow of the Royal Society at the European Centre for Medium-Range Weather Forecasts (ECMWF) in 2017.
Watch the video: https://youtu.be/ks3fkRj8Iqc
Learn more: https://www.ecmwf.int/
and
http://www.hpcadvisorycouncil.com/events/2020/stanford-workshop/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck, Gilad Shainer from the HPC AI Advisory Council describes how this organization fosters innovation in the high performance computing community.
"The HPC-AI Advisory Council’s mission is to bridge the gap between high-performance computing (HPC) and Artificial Intelligence (AI) use and its potential, bring the beneficial capabilities of HPC and AI to new users for better research, education, innovation and product manufacturing, bring users the expertise needed to operate HPC and AI systems, provide application designers with the tools needed to enable parallel computing, and to strengthen the qualification and integration of HPC and AI system products."
Watch the video: https://wp.me/p3RLHQ-lNz
Learn more: http://hpcadvisorycouncil.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Today RIKEN in Japan announced that the Fugaku supercomputer will be made available for research projects aimed at combating COVID-19.
"Fugaku is currently being installed and is scheduled to be available to the public in 2021. However, faced with the devastating disaster unfolding before our eyes, RIKEN and MEXT decided to make a portion of the computational resources of Fugaku available for COVID-19-related projects ahead of schedule while continuing the installation process.
Fugaku is being developed not only for progress in science, but also to help build the society dubbed "Society 5.0" by the Japanese government, in which all people will live safe and comfortable lives. The current initiative to fight the novel coronavirus is driven by the philosophy behind the development of Fugaku."
Initial Projects
* Exploring new drug candidates for COVID-19 by "Fugaku" (Yasushi Okuno, RIKEN / Kyoto University)
* Prediction of conformational dynamics of proteins on the surface of SARS-CoV-2 using Fugaku (Yuji Sugita, RIKEN)
* Simulation analysis of pandemic phenomena (Nobuyasu Ito, RIKEN)
* Fragment molecular orbital calculations for COVID-19 proteins (Yuji Mochizuki, Rikkyo University)
In this deck from the Performance Optimisation and Productivity group, Lubomir Riha from IT4Innovations presents: Energy Efficient Computing using Dynamic Tuning.
"We now live in a world of power-constrained architectures and systems and power consumption represents a significant cost factor in the overall HPC system economy. For these reasons, in recent years researchers, supercomputing centers and major vendors have developed new tools and methodologies to measure and optimize the energy consumption of large-scale high performance system installations. Due to the link between energy consumption, power consumption and execution time of an application executed by the final user, it is important for these tools and the methodology used to consider all these aspects, empowering the final user and the system administrator with the capability of finding the best configuration given different high level objectives.
This webinar focused on tools designed to improve the energy-efficiency of HPC applications using a methodology of dynamic tuning of HPC applications, developed under the H2020 READEX project. The READEX methodology has been designed for exploiting the dynamic behaviour of software. At design time, different runtime situations (RTS) are detected and optimized system configurations are determined. RTSs with the same configuration are grouped into scenarios, forming the tuning model. At runtime, the tuning model is used to switch system configurations dynamically.
The MERIC tool, which implements the READEX methodology, is presented. It supports manual or binary instrumentation of the analysed applications to simplify the analysis. This instrumentation is used to identify and annotate the significant regions in the HPC application. Automatic binary instrumentation annotates regions with significant runtime. Manual instrumentation, which can be combined with automatic, allows the code developer to annotate regions of particular interest."
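The design-time/runtime split described above can be sketched in a few lines. This is a minimal illustrative model, not the MERIC interface: the class name, region labels, and frequency values are all invented for the example.

```python
# Sketch of a READEX-style tuning model: at design time, each runtime
# situation (region) is mapped to its best-found system configuration;
# at runtime, entering an annotated region switches the configuration.

class TuningModel:
    def __init__(self):
        self.scenarios = {}  # region name -> configuration dict

    def learn(self, region, config):
        # design time: record the optimal configuration for this region
        self.scenarios[region] = config

    def configure(self, region, default):
        # runtime: switch to the learned configuration, or fall back
        return self.scenarios.get(region, default)

default = {"core_freq_mhz": 2400, "uncore_freq_mhz": 2000}
model = TuningModel()
# hypothetical design-time findings: one region is memory-bound,
# the other compute-bound, so they get opposite frequency settings
model.learn("stencil_sweep", {"core_freq_mhz": 1800, "uncore_freq_mhz": 2400})
model.learn("dense_solve", {"core_freq_mhz": 2800, "uncore_freq_mhz": 1600})

assert model.configure("stencil_sweep", default)["core_freq_mhz"] == 1800
assert model.configure("io_phase", default) == default  # unannotated region
```

The real tooling does the "learn" step by measuring instrumented regions under many configurations, but the lookup-and-switch structure at runtime is the same idea.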
Watch the video: https://wp.me/p3RLHQ-lJP
Learn more: https://pop-coe.eu/blog/14th-pop-webinar-energy-efficient-computing-using-dynamic-tuning
and
https://code.it4i.cz/vys0053/meric
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
The document discusses how DDN A3I storage solutions and Nvidia's SuperPOD platform can enable HPC at scale. It provides details on DDN's A3I appliances that are optimized for AI and deep learning workloads and validated for Nvidia's DGX-2 SuperPOD reference architecture. The solutions are said to deliver the fastest performance, effortless scaling, reliability and flexibility for data-intensive workloads.
In this deck, Paul Isaacs from Linaro presents: State of ARM-based HPC. This talk provides an overview of applications and infrastructure services successfully ported to AArch64 and benefiting from scale.
"With its debut on the TOP500, the 125,000-core Astra supercomputer at New Mexico’s Sandia Labs uses Cavium ThunderX2 chips to mark Arm’s entry into the petascale world. In Japan, the Fujitsu A64FX Arm-based CPU in the pending Fugaku supercomputer has been optimized to achieve high-level, real-world application performance, anticipating up to one hundred times the application execution performance of the K computer. K was the first computer to top 10 petaflops in 2011."
Watch the video: https://wp.me/p3RLHQ-lIT
Learn more: https://www.linaro.org/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Versal Premium ACAP for Network and Cloud Acceleration (inside-BigData.com)
Today Xilinx announced Versal Premium, the third series in the Versal ACAP portfolio. The Versal Premium series features highly integrated, networked and power-optimized cores and the industry’s highest bandwidth and compute density on an adaptable platform. Versal Premium is designed for the highest bandwidth networks operating in thermally and spatially constrained environments, as well as for cloud providers who need scalable, adaptable application acceleration.
Versal is the industry’s first adaptive compute acceleration platform (ACAP), a revolutionary new category of heterogeneous compute devices with capabilities that far exceed those of conventional silicon architectures. Developed on TSMC’s 7-nanometer process technology, Versal Premium combines software programmability with dynamically configurable hardware acceleration and pre-engineered connectivity and security features to enable a faster time-to-market. The Versal Premium series delivers up to 3X higher throughput compared to current generation FPGAs, with built-in Ethernet, Interlaken, and cryptographic engines that enable fast and secure networks. The series doubles the compute density of currently deployed mainstream FPGAs and provides the adaptability to keep pace with increasingly diverse and evolving cloud and networking workloads.
Learn more: https://insidehpc.com/2020/03/xilinx-announces-versal-premium-acap-for-network-and-cloud-acceleration/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently (inside-BigData.com)
In this video from the Rice Oil & Gas Conference, Chin Fang from Zettar presents: Moving Massive Amounts of Data across Any Distance Efficiently.
The objective of this talk is to present two on-going projects aiming at improving and ensuring highly efficient bulk transferring or streaming of massive amounts of data over digital connections across any distance. It examines the current state of the art, a few very common misconceptions, the differences among the three major types of data movement solutions, a current initiative attempting to improve data movement efficiency from the ground up, and another multi-stage project that shows how to conduct long distance, large scale data movement at speed and scale internationally. Both projects have real world motivations, e.g. the ambitious data transfer requirements of Linac Coherent Light Source II (LCLS-II) [1], a premier preparation project of the U.S. DOE Exascale Computing Initiative (ECI) [2]. Their immediate goals are described and explained, together with the solution used for each. Findings and early results are reported. Possible future work is outlined.
Watch the video: https://wp.me/p3RLHQ-lBX
Learn more: https://www.zettar.com/
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from the Rice Oil & Gas Conference, Bradley McCredie from AMD presents: Scaling TCO in a Post Moore's Law Era.
"While foundries bravely drive forward to overcome the technical and economic challenges posed by scaling to 5nm and beyond, Moore’s law alone can provide only a fraction of the performance / watt and performance / dollar gains needed to satisfy the demands of today’s high performance computing and artificial intelligence applications. To close the gap, multiple strategies are required. First, new levels of innovation and design efficiency will supplement technology gains to continue to deliver meaningful improvements in SoC performance. Second, heterogenous compute architectures will create x-factor increases of performance efficiency for the most critical applications. Finally, open software frameworks, APIs, and toolsets will enable broad ecosystems of application level innovation."
Watch the video:
Learn more: http://amd.com
and
https://rice2020oghpc.rice.edu/program-2/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
CUDA-Python and RAPIDS for blazing fast scientific computing (inside-BigData.com)
In this deck from the ECSS Symposium, Abe Stern from NVIDIA presents: CUDA-Python and RAPIDS for blazing fast scientific computing.
"We will introduce Numba and RAPIDS for GPU programming in Python. Numba allows us to write just-in-time compiled CUDA code in Python, giving us easy access to the power of GPUs from a powerful high-level language. RAPIDS is a suite of tools with a Python interface for machine learning and dataframe operations. Together, Numba and RAPIDS represent a potent set of tools for rapid prototyping, development, and analysis for scientific computing. We will cover the basics of each library and go over simple examples to get users started. Finally, we will briefly highlight several other relevant libraries for GPU programming."
Watch the video: https://wp.me/p3RLHQ-lvu
Learn more: https://developer.nvidia.com/rapids
and
https://www.xsede.org/for-users/ecss/ecss-symposium
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from FOSDEM 2020, Colin Sauze from Aberystwyth University describes the development of a RaspberryPi cluster for teaching an introduction to HPC.
"The motivation for this was to overcome four key problems faced by new HPC users:
* The availability of a real HPC system and the effect running training courses can have on it; conversely, the availability of spare resources on the real system can cause problems for the training course.
* A fear of using a large and expensive HPC system for the first time and worries that doing something wrong might damage the system.
* That HPC systems are very abstract systems sitting in data centres that users never see; this makes it difficult for them to understand exactly what it is they are using.
* That new users fail to understand resource limitations, in part because the vast resources in modern HPC systems allow a lot of mistakes to be made before running out of resources. A more resource-constrained system makes this easier to understand.
The talk will also discuss some of the technical challenges in deploying an HPC environment to a Raspberry Pi and attempts to keep that environment as close to a "real" HPC system as possible. The issues in trying to automate the installation process will also be covered."
Learn more: https://github.com/colinsauze/pi_cluster
and
https://fosdem.org/2020/schedule/events/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
In this deck from ATPESC 2019, Ken Raffenetti from Argonne presents an overview of HPC interconnects.
"The Argonne Training Program on Extreme-Scale Computing (ATPESC) provides intensive, two-week training on the key skills, approaches, and tools to design, implement, and execute computational science and engineering applications on current high-end computing systems and the leadership-class computing systems of the future."
Watch the video: https://wp.me/p3RLHQ-luc
Learn more: https://extremecomputingtraining.anl.gov/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an... (Jason Yip)
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
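To make the mutation-testing idea concrete, here is a toy Python sketch of one hypothetical operator (dropping a training phrase from an intent) and of "killing" a mutant with a test scenario. The chatbot model and operator are invented for illustration; they are not the paper's actual operators or Eclipse tooling.

```python
# A minimal intent-matching "chatbot" and one mutation operator over its
# design. A test scenario kills a mutant if any step observes a changed
# reply; surviving mutants reveal gaps in the scenario's coverage.

def make_bot(intents):
    """intents: dict mapping intent name -> (set of trigger phrases, response)."""
    def bot(utterance):
        for name, (phrases, response) in intents.items():
            if utterance.lower() in phrases:
                return response
        return "Sorry, I did not understand."
    return bot

def mutate_drop_phrase(intents, intent, phrase):
    # mutation operator: remove one training phrase from an intent
    mutated = {k: (set(v[0]), v[1]) for k, v in intents.items()}
    mutated[intent][0].discard(phrase)
    return mutated

def kills(bot, scenario):
    # the mutant is "killed" if any scenario step sees a different reply
    return any(bot(u) != expected for u, expected in scenario)

intents = {"greet": ({"hi", "hello"}, "Hello! How can I help?")}
scenario = [("hello", "Hello! How can I help?")]  # one test scenario

mutant = make_bot(mutate_drop_phrase(intents, "greet", "hello"))
assert kills(mutant, scenario)        # the scenario catches this fault

survivor = make_bot(mutate_drop_phrase(intents, "greet", "hi"))
assert not kills(survivor, scenario)  # "hi" is never exercised: a coverage gap
```

The surviving mutant is exactly the signal such a framework gives a tester: the scenario suite needs a step that exercises the "hi" phrase.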
What is an RPA CoE? Session 1 – CoE Vision (DianaGray10)
In the first session, we will review the organization's vision and how it impacts the CoE structure.
Topics covered:
• The role of a steering committee
• How do the organization’s priorities determine CoE Structure?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect, Anika Systems
Dandelion Hashtable: beyond billion requests per second on a commodity server (Antonios Katsarakis)
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite efforts to optimize hashtables that go as far as sacrificing core functionality, state-of-the-art designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state of the art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open-addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line-chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. On a commodity server and a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
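As a rough single-threaded model of the closed-addressing idea, one might sketch buckets as short bounded chains (standing in for a cache line of entries), where deletes free their slot instantly rather than leaving tombstones. The concurrency, prefetching, and non-blocking parallel resize that make DLHT fast are omitted, and the names are invented for the example.

```python
# Closed addressing with bounded "cache-line" chains: one bucket per key,
# at most BUCKET_CAP entries scanned per request, slots reusable on delete.

BUCKET_CAP = 4  # entries per bucket, modeling one cache line

class BoundedChainTable:
    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        b = self._bucket(key)
        for i, (k, _) in enumerate(b):
            if k == key:
                b[i] = (key, value)  # in-place update
                return True
        if len(b) < BUCKET_CAP:
            b.append((key, value))
            return True
        return False  # bucket full: a real design chains a line or resizes

    def get(self, key):
        # one bucket scan per Get, modeling the single memory access
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        b = self._bucket(key)
        for i, (k, _) in enumerate(b):
            if k == key:
                b.pop(i)  # the slot is reusable instantly, no tombstone
                return True
        return False
```

Contrast with open addressing, where a deleted slot typically needs a tombstone (so later probes still find keys placed past it), which is exactly the trade-off the deck argues against.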
Getting the Most Out of ScyllaDB Monitoring: ShareChat's Tips (ScyllaDB)
ScyllaDB monitoring provides a lot of useful information. But sometimes it’s not easy to find the root of the problem if something is wrong or even estimate the remaining capacity by the load on the cluster. This talk shares our team's practical tips on: 1) How to find the root of the problem by metrics if ScyllaDB is slow 2) How to interpret the load and plan capacity for the future 3) Compaction strategies and how to choose the right one 4) Important metrics which aren’t available in the default monitoring setup.
Must Know Postgres Extension for DBA and Developer during Migration (Mydbops)
Mydbops Opensource Database Meetup 16
Topic: Must-Know PostgreSQL Extensions for Developers and DBAs During Migration
Speaker: Deepak Mahto, Founder of DataCloudGaze Consulting
Date & Time: 8th June | 10 AM - 1 PM IST
Venue: Bangalore International Centre, Bangalore
Abstract: Discover how PostgreSQL extensions can be your secret weapon! This talk explores how key extensions enhance database capabilities and streamline the migration process for users moving from other relational databases like Oracle.
Key Takeaways:
* Learn about crucial extensions like oracle_fdw, pgtt, and pg_audit that ease migration complexities.
* Gain valuable strategies for implementing these extensions in PostgreSQL to achieve license freedom.
* Discover how these key extensions can empower both developers and DBAs during the migration process.
* Don't miss this chance to gain practical knowledge from an industry expert and stay updated on the latest open-source database trends.
Mydbops Managed Services specializes in taking the pain out of database management while optimizing performance. Since 2015, we have been providing top-notch support and assistance for the top three open-source databases: MySQL, MongoDB, and PostgreSQL.
Our team offers a wide range of services, including assistance, support, consulting, 24/7 operations, and expertise in all relevant technologies. We help organizations improve their database's performance, scalability, efficiency, and availability.
Contact us: info@mydbops.com
Visit: https://www.mydbops.com/
Follow us on LinkedIn: https://in.linkedin.com/company/mydbops
For more details and updates, please follow the links below.
Meetup Page : https://www.meetup.com/mydbops-databa...
Twitter: https://twitter.com/mydbopsofficial
Blogs: https://www.mydbops.com/blog/
Facebook(Meta): https://www.facebook.com/mydbops/
Lee Barnes - Path to Becoming an Effective Test Automation Engineer.pdf (leebarnesutopia)
So… you want to become a Test Automation Engineer (or hire and develop one)? While there’s quite a bit of information available about important technical and tool skills to master, there’s not enough discussion around the path to becoming an effective Test Automation Engineer that knows how to add VALUE. In my experience this has led to a proliferation of engineers who are proficient with tools and building frameworks but have skill and knowledge gaps, especially in software testing, that reduce the value they deliver with test automation.
In this talk, Lee will share his lessons learned from over 30 years of working with, and mentoring, hundreds of Test Automation Engineers. Whether you’re looking to get started in test automation or just want to improve your trade, this talk will give you a solid foundation and roadmap for ensuring your test automation efforts continuously add value. This talk is equally valuable for both aspiring Test Automation Engineers and those managing them! All attendees will take away a set of key foundational knowledge and a high-level learning path for leveling up test automation skills and ensuring they add value to their organizations.
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way that breaks data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is repaid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
Northern Engraving | Nameplate Manufacturing Process - 2024 - Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
"Scaling RAG Applications to serve millions of users", Kevin Goedecke - Fwdays
How we managed to grow and scale a RAG application from zero to thousands of users in 7 months. Lessons from technical challenges around managing high load for LLMs, RAGs and Vector databases.
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk - Fwdays
In this talk we will discuss DDoS protection tools and best practices, discuss network architectures, and look at what AWS has to offer. We will also examine one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022: what techniques helped to keep web resources available for Ukrainians, and how AWS improved DDoS protection for all customers based on the Ukraine experience.
The Microsoft 365 Migration Tutorial For Beginner.pptx - operationspcvita
This presentation will help you understand the power of Microsoft 365. It covers every productivity app included in Office 365, outlines common migration scenarios, and explains how we can help.
You can also read: https://www.systoolsgroup.com/updates/office-365-tenant-to-tenant-migration-step-by-step-complete-guide/
Advancing Science in Alternative Energy and Bioengineering with Many-Core Processors
1. Advancing Science in Alternative Energy and
Bioengineering with Many-Core Processors
W. Michael Brown, Intel® Corporation†
Jan-Michael Carrillo, Oak Ridge National Laboratory*
Intel Confidential — Do Not Forward
54th HPC User Forum
September 15-17, 2014
Seattle, Washington
†Disclaimer: This talk includes results from my tenure at ORNL* that do not necessarily
reflect the views, research directions, etc. of Intel® Corporation.
* Other names and brands may be claimed as the property of others.
3. Risk Factors
The above statements and any others in this document that refer to plans and expectations for the third quarter, the year and the future are forward-looking statements that involve a number of
risks and uncertainties. Words such as “anticipates,” “expects,” “intends,” “plans,” “believes,” “seeks,” “estimates,” “may,” “will,” “should” and their variations identify forward-looking
statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify forward-looking statements. Many factors could affect Intel’s actual results, and
variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel presently considers
the following to be the important factors that could cause actual results to differ materially from the company’s expectations. Demand could be different from Intel's expectations due to factors
including changes in business and economic conditions; customer acceptance of Intel’s and competitors’ products; supply constraints and other disruptions affecting customers; changes in
customer order patterns including order cancellations; and changes in the level of inventory at customers. Uncertainty in global economic and financial conditions poses a risk that consumers and
businesses may defer purchases in response to negative financial events, which could negatively affect product demand and other related matters. Intel operates in intensely competitive industries
that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product demand that is highly variable and difficult to forecast. Revenue and the gross
margin percentage are affected by the timing of Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors, including product
offerings and introductions, marketing programs and pricing pressures and Intel’s response to such actions; and Intel’s ability to respond quickly to technological developments and to incorporate
new features into its products. The gross margin percentage could vary significantly from expectations based on capacity utilization; variations in inventory valuation, including variations related to
the timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and execution of the manufacturing ramp and associated costs; start-up costs; excess or
obsolete inventory; changes in unit costs; defects or disruptions in the supply of materials or resources; product manufacturing quality/yields; and impairments of long-lived assets, including
manufacturing, assembly/test and intangible assets. Intel's results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its
customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates.
Expenses, particularly certain marketing and compensation expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and the
level of revenue and profits. Intel’s results could be affected by the timing of closing of acquisitions and divestitures. Intel's results could be affected by adverse effects associated with product
defects and errata (deviations from published specifications), and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust, disclosure and other issues, such
as the litigation and regulatory matters described in Intel's SEC reports. An unfavorable ruling could include monetary damages or an injunction prohibiting Intel from manufacturing or selling one or
more products, precluding particular business practices, impacting Intel’s ability to design its products, or requiring other remedies such as compulsory licensing of intellectual property. A detailed
discussion of these and other factors that could affect Intel’s results is included in Intel’s SEC filings, including the company’s most recent reports on Form 10-Q, Form 10-K and earnings release.
Rev. 7/17/13
4. Optimization Notice
4
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for
optimizations that are not unique to Intel microprocessors. These optimizations include SSE2,
SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the
availability, functionality, or effectiveness of any optimization on microprocessors not
manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel
microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for
Intel microprocessors. Please refer to the applicable product User and Reference Guides for
more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
5. 5
Outline
Motivation: A subset of science problems I have been involved with that have been difficult to investigate with previous hardware/software solutions
What we have done: Recent hardware and software advances that have enabled new investigations
What we are doing: Issues and the solutions required to continue to advance HPC capabilities
7. Liquid Crystal Biosensors
7
Label Free Diagnostics
§ Alignment changes induced by interactions with
biomolecules can be detected with optical microscopy
§ "Critical to this area is a sound understanding of the
theory and modelling of liquid-crystal materials at
interfaces. Computer simulations of the molecular and
mesoscopic interactions between thermotropic liquid-crystal
materials … are a necessity. It is important to
understand the interactions at these complicated
interfaces" - Woltman, et al., Nature Materials*, 6, 929
(2007)
§ Prior to 2013, no molecular simulations at
experimentally relevant scales
§ Computational limitation
§ Extrapolations from small-scale studies difficult/dangerous
due to the presence of undulative modes
[Figure: aligned liquid crystal in a patterned well on a glass substrate beneath an aqueous lipid-rich solution; micrographs at T = 0, 15, 45, and 75 min show lipid-depleted regions. From Woltman, et al., Nature Materials*, 6, 929 (2007).]
8. Icephobic Surfaces for Wind Power
8
Many of the regions of the world that can benefit from wind power are in cold climates
Ice can reduce the efficiency of turbines or force shutdown
Extremely difficult to probe ice formation experimentally
Probing with simulation requires large simulations at time scales sufficient to capture rare nucleation events leading to ice formation
9. Organic Solar Cells
9
Organic photovoltaic (OPV) solar cells are promising renewable energy sources
• Low cost, high flexibility, light weight
Morphology of the OPV active layer has a significant impact on device efficiency
• Donor regions should be large enough to absorb light efficiently, but small enough to allow charge excitations to diffuse to the acceptor before recombining
• High interface-to-volume ratio for efficient exciton dissociation
• Proper donor molecular alignment to optimize the carrier mobility of the donor phase
Multi-scale problem
• Very large simulations to reach experimental scales
[Figure: OPV active layer with P3HT (electron donor) and PCBM (electron acceptor) domains.]
11. 11
Many-core and GPUs
Issues with power consumption, heat dissipation, and memory access latencies
have forced a change in direction towards many-core processors for current and
next-generation supercomputers that advance the limits of peak floating point
performance
GPU-based architectures were the first affordable many-core processors
generally available
ORNL* Titan came online in late 2012 as a hybrid supercomputer capable of over
17PF HPL performance.
• Cray* XK7 architecture with a Gemini* network interconnect and nodes
containing a single AMD* Opteron* 6274 processor and Nvidia* Tesla K20X
12. 12
Center for Accelerated Application Readiness
(CAAR at ORNL*)
In order to exploit the potential performance gains on Titan, changes to the
software are required
CAAR was formed to address this issue for several codes before the upgrade to
Titan
• LAMMPS* (Large-scale Atomic/Molecular Massively Parallel Simulator) was
the molecular dynamics code chosen for CAAR
• We implemented the ‘GPU package’ for LAMMPS* with new algorithms suited for
running on hybrid GPGPU architectures
Brown, W.M., Wang, P., Plimpton, S.J., Tharrington, A.N. Implementing Molecular Dynamics on Hybrid High Performance
Computers - Short Range Forces. Computer Physics Communications*. 2011. 182: p. 898-911.
Brown, W.M., Kohlmeyer, A., Plimpton, S.J., Tharrington, A.N. Implementing Molecular Dynamics on Hybrid High Performance
Computers - Particle-Particle Particle-Mesh. Computer Physics Communications*. 2012. 183: p. 449-459.
13. Spinodal Dewetting in Liquid Crystal Layers
Science Team: Trung Dac Nguyen, Jan-Michael Carrillo, Mike Matheson, W. Michael Brown, (ORNL*)
13
Problem: How/why do defects form in liquid
crystal layers?
• Not possible to study on previous Jaguar
supercomputer with software available
unless the machine was dedicated to the
problem
Innovation: Coarse-grain simulation model
with high arithmetic intensity/vector potential,
GPU algorithms
Result: > 7X performance gain (1S CPU+GPU
vs 2S CPU); First simulations at scale with
molecular detail
Nguyen, T. D., Carrillo, J.-M. Y., Matheson, M. A.,
Brown, W.M., Rupture Mechanism of liquid
crystal thin films realized by large-scale
molecular simulations. Nanoscale*. 2014. 6:
3083-96.
14. Icephobic Surfaces
Science Team: Masako Yamada, et al. (GE* Global Research)
14
Problem: Probe mechanism for water droplet
freezing on a surface with molecular detail
Innovation: New 3-body potential for water
simulation (E. B. Moore, V. Molinero, Nature*
479 (2011) 506–508)
• Coarse-grain model, eliminate all-to-all
communications for electrostatics, larger
timestep
• > 100X Simulation rate speedup from model
• Additional concurrency for 3-body: O(N³) independent force computations vs O(N²) [for N atoms within cutoff radius]
• Additional 2-3X Simulation speedup on XK7
node (1S CPU+GPU) vs 2S Opteron*
Result: Simulation of water freezing
process now typical with hundreds of
nodes.
Brown, W.M., Yamada, M. Implementing Molecular
Dynamics on Hybrid High Performance
Computers – Three-Body Potentials. Computer
Physics Communications*. 2013. 184: p. 2785-2793.
16. Issues with the Programming Model
16
Using CUDA* with C/C++ semantics introduces multiple languages, code paths,
compiler requirements, etc. into a code base.
• Support for Fortran or x86 targets requires proprietary compilers
OpenACC* can potentially address this, but in my opinion, will still require
separate code paths and different algorithms for x86 in general
• GPUs have 10,000s of threads in flight, different performance for atomic
operations, different penalties for thread synchronization, context switching,
etc.
• For example, the algorithms used for molecular dynamics on the GPU would
not perform well on x86 processors.
• For the 3-body water/ice simulation presented here, we triple the number of force
computations on the GPU with redundant computations!
17. Optimizing separate code paths
17
Optimizations for GPU do not necessarily improve
performance on CPUs
• Many production codes supporting GPU acceleration still use the
CPU for many routines and must still support CPU-only
• Intel® Xeon® processors are still improving the performance/
socket
• 85% of all systems on TOP500* use Intel® processors (97% of
new systems)
• Future Intel® many-core processors will be bootable (no
coprocessor necessary)
• Difficult to manage/debug/port software with separate optimization
code paths
[Bar chart: Liquid Crystal Benchmark Simulation Rate (higher is better). Values, in chart order:
1 : (2012) LAMMPS Baseline on 2S AMD* Opteron* 6274 [1600MHz DDR3]
8.38 : (2012) LAMMPS w/ GPU Optimizations on 1S AMD* Opteron* 6274 + Nvidia* Tesla* K20X
2.94 : (2014) LAMMPS w/ GPU Optimizations on 2S Intel® Xeon® E5-2697v3 [2133 MHz DDR4], no coprocessor/GPU
12.43 : (2014) LAMMPS w/ Intel® Xeon Phi™ Coprocessor Optimizations on 2S Intel® Xeon® E5-2697v3 [2133 MHz DDR4], no coprocessor/GPU]
Source:
AMD* & AMD/NVIDIA* results: http://www.nvidia.com/docs/IO/122634/computational-chemistry-benchmarks.pdf
Intel Results: Intel Measured August 2014
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and
MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary.
You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when
combined with other products. See benchmark tests and configurations in the speaker notes. For more information go to http://www.intel.com/performance
19. Motivation
19
Provide a workspace for demonstrating code
modernization strategies with vectorization for
different precision modes, OpenMP*, and
MPI*-3
Support computation offload to Intel®
coprocessors with performance that is
comparable or better than GPU performance
Demonstrate portable performance with a
single code path using standard MD
algorithms with C++/OpenMP*
Provide routines that allow the community to
easily add new functionality by example
[Chart: LAMMPS* Rhodopsin Protein Benchmark, 512K atoms (TACC* Stampede). Simulation time (lower is better) vs. nodes (1-32) for: 2S Intel® Xeon® Processor E5-2680; 2S Intel® Xeon® Processor E5-2680 + Intel® Xeon Phi™ Coprocessor SE10P; 2S Intel® Xeon® Processor E5-2680 + Nvidia* Tesla K20m.]
Stampede Configuration: HT Off, 32GB DDR3-1600MHz, PCIe2 (GPU), PCIe2 (Intel® Coprocessor), FDR 56 Gb/s, MPSS 3.3, MVAPICH 2.0b
Performance improvements
with Intel® package still
significant with 1000 atoms
per CPU core
Source: Intel Measured August 2014
20. 20
Current Optimizations in Intel Package
Data alignment
Support for mixed and single precision modes in addition to double
SIMD directives to allow compiler vectorization for routines with data
dependencies
Modifications to conditional branches to better support compiler vectorization for
both coprocessors and Intel® Advanced Vector Extensions (Intel® AVX)
Neighbor-list padding to prevent execution of vector remainder code
Offload directives to manage data allocation, transfer, and concurrent
computation on the coprocessor
21. 21
Advantages of Intel® Package vs GPU Package (1)
• Same code for routines run on the CPU and coprocessor (with or without offload)
• Optimizations for Intel® Xeon Phi™ coprocessors resulted in faster performance on Intel® Xeon®
processors (up to 3.5X)
• GPU package uses different algorithms and different code/language
• Support for both ‘newton’ settings allows for more flexibility for new force-fields
• Improved flexibility for heterogeneous calculations
• Intel® coprocessor offload not limited to 16 MPI* tasks on CPU (CUDA*-MPS limitation)
• Intel® package supports OpenMP* with multiple threads on the CPU (GPU package does not use
OpenMP)
• MPI* tasks sharing coprocessor are able to get exclusive core affinity
22. 22
Advantages of Intel® Package vs GPU Package (2)
• More options for overlap of MPI* communications and computation
• Build process is simpler and does not require building a separate library for
coprocessor routines
• One compiler/Makefile for everything
• Precision mode (single, mixed, or double) can be switched at run-time without
rebuilding
• Package written in standard C++ with OpenMP*
• Offload directives used for the coprocessor
23. Organic Solar Cells
Science Team: Jan-Michael Y Carrillo, Rajeev Kumar, Monojoy Goswami, S. Michael Kilbey II, Bobby G Sumpter (ORNL*/
UT*)
23
Problem: Predictive simulation of active
layer morphology and molecular alignment
based on blend composition
Innovation: Code Modernization/HW
Result: With code modernization and advanced
HPC resources, we have been the first to perform
simulations at scales that match experiment.
• Left: Up to 1.9X with use of a coprocessor
• Simulations include all of the statistics and I/O
(about 10% of run time) from the production runs
• Significant potential for advanced multiscale
simulation models with coprocessors…
[Chart: OPV Simulation, 1.77M atoms (Stampede). Simulation time (lower is better) vs. nodes (2-64) for: 2S Intel® Xeon® Processor E5-2680 (Baseline); 2S Intel® Xeon® Processor E5-2680 (Intel® Package); 2S Intel® Xeon® Processor E5-2680 + Intel® Xeon Phi™ SE10P; 2S Intel® Xeon® Processor E5-2680 + Nvidia* Tesla* K20m. Inset: coarse-grained MD snapshot showing morphology and phase segregation.]
Carrillo, J.-M. Y., Kumar, R., Goswami, G., Sumpter, B., Brown, W.M., New Insights
into Dynamics and Morphology of P3HT:PCBM Active Layers in Bulk
Heterojunctions. Physical Chemistry Chemical Physics. 2013. 15: p. 17873-17882.
Source: Michael Brown
24. LAMMPS performance with current processors
24
[Bar charts: simulation rate (higher is better), per system, for four configurations: Baseline (CPU only) / Intel® Package (CPU only) / Offload to GPU / Offload to Intel® Xeon Phi™ Coprocessor.
LAMMPS Liquid Crystal Benchmark, 524K atoms:
2S Intel® Xeon® Processor E5-2680 / Nvidia* Tesla* K20m / Intel® Xeon Phi™ Coprocessor SE10P†: 1.00 / 4.68 / 7.64 / 7.12
2S Intel® Xeon® Processor E5-2697v2 / Nvidia* Tesla* K40c / Intel® Xeon Phi™ Coprocessor 7120A‡: 1.92 / 6.58 / 8.66 / 9.71
2S Intel® Xeon® Processor E5-2697v3 / GPU HW not available / Intel® Xeon Phi™ Coprocessor 7120A‡: 2.30 / 10.36 / n/a / 12.44
LAMMPS Rhodopsin Protein Benchmark, 512K atoms:
2S Intel® Xeon® Processor E5-2680 / Nvidia* Tesla* K20m / Intel® Xeon Phi™ Coprocessor SE10P†: 1.00 / 1.21 / 2.30 / 2.16
2S Intel® Xeon® Processor E5-2697v2 / Nvidia* Tesla* K40c / Intel® Xeon Phi™ Coprocessor 7120A‡: 1.51 / 1.82 / 2.68 / 2.70
2S Intel® Xeon® Processor E5-2697v3 / GPU HW not available / Intel® Xeon Phi™ Coprocessor 7120A‡: 1.82 / 2.31 / n/a / 3.06]
26. Available/Ongoing/Future Improvements
26
Compiler vectorization
• Significant advances in compiler
vectorization using the Intel®
Composer XE 2015.
• New: Vector variant functions that
allow explicit vector coding for small
subroutines
• Advanced compiler optimization
reports with Intel® Composer XE
2015 and runtime analysis with
Intel® VTune™ Amplifier XE
MPI Performance in symmetric modes
• Significant performance
improvements with Intel® MPI* 5.
• Future many-core chips will have
bootable options and also options
for integrated fabric
27. Unveiling Details of Knights Landing
(Next Generation Intel® Xeon Phi™ Products)
Compute: Energy-efficient IA cores2
§ Microarchitecture enhanced for HPC3
§ 3X Single Thread Performance vs Knights Corner4
§ Intel Xeon Processor Binary Compatible5
§ 1/3X the Space6
§ 5X Power Efficiency6
Platform Memory: DDR4 Bandwidth and Capacity Comparable to Intel® Xeon® Processors
On-Package Memory (jointly developed with Micron Technology):
§ up to 16GB at launch
§ 5X Bandwidth vs DDR47
Intel® Silvermont Arch. Enhanced for HPC
Integrated Fabric
Processor Package (Conceptual - Not Actual Package Layout)
Designed using Intel's cutting-edge 14nm Transistor Technology
Not bound by "offloading" bottlenecks: Standalone CPU or PCIe Coprocessor
Common instruction set architecture: Intel® Advanced Vector Extensions 512
All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice. 1Over 3 Teraflops of peak theoretical double-precision performance is preliminary and based on current expectations of cores, clock frequency and floating point operations per cycle. FLOPS = cores x clock frequency x floating-point operations per second per cycle. 2Modified version of Intel® Silvermont microarchitecture currently found in Intel® Atom™ processors. 3Modifications include AVX512 and 4 threads/core support. 4Projected peak theoretical single-thread performance relative to 1st Generation Intel® Xeon Phi™ Coprocessor 7120P (formerly codenamed Knights Corner). 5Binary compatible with Intel Xeon processors using the Haswell instruction set (except TSX). 6Projected results based on internal Intel analysis of Knights Landing memory vs Knights Corner (GDDR5). 7Projected result based on internal Intel analysis of the STREAM benchmark using a Knights Landing processor with 16GB of ultra high-bandwidth memory versus DDR4 memory only with all channels populated.
29. 29
Summary
We have demonstrated that important research for a variety of science problems
continues to benefit from hardware and software advances in HPC.
We believe that the Intel® path forward for HPC architecture and software offers
a solution that allows for a simpler code base and reduced software effort
• We are dedicated to providing solutions that allow for portable performance
across a range of different processors for both conventional HPC centers and
those on a path to exascale.
We have demonstrated, in a large production code, that optimizations for Intel®
Xeon Phi™ coprocessors also improve performance on Intel® Xeon® processors
and that the same routine can be used for efficient computations on both.
30. Summary (2)
30
The innovations here also included
reconsideration of the simulation model
• Important to realize that there is often a
choice in the model used, and that in the
past, minimizing computation was
important to performance
• It may be the case that you can exploit
additional arithmetic intensity and
concurrency to increase the accuracy,
time-step, etc. in order to advance
research capabilities
• This has already been the case for
simulation in materials problems
[Plot: computational cost (core-sec/atom-timestep, 10⁻⁶ to 10⁻²) vs. year published (1980-2010) for interatomic potentials, including BOP, CHARMm, Stillinger-Weber, EAM, Tersoff, AIREBO, MEAM, ReaxFF, ReaxFF/C, eFF, COMB, REBO, EIM, and GAP.]
Computational Aspects of Many-body Potentials, S. J. Plimpton and A. P. Thompson, MRS Bulletin*, 37, 513-521 (2012).
31. 31
Acknowledgements
Research by the teams presented here made possible by:
DOE Early Science and ALCC Programs
This research used resources of the Oak Ridge Leadership Computing Facility* at the Oak Ridge National Laboratory*,
which is supported by the Office of Science of the U.S. Department of Energy* under Contract No. DE-AC05-00OR22725.
NSF TACC* Stampede Project: ACI-1134872
The researchers acknowledge the Texas Advanced Computing Center* (TACC) at The University of Texas at Austin* for
providing HPC resources that have contributed to the research results reported within this paper. URL: http://www.tacc.utexas.edu
NSF UT* Beacon Project
This material is based upon work supported by the National Science Foundation under Grant Number 1137097 and by the
University of Tennessee* through the Beacon Project. Any opinions, findings, and conclusions or recommendations
expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science
Foundation* or the University of Tennessee*.
32. 32
Getting Access to Intel® Xeon Phi™ HPC Systems
NSF XSEDE*:
TACC* Stampede, LSU* SuperMIC
https://www.xsede.org/allocations
NSF UT* Beacon:
https://www.nics.tennessee.edu/computing-resources/beacon/allocations
DOE NERSC* Users (Babbage):
https://www.nersc.gov/users/computational-systems/testbeds/babbage/
Purdue* Conte (Purchase available):
https://www.rcac.purdue.edu/compute/conte/