LAMMPS is a classical molecular dynamics code used to simulate the movement of atoms to develop better therapeutics, improve alternative energy devices, and more. The document discusses optimizations to LAMMPS that have enabled simulations on Intel processors to now run up to 7.6X faster and with over 9X better energy efficiency. These achievements are helping the LAMMPS user community overcome barriers in computational modeling.
Intel Core i5 & Core i3 processor-powered HP EliteBooks: A better experience ...Principled Technologies
Workers have a range of tasks to complete and use a number of different applications. Laptops for these users, while providing advantages in mobility, are not always equal in terms of performance, experience, or battery life. We found that Intel Core i5 and Core i3 processor-powered HP EliteBooks provided a number of advantages in performance and application responsiveness over an AMD processor-based HP EliteBook, while also delivering longer battery life. When your organization needs notebooks for a broad range of users performing different yet vital tasks, our testing shows that an Intel processor-powered HP EliteBook could offer a better experience than an AMD processor-based HP EliteBook notebook.
Fault tolerance ease of setup comparison: NEC hardware-based FT vs. software-...Principled Technologies
For enterprise datacenter staff, time is of the essence. While using a software-based FT solution such as VMware vSphere Fault Tolerance is an effective way to eliminate downtime, the five extra steps required to configure every single VM can add up.
In our hands-on tests, setting up a server with eight fault-tolerant virtual machines took only 41 steps on the NEC Express5800/R320d-M4, vs. 60 steps when we used a VMware vSphere Fault Tolerance, a difference of 36.6 percent. With a greater number of VMs per server, this difference would increase. Hardware-based fault tolerance on the NEC Express5800/R320d-M4 also reduced the number of necessary hardware components required by half compared to the software-based based FT approach.
With dozens of servers hosting hundreds of VMs, your IT staff can benefit enormously from the hardware-based fault tolerance that the NEC Express5800/R320d-M4 delivers.
Remote monitoring and management of switches from IBM BNT. Centralized point of administration to improve service delivery while reducing management costs.
Intel Core i5 & Core i3 processor-powered HP EliteBooks: A better experience ...Principled Technologies
Workers have a range of tasks to complete and use a number of different applications. Laptops for these users, while providing advantages in mobility, are not always equal in terms of performance, experience, or battery life. We found that Intel Core i5 and Core i3 processor-powered HP EliteBooks provided a number of advantages in performance and application responsiveness over an AMD processor-based HP EliteBook, while also delivering longer battery life. When your organization needs notebooks for a broad range of users performing different yet vital tasks, our testing shows that an Intel processor-powered HP EliteBook could offer a better experience than an AMD processor-based HP EliteBook notebook.
Fault tolerance ease of setup comparison: NEC hardware-based FT vs. software-...Principled Technologies
For enterprise datacenter staff, time is of the essence. While using a software-based FT solution such as VMware vSphere Fault Tolerance is an effective way to eliminate downtime, the five extra steps required to configure every single VM can add up.
In our hands-on tests, setting up a server with eight fault-tolerant virtual machines took only 41 steps on the NEC Express5800/R320d-M4, vs. 60 steps when we used a VMware vSphere Fault Tolerance, a difference of 36.6 percent. With a greater number of VMs per server, this difference would increase. Hardware-based fault tolerance on the NEC Express5800/R320d-M4 also reduced the number of necessary hardware components required by half compared to the software-based based FT approach.
With dozens of servers hosting hundreds of VMs, your IT staff can benefit enormously from the hardware-based fault tolerance that the NEC Express5800/R320d-M4 delivers.
Remote monitoring and management of switches from IBM BNT. Centralized point of administration to improve service delivery while reducing management costs.
z/OS Small Enhancements - Episode 2013AMarna Walle
This presentation covers small enhancements from older z/OS releases. You might have missed little functions that are helpful, but you never knew existed! The content of each of these z/OS Small Enhancements changes every half year (Episode A and Episode B each year).
This is the updates the Small Enhancements slides, for the second half of 2016. (Remember: 50% is updated every 6 months, so 50% of this is identical to Edition 2016A.) Enjoy.
z/OS Small Enhancements - Episode 2014AMarna Walle
This presentation covers small enhancements from older z/OS releases. You might have missed little functions that are helpful, but you never knew existed! The content of each of these z/OS Small Enhancements changes every half year (Episode A and Episode B each year).
z/OS Small Enhancements - Episode 2016AMarna Walle
This presentation covers small enhancements from older z/OS releases. You might have missed little functions that are helpful, but you never knew existed! The content of each of these z/OS Small Enhancements changes every half year (Episode A and Episode B each year).
Ims13 ims tools ims v13 migration workshop - IMS UG May 2014 Sydney & Melbo...Robert Hain
Together, the IBM IMS Tools Solution Packs and IMS 13 deliver simplification, automation and intelligence, with all the tools needed to support IMS databases now in one package. It doesn’t make sense to run reorganization utilities if your databases do not need to be reorganized. Now you can quickly and easily improve IMS application performance, IMS resource utilization and deliver higher system availability with the end-to-end analysis of IMS transactions. Comprehensive performance reporting and easier interactive analysis determine what happened, what needs fixing and how to fix it – all part of the intelligence and automation of the IMS Tools Performance Solution Pack.
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...Amazon Web Services
Cloud services built on compute-optimized EC2 instances can serve as your next-generation HPC platform. Learn how to utilize the Rescale platform on AWS to meet the ever-increasing demands on compute resources while avoiding costly capex investments. Custom Intel Xeon processors enable you to meet your HPC needs by taking advantage of the newest technologies in the cloud. This session is brought to you by AWS partner, Intel.
Intel® Xeon® Processor E7-8800/4800 v4 EAMG 2.0Intel IT Center
This set of Intel® Xeon® processor E7-8800/4800 v4 family proof points spans several key business segments. The Intel® Xeon® processor E7-8800/4800 v4 product family delivers the horsepower for real-time, high-capacity data analysis that can help businesses derive rapid actionable insights to deliver innovative new services and customer experiences. With high performance, industry’s largest memory, robust reliability, and hardware-enhanced security features, the E7-8800/4800 v4 is optimal for scale-up platforms, delivering rapid in-memory computing for today’s most demanding real-time data and transaction-intensive workloads.
Abstract: Explore the packet I/O data path from a NIC across PCI-Express to cache/memory and understand how to build efficient CPU code for networked applications.
Speaker: Venky Venkatesan, Intel Fellow, Chief Architect – Packet Processing and Networking Applications
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY
HPC DAY 2017 - http://www.hpcday.eu/
Accelerating tomorrow's HPC and AI workflows with Intel Architecture
Atanas Atanasov | HPC solution architect, EMEA region at Intel
hbaseconasia2019 HBase Bucket Cache on Persistent MemoryMichael Stack
Anoop Sam John, Ramkrishna S Vasudevan, and Xu Kai of Intel
Track 1: Internals
https://open.mi.com/conference/hbasecon-asia-2019
THE COMMUNITY EVENT FOR APACHE HBASE™
July 20th, 2019 - Sheraton Hotel, Beijing, China
https://hbase.apache.org/hbaseconasia-2019/
Intel's Data Center & Connected Systems Group and Diane Bryant shares the latest news on the latest Intel Xeon E5v2 family of processors and technologies like Intel Network Builders to enable the re-architecture of the Data Center.
Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...Databricks
Apache Spark is a popular data processing engine designed to execute advanced analytics on very large data sets which are common in today’s enterprise use cases. To enable Spark’s high performance for different workloads (e.g. machine-learning applications), in-memory data storage capabilities are built right in.
However, Spark’s in-memory capabilities are limited by the memory available in the server; it is common for computing resources to be idle during the execution of a Spark job, even though the system’s memory is saturated. To mitigate this limitation, Spark’s distributed architecture can run on a cluster of nodes, thus taking advantage of the memory available across all nodes. While employing additional nodes would solve the server DRAM capacity problem, it does so at an increased cost. Intel(R) Memory Drive Technology is a software-defned memory (SDM) technology, which combined with an Intel(R) Optane(TM) SSD, expands the system’s memory.
This combination of Intel(R) Optane(TM) SSD with Intel Memory Drive Technology alleviates those memory limitations that are inherent to Spark, by making more memory available to the operating system and to Spark jobs, transparently.
Intel® Xeon® Scalable Processors Enabled Applications Marketing GuideIntel IT Center
The Future-Ready Data Center platform is here. Whether you navigate in the High Performance Computing, Enterprise, Cloud, or Communications spheres, you will find an Intel® Xeon® processor that is ready to power your data center now and well into the future. An innovative approach to platform design in the Intel® Xeon® Scalable processor platform unlocks the power of scalable performance for today’s data centers—from the smallest workloads to your most mission-critical applications. Powerful convergence and capabilities across compute, storage, memory, network and security deliver unprecedented scale and highly optimized performance across a broad range of workloads—from high performance computing (HPC) and network functions virtualization, to advanced analytics and artificial intelligence (AI). Many examples here show how our software partner ecosystem has optimized their applications and/or taken advantage of inherent platform enhancements to deliver dramatic performance gains, that can translate into tangible business benefits.
Accelerating Apache Spark with Intel QuickAssist TechnologyDatabricks
Enterprise and cloud data centers are under pressure to continuously expand revenue-generating and value-added services, such as compute intensive and I/O-demanding Big Data solutions, which moves large amounts of data into and out of storage, and sends it across the networked clusters.
A significant amount of time and network bandwidth can be saved when the data is compressed before it is passed between servers, as long as the compression/decompression operations are efficient and require negligible CPU cycles. Intel QuickAssist Technology allows compute-intensive workloads, specifically compression, to be offloaded from the CPU core onto dedicated hardware accelerators. Intel Quick Assist Technology enables developers to create software solutions that leverage compression/decompression acceleration, accessing the technology through APIs in the Intel QuickAssist Software.
This talk provides developers with information on Intel QuickAssist Technology and presents some key use cases to provide background for them to understand how they can take advantage of the hardware-based compression acceleration and performance improvements available with Intel QuickAssist Technology in their Spark applications.
VTune Amplifier XE OpenMP analysis answers on customer’s questions about performance on the language of OpenMP constructs
• The analysis is well-scalable for many-core systems with good balance of tracing and sampling collection technologies
• The OpenMP analysis is “MPI-aware” that is helpful for inner-node hybrid MPI + OpenMP application tuning
• The full feature set is available in VTune Amplifier XE 2018 with Intel
OpenMP and Intel MPI runtimes as a part of Intel® Parallel Studio XE 2018
z/OS Small Enhancements - Episode 2013AMarna Walle
This presentation covers small enhancements from older z/OS releases. You might have missed little functions that are helpful, but you never knew existed! The content of each of these z/OS Small Enhancements changes every half year (Episode A and Episode B each year).
This is the updates the Small Enhancements slides, for the second half of 2016. (Remember: 50% is updated every 6 months, so 50% of this is identical to Edition 2016A.) Enjoy.
z/OS Small Enhancements - Episode 2014AMarna Walle
This presentation covers small enhancements from older z/OS releases. You might have missed little functions that are helpful, but you never knew existed! The content of each of these z/OS Small Enhancements changes every half year (Episode A and Episode B each year).
z/OS Small Enhancements - Episode 2016AMarna Walle
This presentation covers small enhancements from older z/OS releases. You might have missed little functions that are helpful, but you never knew existed! The content of each of these z/OS Small Enhancements changes every half year (Episode A and Episode B each year).
Ims13 ims tools ims v13 migration workshop - IMS UG May 2014 Sydney & Melbo...Robert Hain
Together, the IBM IMS Tools Solution Packs and IMS 13 deliver simplification, automation and intelligence, with all the tools needed to support IMS databases now in one package. It doesn’t make sense to run reorganization utilities if your databases do not need to be reorganized. Now you can quickly and easily improve IMS application performance, IMS resource utilization and deliver higher system availability with the end-to-end analysis of IMS transactions. Comprehensive performance reporting and easier interactive analysis determine what happened, what needs fixing and how to fix it – all part of the intelligence and automation of the IMS Tools Performance Solution Pack.
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...Amazon Web Services
Cloud services built on compute-optimized EC2 instances can serve as your next-generation HPC platform. Learn how to utilize the Rescale platform on AWS to meet the ever-increasing demands on compute resources while avoiding costly capex investments. Custom Intel Xeon processors enable you to meet your HPC needs by taking advantage of the newest technologies in the cloud. This session is brought to you by AWS partner, Intel.
Intel® Xeon® Processor E7-8800/4800 v4 EAMG 2.0Intel IT Center
This set of Intel® Xeon® processor E7-8800/4800 v4 family proof points spans several key business segments. The Intel® Xeon® processor E7-8800/4800 v4 product family delivers the horsepower for real-time, high-capacity data analysis that can help businesses derive rapid actionable insights to deliver innovative new services and customer experiences. With high performance, industry’s largest memory, robust reliability, and hardware-enhanced security features, the E7-8800/4800 v4 is optimal for scale-up platforms, delivering rapid in-memory computing for today’s most demanding real-time data and transaction-intensive workloads.
Abstract: Explore the packet I/O data path from a NIC across PCI-Express to cache/memory and understand how to build efficient CPU code for networked applications.
Speaker: Venky Venkatesan, Intel Fellow, Chief Architect – Packet Processing and Networking Applications
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY
HPC DAY 2017 - http://www.hpcday.eu/
Accelerating tomorrow's HPC and AI workflows with Intel Architecture
Atanas Atanasov | HPC solution architect, EMEA region at Intel
hbaseconasia2019 HBase Bucket Cache on Persistent MemoryMichael Stack
Anoop Sam John, Ramkrishna S Vasudevan, and Xu Kai of Intel
Track 1: Internals
https://open.mi.com/conference/hbasecon-asia-2019
THE COMMUNITY EVENT FOR APACHE HBASE™
July 20th, 2019 - Sheraton Hotel, Beijing, China
https://hbase.apache.org/hbaseconasia-2019/
Intel's Data Center & Connected Systems Group and Diane Bryant shares the latest news on the latest Intel Xeon E5v2 family of processors and technologies like Intel Network Builders to enable the re-architecture of the Data Center.
Optimizing Apache Spark Throughput Using Intel Optane and Intel Memory Drive...Databricks
Apache Spark is a popular data processing engine designed to execute advanced analytics on very large data sets which are common in today’s enterprise use cases. To enable Spark’s high performance for different workloads (e.g. machine-learning applications), in-memory data storage capabilities are built right in.
However, Spark’s in-memory capabilities are limited by the memory available in the server; it is common for computing resources to be idle during the execution of a Spark job, even though the system’s memory is saturated. To mitigate this limitation, Spark’s distributed architecture can run on a cluster of nodes, thus taking advantage of the memory available across all nodes. While employing additional nodes would solve the server DRAM capacity problem, it does so at an increased cost. Intel(R) Memory Drive Technology is a software-defned memory (SDM) technology, which combined with an Intel(R) Optane(TM) SSD, expands the system’s memory.
This combination of Intel(R) Optane(TM) SSD with Intel Memory Drive Technology alleviates those memory limitations that are inherent to Spark, by making more memory available to the operating system and to Spark jobs, transparently.
Intel® Xeon® Scalable Processors Enabled Applications Marketing GuideIntel IT Center
The Future-Ready Data Center platform is here. Whether you navigate in the High Performance Computing, Enterprise, Cloud, or Communications spheres, you will find an Intel® Xeon® processor that is ready to power your data center now and well into the future. An innovative approach to platform design in the Intel® Xeon® Scalable processor platform unlocks the power of scalable performance for today’s data centers—from the smallest workloads to your most mission-critical applications. Powerful convergence and capabilities across compute, storage, memory, network and security deliver unprecedented scale and highly optimized performance across a broad range of workloads—from high performance computing (HPC) and network functions virtualization, to advanced analytics and artificial intelligence (AI). Many examples here show how our software partner ecosystem has optimized their applications and/or taken advantage of inherent platform enhancements to deliver dramatic performance gains, that can translate into tangible business benefits.
Accelerating Apache Spark with Intel QuickAssist TechnologyDatabricks
Enterprise and cloud data centers are under pressure to continuously expand revenue-generating and value-added services, such as compute intensive and I/O-demanding Big Data solutions, which moves large amounts of data into and out of storage, and sends it across the networked clusters.
A significant amount of time and network bandwidth can be saved when the data is compressed before it is passed between servers, as long as the compression/decompression operations are efficient and require negligible CPU cycles. Intel QuickAssist Technology allows compute-intensive workloads, specifically compression, to be offloaded from the CPU core onto dedicated hardware accelerators. Intel Quick Assist Technology enables developers to create software solutions that leverage compression/decompression acceleration, accessing the technology through APIs in the Intel QuickAssist Software.
This talk provides developers with information on Intel QuickAssist Technology and presents some key use cases to provide background for them to understand how they can take advantage of the hardware-based compression acceleration and performance improvements available with Intel QuickAssist Technology in their Spark applications.
VTune Amplifier XE OpenMP analysis answers on customer’s questions about performance on the language of OpenMP constructs
• The analysis is well-scalable for many-core systems with good balance of tracing and sampling collection technologies
• The OpenMP analysis is “MPI-aware” that is helpful for inner-node hybrid MPI + OpenMP application tuning
• The full feature set is available in VTune Amplifier XE 2018 with Intel
OpenMP and Intel MPI runtimes as a part of Intel® Parallel Studio XE 2018
This session was held by Vladimir Brenner, Partner Account Manager, Disruptors & AI, Intel AI at the Dive into H2O: London training on June 17, 2019.
Please find the recording here: https://youtu.be/60o3eyG5OLM
Accelerate Your Apache Spark with Intel Optane DC Persistent MemoryDatabricks
The capacity of data grows rapidly in big data area, more and more memory are consumed either in the computation or holding the intermediate data for analytic jobs. For those memory intensive workloads, end-point users have to scale out the computation cluster or extend memory with storage like HDD or SSD to meet the requirement of computing tasks. For scaling out the cluster, the extra cost from cluster management, operation and maintenance will increase the total cost if the extra CPU resources are not fully utilized. To address the shortcoming above, Intel Optane DC persistent memory (Optane DCPM) breaks the traditional memory/storage hierarchy and scale up the computing server with higher capacity persistent memory. Also it brings higher bandwidth & lower latency than storage like SSD or HDD. And Apache Spark is widely used in the analytics like SQL and Machine Learning on the cloud environment. For cloud environment, low performance of remote data access is typical a stop gap for users especially for some I/O intensive queries. For the ML workload, it's an iterative model which I/O bandwidth is the key to the end-2-end performance. In this talk, we will introduce how to accelerate Spark SQL with OAP (https://github.com/Intel-bigdata/OAP) to accelerate SQL performance on Cloud to archive 8X performance gain and RDD cache to improve K-means performance with 2.5X performance gain leveraging Intel Optane DCPM. Also we will have a deep dive how Optane DCPM for these performance gains.
Speakers: Cheng Xu, Piotr Balcer
AWS Summit Singapore - Make Business Intelligence Scalable and AdaptableAmazon Web Services
Akanksha Bilani, APAC Director, Intel Software
Kapil Bansal, Alliance Head, APJ, Intel
Business, science, and academia are using AI applications — in the data center, the cloud, and at the edge — supported by a broad, growing portfolio of Intel technologies. Come join Kapil Bansal (Alliance Head – APJ) and Akanksha Bilani (APAC Director, Intel Software) to learn how Intel helps make AI initiatives practical and straightforward. Learn from the opportunities AI brings in and be part of the era of the convergence of data and the power of compute.
NFF-GO (YANFF) - Yet Another Network Function FrameworkMichelle Holley
NFF-Go is a framework allows developers to deploy performant cloud-native network functions much faster. NFF-Go internally implements low-level optimizations and can auto-scale to multicores using built-in capabilities to take advantage of Intel® architecture. NFF uses Data Plane Development Kit (DPDK) for efficient input/output (I/O) and Go programming language as a high-level, safe, productive language.
Networks need to incorporate innovative and high-performance packet processing entities to meet the demands of meteoric rise in data coupled with advances in compute capacity and innovative apps. A fully programmable forwarding plane enables network owners to build the network they want and evolve it as the needs change. P4 is a domain specific language for networking and it empowers network builders to craft the functionality they need in a high-level programming language and execute it at line-rate on a variety of devices including the Barefoot Tofino series of Ethernet switches. This talk will give an overview of P4 and go over a couple of use-cases.
Open Source Interactive CPU Preview Rendering with Pixar's Universal Scene De...Intel® Software
Universal Scene Description* (USD) is an open source initiative developed by Pixar for fast, large scale, and universal asset management across multiple programs including Maya, Houdini, and others.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
2. Lammps Coarse-Grain Water Simulation
LAMMPS is a classical molecular dynamics code, and an acronym for
Large-scale Atomic/Molecular Massively Parallel Simulator. It is used to
simulate the movement of atoms to develop better therapeutics, improve
alternative energy devices, develop new materials, and more.
Application: Coarse-grain water simulation with LAMMPS using Stillinger-
Weber potential. More at http://lammps.sandia.gov/
Code: In main LAMMPS repository. Recipe: Available here
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Life Sciences
Ice formation in water droplet
wetting a flat surface with
coarse-grain model in LAMMPS
Image Source: Comput. Phys. Commun., 2013, 184, 2785-2793
2
3. "LAMMPS is an established code in the molecular dynamics simulation community with users from a variety of
disciplines including materials modeling, biology, and many others. Over the past twenty years, the code base
has been refactored and extended with high performing kernels and more complex interatomic and coarse-
grained potentials.
In partnership with Intel, software optimizations focused on several commonly-used LAMMPS models have
enabled simulations on the latest Intel® processors to now run up to 7.6X faster with over 9X the energy
efficiency, compared to LAMMPS runs a year ago on previous Intel processors. Furthermore, using a single
Intel® Xeon Phi™ processor (codenamed Knights Landing), the optimized LAMMPS code now runs up to
1.95X faster with 2.83X better performance per watt, when compared to two Intel® Xeon® E5 v3 processors.
These achievements are enabling the LAMMPS user community to overcome barriers in computational
modeling, enabling new research with larger simulation sizes and longer timescales." Steve Plimpton,
Distinguished Member of Technical Staff, Sandia National Laboratories.”
- Steve Plimpton, Distinguished Member of the Technical Staff,
Sandia National Laboratories
See the LAMMPS Technology Brief here
3
4. vasp*
The Vienna Ab initio Simulation Package (VASP) is a computer program for
atomic scale materials modeling and performs electronic structure calculations
and quantum-mechanical molecular dynamics from first principles.
Application: Vienna Ab initio Simulation Package (VASP)
Code: Available here Recipe: See configuration details. Also, check for future
availability here
Power Data: Total system wall power is measured out-of-band over iPMI
interface, polling the BMC chip every half second. Energy usage is matched to
internally timed code segment to arrive at performance-per-watt estimate.
Image Source: Intel
Material Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others 4
5. Intel embree V2.13
Embree is a collection of high-performance ray tracing kernels, developed at
Intel. The target user of Embree are graphics application engineers that
want to improve the performance of their application by leveraging the
optimized ray tracing kernels of Embree. Embree supports runtime code
selection to choose the traversal and build algorithms that best matches the
instruction set of the CPU.
Application: Path tracer renderer using Embree v2.13, ICC 2016 Update 1,
Intel® SPMD program compiler (Intel® ISPC) 1.9.1, and OptiX* RT libraries.
Code: Available here Recipe: Available here
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Image Source:
Intel
DCC/Visualization
5
6. Industrial standard benchmark that uses Monte Carlo method for pricing
European call options. It pre-generates random numbers then uses them in
all options pricing processes. Used by all financial firms to price derivatives
with multiple dimensions. Uses the stock price, strike price and time as
input streams then creates a call output stream.
Application: Monte Carlo European Options
Code and Recipe: Available here
Monte Carlo European options
benchmark*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Financial Services
Image Source: US gov.
6
7. Black-Scholes benchmark*
Industrial standard benchmark that calculates call and put option price
using the Black-Scholes-Merton Formula. Used by all financial firms to
price derivatives with multiple dimensions. Stock price, strike price and time
are input streams that create call and put as output streams.
Application: Black-Scholes formula
Code and Recipe: Available here
Financial Services
Image Source: US gov
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others 7
8. STAC-A2* BENCHMARK
The STAC-A2 Benchmark suite is the industry standard created by the
financial community to test technology stacks used for compute-intensive
analytic workloads involved in pricing and risk management
Application: Intel Composer XE STAC Pack Rev. H
Code: Available here Recipe: Available here
Image Source: Intel
“STAC” and all STAC names are
trademarks or registered trademarks
of the Securities Technology
Analysis Center LLC.
Financial Services
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others 8
9. Binomial options Pricing
benchmark*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Binomial Tree Option pricing method is an industrial standard
benchmark that calculates call option prices. Used by financial firms
especially for options that involve early exercise clause. Uses stock
price, strike price and time as input streams, then creates a call
output stream.
Application: Binomial Tree Option Pricing
Code and Recipe: Check for availability here
Financial Services
Image Source: Science Direct
9
10. CP2K is a powerful and scalable program for atomistic simulations of a wide
range of systems, including condensed phase, molecular systems and complex
interfaces. CP2K features a wide range of atomistic interaction models including
classical potentials, semi-empirical schemes, Density Functional Theory (DFT),
Hartree-Fock (HF), and post-HF correlation methods such as MP2 and RPA. The
program was a Gordon Bell Finalist in 2015. CP2K is freely available.
Application: CP2K Quantum Chemistry & Solid State Physics Software Package
Code: Available Here Recipe: Check for availability here
Courtesy of ETH Zurich. (url)
Material Sciences Linear Scaling (LS) Density
Function Theory (DFT)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others 10
11. The MILC Code is used to study quantum chromodynamics (QCD), the theory
of the strong interactions of subatomic physics and is written by the MIMD
Lattice Computation (MILC) collaboration.
Application: Trinity MILC provided by NERSC as part of Trinity8 suite
Code: Original NERSC Benchmark code is here; contact Intel for Optimized
Code Recipe: Check for availability here
MILC*Physics - QCD
Image Credit: Brookhaven Lab (BNL)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others 11
12. QPhiX is an optimized solver library for QCD on Intel® Xeon® and Intel® Xeon
Phi™ processors and provides implementation for Dslash operator and CG,
BICGStab and mixed precision solvers for Wilson and Clover improved Wilson
Quarks.
Application: QPhiX Test Benchmark (time_dslash_noqdp), QUDA* (NVIDIA*)
Code: https://github.com/JeffersonLab/qphix Recipe: Follow the instructions in
the download package
QPhix*Physics - QCD
Flux tubes between 3 quarks.
Credit: Dr Derek Leinweber, Centre for the Subatomic structure of Matter
(CSSM) and Department of Physics, University of Adelaide, 5005
Australia
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others 12
13. BAW American options
Approximation benchmark*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Popular analytical method of pricing exchange-traded American
options using quadratic approximation. Uses an underlying asset and
carrying cost rate as key inputs and prices commodity options,
futures and foreign exchange options, precious metals, long-terms
debt and stock indexes with continuous dividend yields. Uses the
stock price, strike price and time as input streams and creates a call
output stream.
Application: Barone-Adesi and Whaley (BAW) American Options
Approximation
Code and Recipe: Available here
Financial Services
Image Sources: Parsiad.azimzadeh, finance.bi.no
13
14. Monte Carlo European options
benchmark*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Industrial standard benchmark that uses Monte Carlo method for
pricing European call options. It pre-generates random numbers then
uses them in all options pricing processes. Used by all financial firms
to price derivatives with multiple dimensions. Uses the stock price,
strike price and time as input streams then creates a call output
stream.
Application: Monte Carlo European Options
Code and Recipe: Available here
Financial Services
Image Source: US gov.
14
15. Black-Scholes benchmark*
Industrial standard benchmark that calculates call and put option price
using the Black-Scholes-Merton Formula. Used by all financial firms to
price derivatives with multiple dimensions. Stock price, strike price and time
are input streams that create call and put as output streams.
Application: Black-Scholes formula
Code and Recipe: Available here
Financial Services
Image Source: US gov
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
15
16. Binomial options Pricing
benchmark*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Binomial Tree Option pricing method is an industrial standard
benchmark that calculates call option prices. Used by financial firms
especially for options that involve early exercise clause. Uses stock
price, strike price and time as input streams, then creates a call
output stream.
Application: Binomial Tree Option Pricing
Code and Recipe: Check for availability here
Financial Services
Image Source: Science Direct
16
17. Amber16* Generalized Born
(Implicit Solvent)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Amber* is a bio related simulation code for DNA, RNA, protein, and other
bio-molecules. Amber has two solvers: Particle Mesh Ewald (PME),
known as Explicit, and Generalized Born (GB), known as Implicit. Amber
is written in Fortran 90 and is mainly MPI*, OpenMP* and Vectorization
parallelized.
Application: Amber 16 PMEMD Implicit
Code: In main Amber GIT repository. Recipe: http://ambermd.org/intel
Life Sciences
Images Source: Intel
17
18. Amber16* PME (explicit solvent)
Amber* is a bio related simulation code for DNA, RNA, protein, and other bio-
molecules. Amber has two solvers: Particle Mesh Ewald (PME), known as
Explicit, and Generalized Born (GB), known as Implicit. Amber is written in
Fortran 90 and is mainly MPI*, OpenMP* and Vectorization parallelized.
Application: Amber 16 PMEMD Explicit
Code: In main Amber GIT repository. Recipe: http://ambermd.org/intel
Life Sciences
Images Source: Intel
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
18
19. NWChem NWPW AIMD
SIMULATION
NWChem is a computational chemistry software package which also includes
quantum chemical and molecular dynamics functionality. It was designed to
run on high-performance parallel supercomputers as well as conventional
workstation clusters. It aims to be scalable both in its ability to treat large
problems efficiently, and in its usage of available parallel computing
resources.
Application: Ab-initio molecular dynamics simulation (NWChem NWPW),
water128 benchmark
Code: Main NWChem repository at https://svn.pnl.gov/svn/nwchem/trunk,
SVN rev 28860. Recipe: Check for availability here
Life Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
19
Images Source: US Govt.; NWChem
20. ROME/sml
ROME (Refinement and Optimization via Machine lEarning for cryo-EM) is one
of the major research software packages from the Dana-Farber Cancer Institute.
ROME is a parallel computing software system dedicated to high-resolution cryo-
EM structure determination and data analysis, implementing advanced machine
learning approaches optimized for HPC clusters. ROME 1.0 introduces SML
(statistical manifold learning)-based deep classification, following MAP-based
(maximum a posteriori) image alignment.
Application: ROME/SML
Code and Recipe: Available here
Images Source: L. Zhang, S. Chen, J. Ruan, J. Wu, A.B. Tong, Q. Yin, Y. Li, L. David, A. Lu, W.L.
Wang, C. Marks, Q. Ouyang, X. Zhang, Y. Mao*, H. Wu*. Cryo-EM structure of the activated NAIP2-
NLRC4 inflammasome reveals nucleated polymerization. Science 350, 404-409 (2015).
Life Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
20
21. RELION (REgularised LIkelihood OptimisatioN) is a stand-alone computer
program that employs an empirical Bayesian approach to refinement of 3D
reconstructions or 2D class averages in Cryo-EM.
Application: RELION 1.4
Code and Recipe: Available here
RELIONLife Sciences
Images Source: http://www.rcsb.org/
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
21
22. Nanoscale Molecular Dynamics
program*
Nanoscale Molecular Dynamics program (NAMD) is a parallel molecular
dynamics code designed for high-performance simulation of large
biomolecular systems. Based on Charm++ parallel objects, NAMD scales to
hundreds of cores for typical simulations and beyond 200,000 cores for the
largest simulations.
Application: NAMD 2.11
Code: http://www.ks.uiuc.edu/Research/namd/
Recipe: Check for availability here
Life Sciences
Image Source: Use approved by NAMD
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
22
23. GROMACS*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
GROMACS (GROningen MAchine for Chemical Simulations ) is a
versatile package to perform classical Molecular Dynamics simulations.
Heavily optimized for most modern platforms and provides extremely
high performance compared to all other MD codes.
Application: GROMACS
Code: Available here
Recipe: All optimizations merged in GROMACS 2016 branch, MKL FFT
Life Sciences
Images Source: Used with permission
23
24. GROMACS*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
GROMACS (GROningen MAchine for Chemical Simulations ) is a
versatile package to perform classical Molecular Dynamics
simulations. Heavily optimized for most modern platforms and
provides extremely high performance compared to all other MD
codes.
Application: GROMACS (Intel® AVX-512 speedup)
Code: Available here
Recipe: All optimizations merged in GROMACS 2016 branch, MKL FFT
Life Sciences
Images Source: Used with permission
24
25. LB3D is a 3D Lattice Boltzmann method kernel developed by Carlos Rosales
of Texas Advanced Computer Center (TACC) and used in multiphase flows
with applications in the multiphase reactors and separation systems.
Application: LBS3D (Advanced support for multiphase flows with large
density and viscosity ratios)
Code: Available here
Recipe: No code changes were required. Recompile with Intel® AVX-512.
Texas advanced computer center
LBS3D*
Manufacturing
Image Source: Used with permission
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
25
26. Openfoam*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
OpenFOAM (for "Open source Field Operation And Manipulation")
is a C++ toolbox for the development of customized numerical
solvers, and pre-/post-processing utilities for the solution of
continuum mechanics problems, including computational fluid
dynamics (CFD).
Application: OpenFOAM
Code: Available here Recipe: Available here
Optimizations: https://github.com/OpenFOAM/OpenFOAM-Intel
Image Source: Intel
Manufacturing
26
27. NASA Overflow*
OVERFLOW is a 3D time marching implicit Navier-Stokes computational
fluid dynamics simulator developed by NASA and used across aerospace
and other industries.
Application: OVERFLOW 2.2L has an extensive feature set supporting
collision detection and modelling with support for thin layer and full viscous
terms.
Code: http://overflow.larc.nasa.gov/
Recipe: No code changes were required. Recompile with Intel® AVX-512.
Manufacturing
UH-60 Black Hawk helicopter and Navier-Stokes detached eddy simulation of a flexible
UH-60 rotor using the OVERFLOW CFD code in forward flight.
Image Source: Neal Chaderjian and Tim Sandstrom, NASA/Ames
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
27
28. LS-DYNA is an advanced general-purpose Multiphysics simulation software
package developed by LSTC. While the package continues to contain more
and more possibilities for the calculation of many complex, real world
problems, its origins and core-competency lie in highly nonlinear transient
dynamic finite element analysis (FEA) using explicit time integration. LS-
DYNA is used by the automobile, aerospace, construction, military,
manufacturing, and bioengineering industries.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
28
Lstc ls-dyna*Manufacturing
Application:
LS-DYNA non-linear
explicit FEA software
Code: Available here.
Recipe NA
RELATED
VIDEO
29. OpenLB*
The OpenLB project provides a C++ package for the implementation of
lattice Boltzmann simulations that is general enough to address a vast
range of problems in computational fluid dynamics. The package is mainly
intended as a programming support for researchers and engineers who
simulate fluid flows by means of a Lattice Boltzmann method.
Application: OpenLB
Code: Available here Recipe: Available here
Image Source: Used with permission
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Manufacturing
29
30. High resolution flow solver on unstructured
meshes (hIfun)*
HiFUN is a general purpose flow solver employing unstructured data based
algorithms and fine tuned to solve typical aerospace applications. The code
has been extensively used for solving a number of problems, over a wide
range of Mach numbers, ranging from airship aerodynamics to aerodynamics
of hypersonic vehicles.
Application: HiFUN
Baseline Code: Proprietary code (2.5.1 beta version), http://sandi.co.in/
Optimized Code: Proprietary code-Pending check-in
Recipe: “mpiifort” OPTIONS=“-xMIC-AVX512 –O3”
Manufacturing
Image Source: SandI Engineering
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
30
31. GE TACOMA*
TACOMA is an explicit Navier-Stokes computational fluid dynamics
software with structured and unstructured grid capability developed by
General Electric*. It has an extensive feature set including multi-grid
methods and is routinely used for the design of turbomachinary at GE.
Application: TACOMA
Code: GE internal in-house proprietary code
Recipe: Code optimized for outer-loop vectorization of key routines
Manufacturing
Left: Canonical compressor simulation, Right: Low pressure turbine simulation
Image Source: Brian Mitchell, GE Global Research
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
31
32. Nucleus for European Modelling
of the Ocean* (NEMO)
Nucleus for European Modelling of the Ocean (NEMO) is an ocean modelling
framework composed of "engines" in an "environment". The "engines" provide
numerical solutions of ocean, sea-ice, tracers and biochemistry equations and
their related physics. The "environment" consists of the pre- and post-processing
tools, the interface to the other components of the Earth System, the user
interface, the computer dependent functions and the documentation of the
system. NEMO allows several ocean related components of the earth system to
work together or separately.
Application: NEMO 3.6
Code: http://www.nemo-ocean.eu/
Recipe: See configuration details or check availability here
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
Climate &
Weather
Image Source: Sediment Dynamics in the Black Sea
32
33. Danish Meteorological Institute
HIROMB-BOOS-Model*
The Danish Meteorological Institute (DMI) institute was founded to make
observations, communicate them to the general public, and to develop
scientific meteorology.
Application: DMI HIROMB-BOOS-Model is a 3D ocean circulation model
code forced by atmospheric meteo-fields from a weather model.
Code: To get access to the code and test cases, please contact DMI
Recipe: Check for availability here
Climate &
Weather
Image Source: DMI
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
33
34. Non-hydrostatic Icosahedral Model (NIM) is developed by NOAA’s Earth
System Research Laboratory. NIM is used for earth system modeling and
weather and climate prediction. NIM is a multi-scale model designed to
improve tropical convective clouds and to extend weather forecasts into
intra-seasonal predictions.
Application: Non-hydrostatic Icosahedral Model
Code: To get access to the code and test cases, please contact NOAA
Recipe: Build and run instructions are similar to those for Intel® Xeon®
processor E5-2697 v4, except build with –xMIC-AVX512 flag and run with
“numactl –m 1” prepended to the command-line.
Non-hydrostatic Icosahedral
Model * (nim)
Climate &
Weather
Image Source: NOAA
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
34
35. MPAS Ocean 4.0*
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
MPAS (Model for Prediction Across Scales) is a suite of programs for
atmosphere, ocean, and other earth-system simulation. LANL is
primarily responsible for the MPAS Ocean (MPAS-O) model. MPAS-O has
demonstrated the ability to accurately reproduce mesoscale activity.
(workload contact: Doug Jacobson, LANL,
jacobsen.douglas@gmail.com)
Application: MPAS-O
Code: Available here Recipe: See the configuration details slide
Image Source: Los Alamos
National Laboratory
Climate &
Weather
35
36. HOMME is the spectral element dynamical core that solves the equations of
motion in the CAM5 atmospheric model, part of the Community Earth System
Model (CESM) jointly developed by NSF and DOE.
Application: Baroclinic instability simulation in a “whole atmosphere” (extending to
lower thermosphere) configuration.
Code: Request access to development branches here.
Recipe: cmake -DADD_Fortran_FLAGS="-O3 -xMIC-AVX512 -fp-model fast"
HOMME atmospheric dynamical core *
Numerical climate Simulation
Climate &
Weather
Image source: Used by
permission.
Global 1/8 degree
simulation run using
CAM5 with the HOMME
spectral element
dynamical core.
Visualization from
http://www.sdav-
scidac.org/29-
highlights/visualization/64-
precipitable-water-vis.html.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
36
37. GNAQPMS*
GNAQPMS is an in-house code from the Institute of Atmospheric Physics (IAP),
China Academy of Sciences (CAS), parallelized with hybrid MPI+OpenMP and
written in Fortran. The application is the global multi-scale chemistry transport
model which can simulate the trace gases including ozone, NOx , CO and main
aerosols including dusts, sea salts, BC, OC, sulfate and nitrate in multi-special
resolutions.
Application: GNAQPMS.
Code: In-house code. To access, please contact tangxiao@mail.iap.ac.cn
Recipe: Please contact tangxiao@mail.iap.ac.cn
Image Source: PRC IAP
Climate &
Weather
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
37
38. Weather & Research Forecast Model*
Numerical Weather Simulation
The WRF Model is a numerical weather prediction system designed to serve
atmospheric research and operational forecasting needs. Currently in
operational use at NCEP, AFWA, NASA, NOAA, etc.
Application: The Weather & Research Forecast Model* (WRF) WRFV3.6.1
Conus12km. Community code is managed by NCAR. CONUS12KM
benchmark is an adhoc industry standard workload and is widely cited.
Code: Available here. (WRF 3.6 & 3.6.1) Recipe: Select Intel MIC
configuration on build. Check for availability here
Climate &
Weather
Image Source: NOAA
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
38
39. Integrated Forecasting System (IFS) is a comprehensive Earth system
modelling software developed by The European Centre for Medium-Range
Weather Forecasts (ECMWF) in collaboration with Météo-France. IFS forms
the basis of all data assimilation and numerical weather prediction activities at
ECMWF.
IFS RAPS14*
Climate &
Weather
Application: IFS RAPS14
Code and Recipe: IFS RAPS14 code
is proprietary and available to
consortium members under license.
Minor source code changes were
applied to some of the key routines to
ensure vectorization. See the
configuration details for code and
recipe information. Check for future
availability here.
Image Source: ECMWF
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
39
40. Parallel Ocean Program*
Parallel Ocean Program (POP) is an ocean circulation model that solves the three-
dimensional primitive equations for ocean dynamics. It consists of the baroclinic
and barotropic solvers that solve 3-D equations explicitly and 2-D surface pressure
implicitly, respectively. It is widely used for oceanography. The code is Open
Source.
Application: POP (Parallel Ocean Program).
Code: http://www2.cesm.ucar.edu/
Recipe: Available here
Climate &
Weather
Image Source: FIO, China
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
40
41. MASNUM WAVE*
MASNUM WAVE model is the 3rd generation surface wave model proposed
early in 1990s in LAGFD (Laboratory of Geophysical Fluid Dynamics) from
FIO. The application is used to simulate and predict the wave process by
solving the wave energy spectrum balance equation and its complicated
characteristic equations in wave-number space, which is written in Fortran
and parallelized with MPI+OMP.
Application: MASNUM WAVE.
Code and Recipe: Available here
Image Source: FIO, China
Climate &
Weather
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
41
42. Trinity Benchmarks - Optimized
cluster (GFLOPS)
Trinity is a set of benchmark programs used as part of the joint NERSC/ACES
NERSC-8/Trinity system procurement.
Code: In main NERSC website. Recipe: Check for availability here
AMG: Parallel algebraic multigrid solver for linear systems
MiniFE: Finite Element mini-app
UMT: 3D, deterministic, multigroup, photon transport code for unstructured
meshes
SNAP: proxy application to model the performance of a modern discrete
ordinates neutral particle transport application
GTC: Gyrokinetic Particle Simulation of Turbulent Transport in Burning
Plasmas
MILC: MIMD Lattice Computation (MILC) collaboration kernel used to study
quantum chromodynamics (QCD)
MiniGhost: Finite Difference mini-app
Material Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
42
43. Trinity Benchmarks - Single Node
Optimized (GFLOPS)
Trinity is a set of benchmark programs used as part of the joint NERSC/ACES
NERSC-8/Trinity system procurement.
Code: In main NERSC website. Recipe: Check for availability here
AMG: Parallel algebraic multigrid solver for linear systems
MiniFE: Finite Element mini-app
UMT: 3D, deterministic, multigroup, photon transport code for unstructured
meshes
SNAP: proxy application to model the performance of a modern discrete
ordinates neutral particle transport application
GTC: Gyrokinetic Particle Simulation of Turbulent Transport in Burning
Plasmas
MILC: MIMD Lattice Computation (MILC) collaboration kernel used to study
quantum chromodynamics (QCD)
MiniGhost: Finite Difference mini-app
Material Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
43
44. CP2K is a powerful and scalable program for atomistic simulations of a wide range of
systems, including condensed phase, molecular systems and complex interfaces.
CP2K features a wide range of atomistic interaction models including classical
potentials, semi-empirical schemes, Density Functional Theory (DFT), Hartree-Fock
(HF), and post-HF correlation methods such as MP2 and RPA. The program was a
Gordon Bell Finalist in 2015. CP2K is freely available.
Application: CP2K Quantum Chemistry & Solid State Physics Software Package
Workload: Linear Scaling DFT used to compute the electronic structure of an
amorphous material with 14K atoms using the DZVP-MOLOPT basis set.
Code: Available Here Recipe: Check for availability here
Courtesy of ETH Zurich. (url)
Material Sciences Linear Scaling Density Function
Theory
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
44
45. CP2K is a powerful and scalable program for atomistic simulations of a wide range of
systems, including condensed phase, molecular systems and complex interfaces.
CP2K features a wide range of atomistic interaction models including classical
potentials, semi-empirical schemes, Density Functional Theory (DFT), Hartree-Fock
(HF), and post-HF correlation methods such as MP2 and RPA. The program was a
Gordon Bell Finalist in 2015. CP2K is freely available.
Application: CP2K Quantum Chemistry & Solid State Physics Software Package
Workload: Random Phase Approximation (RPA) energies using hybrid XC functionals
with exact Hartree-Fock Exchange (HFX).
Code: Available Here Recipe: Check for availability here
Material Sciences Random Phase Approximation
(RPA using exact HFX)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
45
46. BerkeleyGW Package is a set of computer codes that calculates the
quasiparticle properties and the optical responses of a large variety of
materials from bulk periodic crystals to nanostructures such as slabs, wires
and molecules. It is a massively parallel computational package for electron
excited state properties that is based on many-body perturbation theory
employing the ab initio GW and GW plus Bethe-Salpeter equation
methodology. Sigma is the second half of the GW code. It gives the
quasiparticle self-energies and dispersion relation for quasielectron and
quasihole states.
Application: BerkeleyGW Sigma phase of Benzene analysis
Code: http://www.berkeleygw.org
Deslippe, Jack, Georgy Samsonidze, David A. Strubbe, Manish Jain, Marvin L. Cohen, and
Steven G. Louie. "BerkeleyGW: A massively parallel computer package for the calculation
of the quasiparticle and optical properties of materials and nanostructures." Computer
Physics Communications 183, no. 6 (2012): 1269-1289.
Recipe: -xMIC-AVX512 -Ofast –qopenmp
Material Sciences berkeleygw*
Image Source: Used with permission
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
46
47. PWmat is a commercial software using plane wave pseudopotential method for
density functional theory (DFT) material simulations. It is an ab initio code,
meaning it uses initial atomic positions to predict the material properties. PWmat is
developed and optimized by Beijing LongXun Inc. based on the open source code
PEtot (PEtot has a BSD 3-Clause license).
Application: PWmat
Code: Commercial software. Contact LongXun Inc. (support@pwmat.com). PEtot
can be downloaded as a reference (http://cmsn.lbl.gov/html/PEtot/PEtot.html).
Recipe: Evaluation binaries: http://www.pwmat.com/pwmat_performance
(encrypted). Contact LongXun Inc. (support@pwmat.com) for approval at first.
PWmat*Material Sciences
)( o
io
i
i
ii PPAP
ij
jiii PPP
,1
|
iiiii P sincos
ij
jiii
Precond.CG step
Projection *
Orth *
Sub_diag *
Line minimize
nline
Hpsi *
ji Hjih ),( Sub_diag *
iiii HP Hpsi *
MPI_Send & Recv (wf)
Nonlocal & FFT
MPI_Allreduce (mx)
Zgemm
iiii HP
Zpotrf & Zgemm
MPI_Allreduce(mx)
Zheev & Zgemm
MPI_Allreduce(mx)
Zheev & Zgemm
MPI_Allreduce(mx)
MPI_Send & Recv (wf)
Nonlocal & FFT
ji Hjih ),(
Image Source: LongXun Inc.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
47
48. Quantum EsPresso*
AUSURF112 (PWSCF)
Quantum ESPRESSO is an integrated suite of Open-Source computer codes
for electronic-structure calculations and materials modeling at the nanoscale.
It is based on density-functional theory, plane waves, and pseudopotentials.
Application: Quantum ESPRESSO AUSURF112 (QE 5.4)
Code: Available here
Recipe: See the configuration details slide
Image Source: Argonne National Laboratory
Material Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
48
49. Quantum EsPresso* PRACE
GRIR443 (PWSCF)
Quantum ESPRESSO is an integrated suite of Open-Source computer codes
for electronic-structure calculations and materials modeling at the nanoscale.
It is based on density-functional theory, plane waves, and pseudopotentials.
Application: Quantum ESPRESSO 6.0
Code: Available here. Recipe: See the configuration details slide
Material Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
49
50. Quantum EsPresso* PRACE
Test Case A (CP)
Quantum ESPRESSO is an integrated suite of Open-Source computer codes
for electronic-structure calculations and materials modeling at the nanoscale.
It is based on density-functional theory, plane waves, and pseudopotentials.
Application: Quantum ESPRESSO 6.0
Code: Available here. Recipe: See the configuration details slide.
Material Sciences
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
50
51. QPhiX is an optimized solver library for QCD on Intel® Xeon® processors and
Intel® Xeon Phi™ processors and provides implementation for Dslash operator
and CG, BICGStab and mixed precision solvers for Wilson and Clover improved
Wilson Quarks.
Application: QPhiX Test Benchmark (time_dslash_noqdp)
Code: Available here Recipe: Follow the instructions in the download package
QPHIX*Physics - QCD
Flux tubes between 3 quarks.
Credit: Dr Derek Leinweber,
Centre for the Subatomic
structure of Matter (CSSM)
and Department of Physics,
University of Adelaide, 5005
Australia
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
51
52. The MILC Code is used to study quantum chromodynamics (QCD), the theory
of the strong interactions of subatomic physics and is written by the MIMD
Lattice Computation (MILC) collaboration.
Application: Trinity MILC provided by NERSC as part of Trinity8 suite
Code: Original NERSC Benchmark code is here; contact Intel for Optimized
Code Recipe: Check for availability here
MILC*Physics - QCD
Image Credit: Brookhaven Lab (BNL)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
52
53. Cloverleaf*
The CloverLeaf* code investigates the behavior of fluids under high
temperatures and pressures, which potentially cause shock fronts to form. It is
common for hydrocodes to be constructed using one of two formulations –
Lagrangian, in which a mesh is constructed and evolved through time, or
Eulerian, where material flow is calculated relative to a fixed spatial grid.
Application: CloverLeaf
Code: https://github.com/UK-MAC/CloverLeaf
Recipe: make COMPILER=INTEL MPI_COMPILER=mpiifort
C_MPI_COMPILER=mpiicc OPTIONS=“-xMIC-AVX512” C_OPTIONS=“-
xMIC-AVX512”
Physics - Hydrodynamics
Image Source: Intel
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
53
54. Chroma*
The Chroma package supports data-parallel programming constructs for lattice
field theory and in particular lattice QCD. It uses the SciDAC QDP++ data-parallel
programming (in C++) that presents a single high-level code image to the user,
but can generate highly optimized code for many architectural systems including
single node workstations, multi and many-core nodes, clusters of nodes via QMP,
and classic vector computers.
Application: Chroma “hmc”
Code: Available here Recipe: Check for availability here
Physics - QCD
Image Credit: Brookhaven Lab (BNL)
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
54
55. Soft sphere simulation*
Soft Sphere is a 3D Molecular Dynamic simulation of IPE-CAS (Institute of Process
Engineering), China. Using sphere particles to simulate structured molecules and
calculation based on BKS (Beest-Kramers-Santen) Experience Potential Model
allows scientists to fight epidemics like the 2014 Ebola virus.
Application: Soft Sphere Simulation
Code and Recipe: Available here
The simulation of force-driven flow in the nano-scale channel
Image Source: IPE-CAS, China
begin
end
Physics
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
55
56. Petsc – Portable, extensible toolkit for
scientific computation
PETSc – the Portable, Extensible Toolkit for Scientific Computation – is a
suite of data structures and routines for the scalable (parallel) solution of
scientific applications modeled by partial differential equations.
Application: Solution of the incompressible, variable viscosity Stokes
equation in 3d using Q1Q1 elements, using a state-of-the art Schur
complement-based approach robust to large viscosity jumps.
Code: PETSc development code is completely open and available here.
Recipe: “-g -O3 -fp-model fast” and “-xMIC-AVX512” or “-xCORE-AVX2
Image Source: Peter Lichtner, OFM Research
Physics
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
56
57. YASK HPC Stencils, AWP-ODC
kernel
YASK, Yet Another Stencil Kernel, is a framework to facilitate design
exploration and tuning of HPC kernels. One of the stencils included in YASK
is awp-odc, a staggered-grid finite difference scheme used to approximate
the 3D velocity-stress elastodynamic equations:
http://hpgeoc.sdsc.edu/AWPODC. Applications using this stencil simulate the
effect of earthquakes to help evaluate designs for buildings and other at-risk
structures.
Application: YASK, AWP stencil (a single-node proxy for the compute kernel
in the Anelastic Wave Propagation application)
Code and Recipe: Available here
Geophysics
Image Source: USGS
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
57
58. YASK HPC Stencils, iso3dfd
kernel
YASK, Yet Another Stencil Kernel, is a framework to facilitate design
exploration and tuning of HPC kernels. One of the stencils included in YASK
is iso3dfd, a finite-difference code found in seismic imaging software used by
energy-exploration companies to predict the location of oil and gas deposits.
Application: YASK, iso3dfd stencil
Code and Recipe: Available here
Geophysics
Image Source: US Dept. Energy
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
58
59. Seissol Seismic Solver*
SeisSol software simulates wave propagation and dynamic rupture based on the
arbitrary high-order accurate derivative discontinuous Galerkin method (ADER-
DG). Characteristics include tetrahedral meshes to approximate complex 3D model
geometries and rapid model generation use of elastic, viscoelastic and viscoplastic
material to approximate realistic geological subsurface properties. The code is
Open Source.
Application: SeisSol Seismic Solver.
Code: Available here
Recipe: Download code and follow instructions
Geophysics
Images Source: Intel Labs
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
59
60. SPECFEM3D_GLOBE simulates the three-dimensional global and regional
seismic wave propagation based upon the spectral-element method (SEM).
It is a time-step algorithm which simulates the propagation of earth waves
given the initial conditions, mesh coordinates/ details of the earth crust.
Application: specfem3D_globe
Baseline Code: https://geodynamics.org/cig/software/specfem3d_globe/
Optimized Code: Pending check-in. Recipe: Runs out-of-the-box.
Specfem3d_globe*Geophysics
Image Source:
https://geodynamics.org/cig/soft
ware/specfem3d_globe/
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
60
61. ISO3DFD (3D acoustic Isotropic
finite difference)*
Iso3DFD - The Iso-3D 16th order Isotropic kernel is at the heart of RTM
algorithm. It plays a major role on accurate imaging of complex subsurfaces.
This kernel computes the wave propagation used in seismic imaging. The
code is in-house code.
Application: Iso3DFD (3D Acoustic Isotropic Finite Difference)
Code: Check for availability here
Recipe: -O3 -xMIC-AVX512 -fp-model fast –fma –qopenmp -lmemkind
x
y
z
Half Length
(HL) = 8
Energy Industry
Image Source: Intel
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
61
62. Sparse Matrix Vector Multiply*
Sparse matrix-vector multiplication (SpMV) is an important kernel for a
diverse set of applications in which systems with sparse pattern are used,
such as scientific computing, engineering, economic modeling, information
retrieval, oil & gas, weather consulting, animation, aero-spacing,
recommender systems in machine learning, and earthquake prediction.
Application: Sparse Matrix Vector Multiply
Code: Available here. Recipe: Check for availability here
Performance
Image Source: US gov.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
62
63. High Performance Conjugate Gradients
The High Performance Conjugate Gradients (HPCG) Benchmark project is
an effort to create a new metric for ranking HPC systems. HPCG is intended
as a complement to the High Performance LINPACK (HPL) benchmark,
currently used to rank the TOP500 computing systems.
Application: High Performance Conjugate Gradients
Code: Available here. Recipe: Check for availability here
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems,
components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated
purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance *Other names and brands may be claimed as the property of others
63
Performance
Image Source: Univ. Maryland, Baltimore Co.
Description: LAMMPS is distributed by Sandia National Laboratories, a US Department of Energy laboratory. LAMMPS has potentials for solid-state materials (metals, semiconductors) and soft matter (biomolecules, polymers) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale. LAMMPS runs on single processors or in parallel using message-passing techniques and a spatial-decomposition of the simulation domain. The code is designed to be easy to modify or extend with new functionality. LAMMPS is distributed as an open source code under the terms of the GPL.
Highlights: The code had been optimized over the course of 3 years and going on, delivered to the community and available for download from the community site. Link to white paper when available.
Workload: Download from the Protein Data Bank (http://www.rcsb.org/pdb/home/home.do) ID 1AKE
Application: Applied modern program optimization and tested with the benchmark generated by spider.
Workloads:
The DLRF6 case is a wing/body/nacelle/pylon configuration from the AIAA 2nd Drag Prediction Workshop. The geometry is from the German aeronautics agency DLR (Deutsches Zentrum fur Luft-und Raumfahrt). It is a real geometry which was tested in a wind tunnel in Europe. The geometry is freely available. DLR/F6 case size <16GB memory.
The NASrotor case is a generic 3-blade rotor configuration provided by Dr. N. Chaderjian of the NASA Advanced Supercomputing Division at NASA/Ames Research Center. It models an isolated hovering rotor. The geometry is freely available. NASrotor case size = 43GB memory.
Application: Danish Meteorological Institute's Hiromb-Boos-Model single node performance. Test case: BaffinBay_2m, 1 MPI x N OMP. Physics component + bio-geo-chemical module.
Description: Major responsibility is storm surge events warnings for the Danish coasts since 2001. As one of its duties, DMI carries the national responsibility for emergency preparedness and response to events such as storm surges affecting Danish coasts.
Highlights: The Intel® Xeon Phi™ coprocessor laid the groundwork for the improved performance, due to scaling and vector efficiency, of the new Intel® Xeon Phi™ processor. Fortran 90 code. Supports both distributed (MPI*) and shared (OpenMP*) memory parallelization.
Recipe: Expect availability at KNL Launch, 6.20.16
Application: GNAQPMS is the in-house code from CAS IAP (Chinese Academy of Sciences, Institute of Atmosphere Physics), paralleled with hybrid MPI+OpenMP and written in Fortran. It is the global multi-scale chemistry transport model based the Nested Air Quality Prediction Modeling System(NAQPMS) from IAP and Global Environment Atmospheric Transport Model(GEATM). GNAQPMS can simulate the trace gases including ozone, NOx , CO and main aerosols including dusts, sea salts, BC, OC, sulfate and nitrate in multi-special resolutions.
Disable the result output and we benchmark the performance by the wall time of computation.
Application:
Description:
Highlights: The code had been optimized over the course of 3 years and going on, delivered to the community and available for download from the community site.
IFS has been in development since the early 90s. It is a highly parallel application written in FORTRAN with a mix of MPI and OpenMP.
Application: Disable the result output and the wall time of computation is performance benchmark. For this app, the result output is finished by 1st rank collecting data from other ranks.
Description: Add additional details here.
Highlights: Add additional details here.
Problem sizes selected to maximize performance per node with two restrictions:
(1) The minimum problem size per node should be such that the full problem
can still fit on to a cluster of 9K nodes as the supercomputers mentioned
previously will have at most 9K Knights Landing processor nodes.
(2) Problem size selected needs to be large enough where data is coming from
memory and not from core caches so that we can study the effects of memory
subsystem.
Results on Xeon Phi 7250 uses best cluster mode, best memory mode, optimal threads per core, optimal problem size for each workload
Results on two socket Xeon processor E5-2697 v4 uses optimal problem size with 1 thread per core
Problem sizes selected to maximize performance per node with two restrictions:
(1) The minimum problem size per node should be such that the full problem
can still fit on to a cluster of 9K nodes as the supercomputers mentioned
previously will have at most 9K Knights Landing processor nodes.
(2) Problem size selected needs to be large enough where data is coming from
memory and not from core caches so that we can study the effects of memory
subsystem.
Results on Xeon Phi 7250 uses best cluster mode, best memory mode, optimal threads per core, optimal problem size for each workload
Results on two socket Xeon processor E5-2697 v4 uses optimal problem size with 1 thread per core
I. Pictures come from http://www.pwmat.com website and their published paper. All of them were approved to be used here by LongXun Inc.
II. Only 64 cores can be used on 7250 due to the FFT grid size and efficiency.
III. GaAs160’s best performance on KNL should be run under 32 MPI processes and 2 threads per process.
IV. Overall performance enhanced by HBM Cache (up to 80%) and Intel® AVX-512 (up to 25%). Measured by 1. KNL binary run under Flat mode vs Cache mode; 2. KNL binary vs BDW binary under Cache mode.
Scientists make advance in battle against Ebola Chinese scientists have contributed to a new breakthrough in the fight against the Ebola virus after the deadly outbreak in West Africa in March 2014.
image from https://commons.wikimedia.org/wiki/File:Izmit_11-12-99.gif. Public domain-- U.S.G.S.
http://www.sdsc.edu/News%20Items/PR20160209_earthquake_center.html
KNL perf on DDR (w/o MCDRAM) is 5.26 G Pts/sec, i.e., MCDRAM provides 2.3x speedup.
Oil platform image from https://commons.wikimedia.org/wiki/File:PlatformHolly.jpg. Public domain-- U.S. Department of Energy
KNL perf on DDR (w/o MCDRAM) is 3.94 G Pts/sec, i.e., MCDRAM provides 4.8x speedup. Note that perf has not been tuned for the DDR config.