The European Commission recently announced the creation of the European Processor Initiative (EPI), a European consortium to co-design, develop and bring to the market a European low-power microprocessor. EPI will start in 2018 and will develop the first European High Performance Computing (HPC) Systems on Chip (SoC) and accelerators. Both elements will be implemented and validated in a prototype system that will become the basis for a full Exascale machine based on European technology.
This lecture aims to give some food for thought on how current High Performance Computing systems (hardware and software) tend to merge with Big Data systems (Machine Learning, Analytics and Enterprise workloads) so that both kinds of workload can be served by the same clusters.
From weather and climate to seismic imaging to aeronautics, OpenACC sessions featured at GTC20 are helping to facilitate discussions, educate attendees and encourage networking and collaboration.
Sessions cover a broad range of topics: the “Meet the Experts” session enables one-on-one deep dives into using OpenACC to solve specific challenges, posters highlight how OpenACC is being applied to current science applications, and the on-demand tutorial delivers hands-on skill building.
Smart Data Slides: Emerging Hardware Choices for Modern AI Data Management - DATAVERSITY
Leading edge AI applications have always been resource-intensive and known for stretching the limits of conventional (von Neumann architecture) computer performance. Specialized hardware, purpose built to optimize AI applications, is not new. In fact, it should be no surprise that the very first .com internet domain was registered to Symbolics - a company that built the Lisp Machine, a dedicated AI workstation - in 1985. In the last three decades, of course, the performance of conventional computers has improved dramatically with advances in chip density (Moore’s Law) leading to faster processor speeds, memory speeds, and massively parallel architectures. And yet, some applications - like machine vision for real time video analysis and deep machine learning - always need more power.
Participants in this webinar will learn the fundamentals of the three hardware approaches that are receiving significant investments and demonstrating significant promise for AI applications.
- neuromorphic/neurosynaptic architectures (brain-inspired hardware)
- GPUs (graphics processing units, optimized for AI algorithms), and
- quantum computers (based on principles and properties of quantum-mechanics rather than binary logic).
Note - This webinar requires no previous knowledge of hardware or computer architectures.
MIPM PCo kafka SAP Faurecia coinnovation SAP Leonardo - Jose Gascon
Presentation for the opening of the SAP Leonardo Center in Paris, with the latest innovations around MIPM and Digital Transformation from Faurecia Group Information Systems.
OpenACC and Open Hackathons Monthly Highlights: April 2022 - OpenACC
Stay up-to-date on the latest news, events and resources for the OpenACC and Open Hackathon community. This month’s highlights cover upcoming GPU Hackathons and Bootcamps, the call for speakers for the OpenACC and Hackathons 2022 Summit, recent research, new resources and more!
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights cover the upcoming OpenACC Summit, a complete schedule of upcoming events, using OpenACC to optimize structural analysis, new resources and more!
B Kindilien Finding Efficiency In Mach 120408jg - Ipotiwon
Presentation at the 2008 Defense Manufacturer's Conference (DMC), Orlando, FL: The advent of finite-element modeling based systems has ushered in an era of physics-based prediction of machining operations, giving engineers new insight into designing machining strategies. Some technologies are being employed as machining process development tools. Others are being applied by companies following the tool path generation process in the computer-aided manufacturing (CAM) software systems. Further, other software technologies are evolving within the CAM software systems users currently operate, offering dramatic reductions in machining cycle times by affecting air cuts and feed rates. But users still wonder how to apply these various approaches; they puzzle over what approaches work for their shop practices. After this presentation, attendees reported a clearer sense of how strategic changes in machining approaches and implementation of the right technology for a given manufacturing condition can make all the difference.
Deploying Massive Scale Graphs for Realtime Insights - Neo4j
Graph databases have been at the forefront of helping organizations manage and generate insights from data relationships, and applying those insights in real-time to drive competitive advantage. As organizations gain value in deploying graph databases, the data volumes managed are growing exponentially pushing the limits of large-scale in-memory graph processing. Neo4j and IBM Power Systems combined forces to deliver a market leading scalable graph database platform capable of affordably storing and processing graphs of extremely large size and offering real-time insights, using flash and FPGA accelerators. In this session we will cover the use cases driving the need for this extremely scalable platform and how this platform offers an easy to deploy model for extreme scale graph databases.
SIMA AZ: Emerging Information Technology Innovations & Trends 11/15/17 - Mark Goldstein
Mark Goldstein, International Research Center presented a big overview of Emerging Information Technology Innovations & Trends to the Society for Information Management Arizona Chapter (SIM AZ) on 11/15/17 showcasing the latest and greatest emerging technologies and novel tech innovations, highlighting the market and societal transformations underway or anticipated. It covered Advances in Computer Power and Pervasiveness; Internet of Things (IoT) Overview and Ecosystem; Mobility, Augmented Reality and Virtual Reality (AR/VR); Medical Advances Through Informatics; Artificial Intelligence (AI) and Robotics; Big Data, Its Applications and Implications; and Onward into the Future…
With the rise of fog and edge-computing as the basic paradigms for future communication standards such as 6G, new processing requirements are established. On the other hand, new security algorithms appear with the scaling of quantum technology, increasing the complexity of the cryptography applications for IoT devices with a tight standardization timeline. Finally, the integration of satellites as nodes for communication networks includes fault-tolerance and error-correction codes as design parameters.
FPGA-based soft-processors are supported by industry and space agencies as promising candidates to overcome all these challenges, due to their flexibility and power consumption compared to GPUs or multithreading CPUs with co-processors. To optimize these architectures for a wide range of scenarios, common methods and arithmetic functions need to be integrated into the ISA. This talk will show some examples of the RISC-V EL2 core for both classical and post-quantum cryptography and error correction codes, reducing the latency of standardized solutions at the cost of a small cross-section increase while keeping the behavior under radiation effects similar to the original core.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights cover the upcoming OpenACC Summit and GPU Bootcamp, a complete schedule of upcoming events, OpenACC and base language parallelism, FortranCon2020, VASP 6, OmpSs-2@OpenACC version of the ZIPC application, new resources and more!
Presentation by Martin Casaulta, Country Chief Technologist, Hewlett Packard Enterprise (Switzerland), at the conference «Mehr Effizienz in Rechenzentren und Serverräumen» ("More Efficiency in Data Centers and Server Rooms"), 25 September 2018, BERNEXPO, Bern
Stay up-to-date on the latest news, research and resources. This month's edition covers the Georgia Tech Open Hackathon, milestones in OpenACC development, upcoming Open Hackathons and Bootcamps, NVIDIA's developer program, and more!
The rush to the edge and new applications around AI are causing a shift in design strategies toward the highest performance per watt, rather than the highest performance or lowest power.
Invited talk at SSSW'16 (http://sssw.org/2016/?page_id=232) introducing the Fourth Industrial Revolution and discussing how Semantic Web technologies can support this movement. Also a teaser for the upcoming Springer book "Semantic Web for Intelligent Engineering Applications" (http://www.springer.com/us/book/9783319414881).
SCADA in Practice - Accenture Industry X.0 Meetup - Accenture Hungary
On the last afternoon of July we talked about industrial SCADA systems, their evolution, and their future.
We talked about the current market-leading SCADA products and their areas of application, presented a typical SCADA system architecture, and also covered the integration of IT security and SCADA systems. We discussed the SCADA vs. MES vs. Connected platform competition and touched on Digital Twin and Thread systems, in which SCADA can be a very important component.
During the meetup you could see how a SCADA project is built; in fact, using an example, we had an informal discussion about our current solution and the possibilities for developing it further. We also devoted a few sentences to the future of SCADA, in which SCADA appears integrated with AR/VR technologies. This offers new possibilities such as walking through the technology in a virtual environment and seeing real-time data next to the equipment, or having repair instructions appear continuously before our eyes when the equipment fails.
MIPM PCo to Kafka Faurecia SAP co-innovation at Hannover Messe 2017 - Jose Gascon
Co-innovation between Faurecia and SAP in the context of IIoT, capturing process data from manufacturing machines in real time and sending it directly through SAP Leonardo into the Data Lake via the Apache Kafka message broker.
Stay up-to-date on the latest news, events and resources for the OpenACC community. This month’s highlights cover pseudo random number generation, the first-ever MONAI Bootcamp, upcoming GPU Hackathons and Bootcamps, and new resources!
Walmart & IBM Revisit the Linear Road Benchmark - Roger Rea, IBM - Redis Labs
The Linear Road benchmark was devised in 2004 to compare Stream Data Management Systems. Walmart selected Linear Road to compare performance of streaming analytic offerings. IBM implemented the benchmark application using Redis to maintain state, and IBM Streams to handle the incoming events and queries. Walmart had to completely revamp the data drivers and test verification to take advantage of multicore multithreaded servers available today. Tests were run on Microsoft Azure cloud to ensure fair comparison of vendors. Redis and IBM Streams handled nearly 1 billion events in a 3 hour test on a single 16 core Azure node, and 3.8 billion when scaled out to 4 nodes. Come learn about the application and near linear scalability of Redis and IBM Streams.
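The "near linear scalability" claim can be checked with simple arithmetic on the figures quoted above. The following is only a rough sketch: it rounds "nearly 1 billion" up to exactly 1 billion and assumes the 4-node run lasted the same 3 hours as the single-node run, which the abstract does not state. All names are illustrative.

```python
# Rough check of the scaling figures quoted in the abstract.
# Assumptions (not from the source): "nearly 1 billion" is treated as 1.0e9,
# and the 4-node run is assumed to have the same 3-hour duration.
SINGLE_NODE_EVENTS = 1.0e9
FOUR_NODE_EVENTS = 3.8e9
NODES = 4
TEST_SECONDS = 3 * 3600


def scaling_efficiency(base_events: float, scaled_events: float, nodes: int) -> float:
    """Fraction of ideal linear speedup achieved when scaling out to `nodes` nodes."""
    return scaled_events / (base_events * nodes)


def events_per_second(events: float, seconds: float) -> float:
    """Average sustained event rate over the whole test."""
    return events / seconds


if __name__ == "__main__":
    eff = scaling_efficiency(SINGLE_NODE_EVENTS, FOUR_NODE_EVENTS, NODES)
    rate = events_per_second(SINGLE_NODE_EVENTS, TEST_SECONDS)
    print(f"scaling efficiency on {NODES} nodes: {eff:.0%}")
    print(f"single-node average rate: {rate:,.0f} events/s")
```

Under these assumptions the 4-node result is about 95% of ideal linear scaling.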
Tool-Driven Technology Transfer in Software Engineering - Heiko Koziolek
This talk presents the tool-driven technology transfer process that ABB Corporate Research applies in selected software engineering university collaborations. As an example, we have created an add-in to a popular UML tool and developed the tooling in close interaction with the target users. Centering the technology transfer around tool implementations brings many benefits, such as the need to make conceptual contributions applicable and the ability to quickly benefit from the new concepts. A challenge to this form of technology transfer is the long-term commitment to the maintenance of the tooling, which we try to address by creating an open developer community. Tool-driven technology transfer projects have proven to be a valuable instrument for bringing advanced software engineering technologies into our organization.
OpenACC and Open Hackathons Monthly Highlights: September 2022.pptx - OpenACC
Stay up-to-date on the latest news, research and resources. This month's edition covers the Princeton GPU Hackathon, OpenACC at SC22, updates from GNU Tools Cauldron, the upcoming UK DPU Hackathon, relevant research and more!
Similar to Gschwind - Software and System Co-Optimization in the Era of Heterogeneous Computing (20)
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on the binary heap data structure. It is similar to selection sort: we first find the minimum element and place it at the beginning, then repeat the same process for the remaining elements.
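As an illustration of the heapify and build-heap steps named in the title, here is a minimal sketch of in-place heap sort on a Python list (standing in for a dynamic array). Note that it uses the common max-heap variant, which repeatedly moves the largest element to the end. This mirrors the min-first description above and also yields ascending order. The function names are illustrative and not taken from the presentation.

```python
def heapify(arr: list, n: int, i: int) -> None:
    """Sift arr[i] down until the subtree rooted at i is a max-heap.

    Only the first n elements of arr are treated as part of the heap.
    """
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and arr[left] > arr[largest]:
            largest = left
        if right < n and arr[right] > arr[largest]:
            largest = right
        if largest == i:
            return
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest


def build_heap(arr: list) -> None:
    """Turn arr into a max-heap by sifting down every non-leaf node, bottom up."""
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)


def heap_sort(arr: list) -> list:
    """Sort arr in place in ascending order and return it."""
    build_heap(arr)
    for end in range(len(arr) - 1, 0, -1):
        # Move the current maximum to its final position, then restore the heap
        # property on the shrunken prefix arr[:end].
        arr[0], arr[end] = arr[end], arr[0]
        heapify(arr, end, 0)
    return arr


if __name__ == "__main__":
    print(heap_sort([5, 3, 8, 1, 9, 2]))
```

Building the heap takes O(n) time and each of the n extractions costs O(log n), so the whole sort runs in O(n log n) with no extra array.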
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS... - ssuser7dcef0
Power plants release a large amount of water vapor into the atmosphere through the stack. The flue gas can be a potential source for obtaining much needed cooling water for a power plant. If a power plant could recover and reuse a portion of this moisture, it could reduce its total cooling water intake requirement. One of the most practical ways to recover water from flue gas is to use a condensing heat exchanger. The power plant could also recover latent heat due to condensation as well as sensible heat due to lowering the flue gas exit temperature. Additionally, harmful acids released from the stack can be reduced in a condensing heat exchanger by acid condensation.
Condensation of vapors in flue gas is a complicated phenomenon since heat and mass transfer of water vapor and various acids simultaneously occur in the presence of noncondensable gases such as nitrogen and oxygen. Design of a condenser depends on the knowledge and understanding of the heat and mass transfer processes. A computer program for numerical simulations of water (H2O) and sulfuric acid (H2SO4) condensation in a flue gas condensing heat exchanger was developed using MATLAB. Governing equations based on mass and energy balances for the system were derived to predict variables such as flue gas exit temperature, cooling water outlet temperature, mole fraction and condensation rates of water and sulfuric acid vapors. The equations were solved using an iterative solution technique with calculations of heat and mass transfer coefficients and physical properties.
Immunizing Image Classifiers Against Localized Adversary Attacks - gerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks (CNNs), to adversarial attacks and presents a proactive training technique designed to counter them. We introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations. When combined with 3D convolution and deep curriculum learning optimization (CLO), it significantly improves the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10 and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing accuracy improvements over previous techniques. The results indicate that the combination of the volumetric input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating adversary training.
Cosmetic shop management system project report.pdf - Kamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of the general workflow and administration process of the shop. The main processes of the system focus on the customer's request, where the system is able to search for the most appropriate products and deliver them to the customers. It should help the employees to quickly identify the list of cosmetic products that have reached the minimum quantity and also keep track of the expiry date of each cosmetic product. It should help the employees to find the rack number in which the product is placed. It also offers a faster and more efficient way of working.
CW RADAR, FMCW RADAR, FMCW ALTIMETER, AND THEIR PARAMETERS - veerababupersonal22
This presentation covers CW radar and FMCW radar, range measurement, the IF amplifier, and the FMCW altimeter. The CW radar operates using continuous wave transmission, while the FMCW radar employs frequency-modulated continuous wave technology. Range measurement is a crucial aspect of radar systems, providing information about the distance to a target. The IF amplifier plays a key role in signal processing, amplifying intermediate frequency signals for further analysis. The FMCW altimeter utilizes frequency-modulated continuous wave technology to accurately measure altitude above a reference point.
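To make the range measurement concrete, the sketch below applies the standard relation for a linear (sawtooth) FMCW sweep, R = c · f_b · T / (2 · B), where f_b is the measured beat frequency, T the sweep duration, and B the sweep bandwidth. It assumes a stationary target (no Doppler shift). The function names and example parameters are illustrative and do not come from the presentation.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fmcw_range(beat_frequency_hz: float, sweep_bandwidth_hz: float, sweep_time_s: float) -> float:
    """Target range in metres for a linear FMCW sweep, assuming a stationary target.

    The sweep slope is B / T, the round-trip delay is f_b / slope,
    and the range is c * delay / 2.
    """
    if sweep_bandwidth_hz <= 0 or sweep_time_s <= 0:
        raise ValueError("sweep bandwidth and sweep time must be positive")
    return SPEED_OF_LIGHT * beat_frequency_hz * sweep_time_s / (2.0 * sweep_bandwidth_hz)


def range_resolution(sweep_bandwidth_hz: float) -> float:
    """Smallest separable range difference in metres, c / (2 * B)."""
    if sweep_bandwidth_hz <= 0:
        raise ValueError("sweep bandwidth must be positive")
    return SPEED_OF_LIGHT / (2.0 * sweep_bandwidth_hz)


if __name__ == "__main__":
    # Hypothetical example: 150 MHz sweep over 1 ms, 100 kHz measured beat frequency.
    print(f"range: {fmcw_range(100e3, 150e6, 1e-3):.2f} m")
    print(f"resolution: {range_resolution(150e6):.3f} m")
```

An FMCW altimeter applies the same relation with the ground as the target, so the computed range is the altitude.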
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL) - MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL), a government-owned company of Bangladesh Chemical Industries Corporation under the Ministry of Industries.
Forklift Classes Overview by Intella Parts - Intella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
2. M. Gschwind. Software and System Co-Optimization in the Era of Heterogeneous Computing
21st Asia and South Pacific Design Automation Conference (ASP-DAC 2016), Macao, January 2016
Recent Power History
                     POWER5      POWER6      POWER7           POWER7+          POWER8
Year                 2004        2007        2010             2012
Technology           130nm SOI   65nm SOI    45nm SOI eDRAM   32nm SOI eDRAM   22nm SOI eDRAM
Compute
  Cores              2           2           8                8                12
  Threads            SMT2        SMT2        SMT4             SMT4             SMT8
Caching
  On-chip            1.9MB       8MB         2 + 32MB         2 + 80MB         6 + 96MB
  Off-chip           36MB        32MB        None             None             128MB
Bandwidth
  Sust. Mem.         15GB/s      30GB/s      100GB/s          100GB/s          230GB/s
  Peak I/O           6GB/s       20GB/s      40GB/s           40GB/s           64GB/s
POWER8 Chip Overview
▪ Up to 2.5x socket perf vs. P7+
▪ 649mm² die size, 4.2B transistors
▪ 12 high-performance cores
▪ Large Caches
– L2: 512KB private SRAM per core
– L3: 96MB shared eDRAM w/ 8MB “fast access” partition per core
– L4: Up to 128MB, located on memory buffer chip
▪ 4 High Speed I/O interfaces
– Memory, On-Node SMP, Off-Node SMP, PCIe Gen3
[Figure: POWER8 chip floorplan. Recoverable labels: cores with private L2, L3 quadrants, memory controllers (MC) for Mem0-3 and Mem4-7, On-Node SMP, Off-Node SMP, PCI, Acc, Fabric, Pervasive]
4. POWER8 Technology
▪ 22nm SOI
▪ 15 layer BEOL: 5-1x, 2-2x, 3-4x, 3-8x, 2-UTM
▪ 3-Vt thin-oxide logic transistors for power optimization
▪ Multiple thick-oxide transistors (for I/O and analog support)
▪ 3 app-optimized SRAM cells:
– 0.160µm² 6T perf-oriented
– 0.144µm² 6T perf-density balance for directories / L2
– 0.192µm² 8T multi-port
▪ Technology eDRAM cell: 0.026µm²
[Figure: BEOL metal stack cross-section with layer groups labeled 5-1x, 2-2x, 3-4x, 3-8x, UTM]
5. Large Block Structured Synthesis
▪ Enhanced process which included:
– Structured dataflow
– Congestion-aware stdcell placement
– Embedded “hard” IP (e.g. arrays, regfiles, complex custom cells)
▪ 30% fewer unique blocks vs. POWER7
▪ Improvements in block power and total design area
– 15% area reduction
[Figure: layout with the IFU and VSU labeled]
6. POWER8 Core: Backbone of big data computing systems
▪ Enhanced Micro-Architecture
▪ Increased Execution Bandwidth
▪ SMT 8
▪ Transactional Memory
▪ Vector/Scalar Unit
▪ High-performance Integer & FP Vector Processor
▪ Increased Performance for Data Rich Applications
[Figure: POWER8 core floorplan with units labeled VSU, FXU, IFU, DFU, ISU, LSU, PC]
7. Big Bandwidth for Big Data
Combined I/O Bandwidth = 7.6Tb/s
[Figure: POWER8 processors attached to memory buffers over DMI links, with PCI, ON-NODE SMP, and NODE-to-NODE links]
Putting it all together: the memory links, the on- and off-node SMP links, and PCIe add up to 7.6Tb/s of chip I/O bandwidth.
9. Tectonic Shifts in Nature of Workloads
▪ Graph Analytics
– Security, Fraud Detection
– Genome Analysis
– Social Network Analytics
– Knowledge Graphs
▪ Machine Learning
– Watson Health
– Watson Analytics
– Robotics
– Education
▪ Video, Speech Analytics
– Multimodal Analytics: object recognition, complex video analytics, correlation and stitching
▪ Automating the World
▪ Learn, Predict, Ingest
▪ Understanding the World
10. General-Purpose CPU Design
▪ Many competing requirements
– Branchy control-flow dominated code
– Code with unpredictable data access patterns
– Operating system code
– Multiple separate applications
– Multiple virtual machines at a time
▪ Result in low efficiency for any one metric
– Flops / area
– Integer ops / area
– Predictions / area
– …
[Figure: general-purpose core block diagram (VSU, FXU, IFU, DFU, ISU, LSU, PC; decode, I$, D$, RF, int, SIMD), annotated with: out-of-order execution, register renaming, branch prediction & prefetch, robust virtual memory support]
11. Heterogeneous, Workload-optimized Acceleration
▪ On-chip integrated accelerators (SoC design)
– Compute accelerator (Cell BE)
– Compression (P7+)
– Encryption (P7+)
– Random number generation (P7+)
– …
▪ SoC design offers highest integration, but…
– Requires new chip design for accelerator
– Long time to market
– Requires very high volumes
[Figure: Cell BE and POWER7+ block diagrams. Recoverable labels: decode, local store, MMU, SIMD]
12. CAPI – Coherent Accelerator Processor Interface
▪ Open infrastructure for off-chip, memory-coherent accelerators
– Modular interface
– Third-party high value-add components
▪ Standardized, layered protocol
– architectural interface
– functional protocol
– PCIe signaling protocol
▪ Create workload-optimized innovative solutions
– Faster time to market
– Lower bar to entry
– Variety of implementation options
• FPGAs, ASICs
[Figure: CAPI attachment diagram. Recoverable labels: POWER8, Coherence Bus, proxy, PSL*]
* Power Service Layer
13. Heterogeneous System Challenges
▪ The 4 ‘P’s of System Design
– Programmer Productivity
– Realize accelerator Performance benefits
– Portability: Investment protection for applications
– Partitioning for multi-user systems: processes, partitions
14. Application Acceleration
▪ Fine-grained data sharing: coherent, shared memory
▪ Accelerator-initiated data accesses/transfers: coherent, shared memory
▪ Pointer identity: shared addressing
▪ Flexible synchronization: symmetric, programmable interfaces
15. CAPI Acceleration overcomes Device Driver Deceleration
Typical I/O Model Flow:
DD Call → Copy or Pin Source Data → MMIO Notify Accelerator → Acceleration → Poll / Interrupt Completion → Copy or Unpin Result Data → Return from DD Completion
– Overhead steps as labeled on the slide: 300 Instructions, 10,000 Instructions, 3,000 Instructions, 1,000 Instructions, 1,000 Instructions
– Acceleration: application dependent, but equal to below
– 7.9 µs and 4.9 µs: total ~13 µs for data prep
Flow with Coherent Model:
Shared Mem Notify Accelerator → Acceleration → Shared Memory Completion
– Overhead steps as labeled on the slide: 100 Instructions, 400 Instructions
– Acceleration: application dependent, but equal to above
– 0.3 µs and 0.06 µs: total ~0.36 µs
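The two overhead totals above can be compared with a short calculation. This is only an illustrative sketch: it uses the timings quoted on the slide (7.9 µs and 4.9 µs for the typical I/O model, 0.3 µs and 0.06 µs for the coherent model), and the function and variable names are assumptions introduced here, not part of CAPI.

```python
def overhead_reduction(typical_us, coherent_us):
    """Return (typical total, coherent total, ratio) for per-call overheads in microseconds."""
    typical_total = sum(typical_us)
    coherent_total = sum(coherent_us)
    return typical_total, coherent_total, typical_total / coherent_total


# Timings transcribed from the slide. The acceleration time itself is left out
# because the slide states that it is the same in both flows.
typical, coherent, ratio = overhead_reduction([7.9, 4.9], [0.3, 0.06])
print(f"typical: {typical:.2f} us, coherent: {coherent:.2f} us, ratio: {ratio:.1f}x")
```

With these figures the coherent model removes roughly 12.4 µs of software overhead per call, a reduction of about 35x.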
16. Power GPU acceleration
▪ CUDA programming environment supported under LE Linux
– GPU as compute accelerator
– Offload regular compute-intensive application portions to GPU
▪ Advances in GPU Performance and Programmability
– UVA – Unified Virtual Addressing
– UM – Unified Memory
▪ Ongoing collaboration to co-optimize systems
– Next generation hardware enhancement
17. Relating content via concept graphs
Watson Concept Insights
Sample content:
– @joe I wonder how do I use bitcoins
– Apple’s digital wallet, if widely adopted, could usher in a new era of ease and convenience.
– Icahn, who months ago called on eBay to spin off the lucrative online and mobile payment service, continues to believe that the payments field must be consolidated, either by PayPal buying up smaller rivals or by merging with another major player.
– Job ad: Lead Front-end Developer - Virtual Currency Exchange
Conceptual reasoning allows us to relate content that is hard to connect otherwise
18. Watson Concept Insights Workload Pipeline
▪ BASIC INGESTION (only once per life of document)
– Documents
– Constituency parse tree
– Wikifier (graph linker)
▪ CONCEPTUAL INDEXING (currently once per life of document, maybe 3-5 times in future)
– Retrieve concept vectors from cache (assumes static graph!!!)
– Merge concept vectors to form document vector
– Reverse conceptual index (Cassandra)
▪ USER INTERFACE
▪ QUERY RUNTIME (hopefully millions of queries!!!)
– Compute related concepts kernel
– Retrieve related documents
[Figure labels for hardware: External storage, one CPU socket (twice), CPU]
19. Watson Concept Insights: Compute Performance Comparison (CPU vs. GPU)
▪ Current CPU Execution (*Ivy Bridge)
– N-element Vector → Page Rank Calculations, 5 Iterations (55 sec) → Pareto Normalization (3 sec) → Scoring Combiner (1 sec)
▪ Parallel Execution on GPU (*Nvidia K40), Batched Execution with batch size of 64
– HOST → M Concepts → Init → Page Rank Calculations, 5 Iterations (2.21 sec) → Pareto Normalization (0.032 sec) → Scoring Combiner (0.0048 sec) → HOST
– Additional timings shown on the slide: (0.027 s), (0.016 s)
– N*N Sparse Matrix: loaded only once
▪ CPU: 58 sec vs. GPU: 2.35 sec (25X)
▪ M: Concepts under consideration (28 for the test)
▪ N: Total number of concepts in Corpus (4.7M for Wikipedia)
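The slide does not give the Page Rank formulation used, so the following is only a minimal sketch under stated assumptions: a personalized PageRank seeded at one concept, a damping factor of 0.85, and exactly 5 iterations over a sparse adjacency structure that is built once and reused for every seed in a batch. The function names and the toy graph are hypothetical, and the Pareto normalization and scoring combiner stages are not shown.

```python
def personalized_pagerank(out_links, seed, iterations=5, damping=0.85):
    """Run a fixed number of power iterations, teleporting back to `seed`.

    `out_links` maps each node to a list of successor nodes; every successor
    is assumed to appear as a key as well.
    """
    rank = {node: (1.0 if node == seed else 0.0) for node in out_links}
    for _ in range(iterations):
        nxt = dict.fromkeys(out_links, 0.0)
        for src, targets in out_links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for dst in targets:
                    nxt[dst] += share
            else:
                # Dangling node: hand its mass back to the seed.
                nxt[seed] += damping * rank[src]
        nxt[seed] += 1.0 - damping  # teleport term; total rank mass stays 1
        rank = nxt
    return rank


def rank_batch(out_links, seeds, iterations=5):
    """Score a batch of seed concepts against the same, already loaded graph."""
    return {seed: personalized_pagerank(out_links, seed, iterations) for seed in seeds}


# Toy graph standing in for the N*N sparse concept matrix.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
scores = rank_batch(graph, ["a", "d"])
print(scores["a"])
```

Each seed in the batch is independent of the others, which is the property that lets a batch (64 on the slide) run in parallel on a GPU while the graph stays resident.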
21. Adverse Drug Reactions pose a serious challenge to the healthcare system
▪ Over 2 million serious adverse drug reactions (ADRs) yearly: 100,000 deaths
▪ $136 billion ADR associated cost yearly (> diabetic & cardiovascular care)
▪ Clinical Trials often do not reveal rare toxicity of some drugs, and they are not personalized
▪ 3–5% of in-hospital medication errors caused by unforeseen drug-drug interactions
22. Insight as a Service for Personalized and Detailed Adverse Drug Reactions Prediction
Leverage large amount of data for personalized prediction of nature, cause, and severity of adverse drug reactions
[Figure label: EMRs]
23. Personalized Medicine, Adverse Drug Reaction Prediction w/ ML
Ingest, Learn, Predict

Known Interactions of type 1 to … (two tables shown)
  Drug1       Drug2
  Aspirin     Probenecid
  Aspirin     Azilsartan

  Drug1       Drug2
  Aspirin     Gliclazide
  Aspirin     Dicoumarol

Features: Chemical Similarity 1 to … (two tables shown)
  Drug1       Drug2       Sim
  Salsalate   Aspirin     .7
  Dicoumarol  Warfarin    .6

  Drug1       Drug2       Sim
  Salsalate   Aspirin     .9
  Dicoumarol  Warfarin    .76

Candidate Interactions of type i
  Drug1       Drug2       Best Sim1*Sim1   Best SimN*SimN
  Salsalate   Gliclazide  .9*1             .7*1
  Salsalate   Warfarin    .9*.76           .7*.6

Machine Learning Model

Interactions of type 1 Prediction
  Drug1       Drug2       Prediction
  Salsalate   Gliclazide  0.85
  Salsalate   Warfarin    0.7
…
Interactions of type M Prediction
  Drug1       Drug2       Prediction
  Salsalate   Gliclazide  0.53
  Salsalate   Warfarin    0.32

30X improvement in Learning performance
100s of TBs of data: 50 million patients, 2000 drugs, 2000 features
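The "Best Sim*Sim" features in the candidate table can be reproduced with a small sketch. It rests on assumptions not stated on the slide: that a candidate pair is scored by the best product of similarities to any known interacting pair, that similarity is symmetric, and that a drug has similarity 1 to itself. The function names and dictionary layout are hypothetical; the numbers are the ones shown on the slide.

```python
def similarity(table, a, b):
    """Look up a symmetric similarity; identical drugs score 1, unknown pairs 0."""
    if a == b:
        return 1.0
    return table.get((a, b), table.get((b, a), 0.0))


def best_pair_feature(candidate, known_interactions, table):
    """Best product sim(c1, k1) * sim(c2, k2) over all known interacting pairs."""
    c1, c2 = candidate
    return max(
        (similarity(table, c1, k1) * similarity(table, c2, k2)
         for k1, k2 in known_interactions),
        default=0.0,
    )


known = [("Aspirin", "Gliclazide"), ("Aspirin", "Dicoumarol")]
sim_1 = {("Salsalate", "Aspirin"): 0.9, ("Dicoumarol", "Warfarin"): 0.76}
sim_n = {("Salsalate", "Aspirin"): 0.7, ("Dicoumarol", "Warfarin"): 0.6}

features = {
    cand: (best_pair_feature(cand, known, sim_1), best_pair_feature(cand, known, sim_n))
    for cand in [("Salsalate", "Gliclazide"), ("Salsalate", "Warfarin")]
}
print(features)
```

Under these assumptions the two candidates get the products .9*1 and .7*1, and .9*.76 and .7*.6, matching the candidate table; a learned model would then map such feature vectors to the prediction scores.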
24. Personalized Medicine – Adverse Drug Reaction Workload
Personalization will result in massive increase in computation complexity
Real time prediction requirements for operational needs (< 1 minute for emergency situations)
• Computational pattern:
- Sparse cube to dense cube with patient as additional dimension
• Training:
- Number of patients above 50 Million
- Number of features around 1800
- Additional samples for training O(#patients)
- Number of cross-validation stages and #models per stage increases dramatically
- 100X increase in training complexity with ~100 TBs of Data
• Prediction:
- Input Model (#features) and dataset (# patients in the hospital)
- 1800 features and 500,000 patients
- Real time
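To get a feel for the prediction-side data volume, the figures above can be turned into a back-of-envelope estimate. This sketch assumes a dense patients-by-features matrix stored as 4-byte values, which the slide does not specify; the helper name is hypothetical.

```python
def dense_matrix_bytes(rows, cols, bytes_per_value=4):
    """Size of a dense rows x cols matrix, assuming fixed-width values."""
    return rows * cols * bytes_per_value


# 500,000 patients and 1800 features, as quoted for the prediction step.
prediction_bytes = dense_matrix_bytes(500_000, 1800)
print(f"{prediction_bytes / 1e9:.1f} GB")
```

Under that assumption the prediction input is about 3.6 GB, which has to be scored within the real-time budget quoted above.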
25. Programming Heterogeneous Systems
OpenCL?
SystemC?
VHDL?
C++?
Java?
CUDA?
26. Portability and Optimization in Heterogeneous Systems
Software stack, top to bottom:
▪ Application, Application, Application
▪ Cognitive Middleware
▪ Library Layer
▪ CPU enablement, GPU enablement, FPGA interface & configuration, Accelerator X Enablement
▪ Accelerator X
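One way to read the library layer in this stack is as a dispatch point: applications call a single function, and the library routes the call to whichever enablement backend is present. The sketch below illustrates that idea only; the class, the backend names, and the preference order are assumptions made for the example, not an API from the talk.

```python
from typing import Callable, Dict, Sequence, Tuple

Kernel = Callable[[Sequence[float], Sequence[float]], float]


class LibraryLayer:
    """Hypothetical library layer that hides which device runs a kernel."""

    def __init__(self) -> None:
        self._backends: Dict[str, Kernel] = {}

    def register(self, name: str, kernel: Kernel) -> None:
        """Install an enablement backend (e.g. CPU, GPU, FPGA) under a name."""
        self._backends[name] = kernel

    def dot(self, x: Sequence[float], y: Sequence[float],
            prefer: Sequence[str] = ("accelerator_x", "fpga", "gpu", "cpu")) -> Tuple[str, float]:
        """Run the dot product on the first preferred backend that is installed."""
        for name in prefer:
            if name in self._backends:
                return name, self._backends[name](x, y)
        raise RuntimeError("no backend registered")


def cpu_dot(x: Sequence[float], y: Sequence[float]) -> float:
    # Plain CPU fallback; a GPU or FPGA backend would expose the same signature.
    return float(sum(a * b for a, b in zip(x, y)))


lib = LibraryLayer()
lib.register("cpu", cpu_dot)
backend, value = lib.dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(backend, value)
```

The application code stays unchanged when a faster backend is registered later, which is the portability argument the slide makes for placing accelerator enablement below a common library layer.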
27. Accelerate Processing in a Connected World
▪ Enable Compute-Intensive Cognitive Workloads
▪ Exploit Best-of-Breed Accelerators
▪ Provide Abstraction of Hardware Function