The document provides an overview of the HSA System Architecture requirements as defined by the HSA Foundation's System Architecture workgroup. The workgroup aims to define a common set of platform properties that provide a dependable hardware and software foundation for heterogeneous parallel programming. This includes requirements for shared virtual memory, a consistent memory model, low-latency workload queuing, and other areas to simplify programming across different HSA-compliant processor architectures and components.
HC-4015, An Overview of the HSA System Architecture Requirements, by Paul Blinzer
1. THE HSA SYSTEM ARCHITECTURE REQUIREMENTS – AN OVERVIEW
PAUL BLINZER, FELLOW, HSA SYSTEM SOFTWARE, AMD
SYSTEM ARCHITECTURE WORKGROUP CHAIR, HSA FOUNDATION
THE HSA PLATFORM SYSTEM ARCHITECTURE SPECIFICATION – AN OVERVIEW | NOVEMBER 12, 2013 | APU13
2. AGENDA
• What is the HSA Foundation?
• The System Architecture Workgroup and its goals
• What defines HSA platforms and components?
• The Shared Virtual Memory requirements
• The HSA Memory Model Requirements
• The HSA Queuing Architecture
• Some other requirements set by the System Architecture specification
• Where to find further information
• Q & A
3. WHAT IS THE HSA FOUNDATION?
This is the short version…
• The HSA Foundation is a not-for-profit consortium of SOC and SOC IP vendors, OEMs, academia, OSVs and ISVs defining a consistent heterogeneous platform architecture to make it dramatically easier to program heterogeneous parallel devices
  ‒ It spans multiple host platform architectures and programmable data parallel components (e.g. CPU: x86, ARM, MIPS, …; device types: GPUs, DSPs, …) to work collaboratively within the same HSA system architecture
  ‒ It defines a set of specifications that define HW & SW platform requirements to enable applications to target the feature set from high level languages and APIs
  ‒ It's not a replacement for e.g. OpenCL but complementary to it, defining the system level properties "below the API", leveraged by application and system software
• The System Architecture specification defines the required component and platform features for HSA compliant components
• This presentation is an overview of the current System Architecture definitions and does not represent a complete or "final" state
  ‒ That one is the specification itself when available ☺
[Figure: the HSA specification set: System Runtime Specification; Programmer's Reference Manual; Platform (Software) System Architecture Specification; Conformance Tools]
4. THE SYSTEM ARCHITECTURE WORKGROUP OF THE HSA FOUNDATION
Who participates and what are the goals?
• The workgroup membership spans a wide variety of IP and platform architecture owners
  ‒ Several host platform architectures are targeted
• The specifications define a common set of platform properties that provide a dependable hardware and system foundation for application software, libraries and runtimes
• The goal is to eliminate "weak points" in the system software and hardware architecture of traditional platforms that lead to unnecessary overhead in the operations of data parallel workloads
• The main deliverables are:
  ‒ Well-defined, consistent and dependable memory model all HSA agents operate in
  ‒ Shared access to process virtual memory between HSA agents ("ptr-is-ptr")
  ‒ Low-latency workload dispatch contained in user-mode queues
  ‒ Scalability across a wide range of platforms
  ‒ These properties are leveraged in the "HSA Programmer's Reference", HSAIL and HSA Runtime specifications
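The "low-latency workload dispatch contained in user-mode queues" deliverable can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Packet` fields and the `UserModeQueue` class are hypothetical names invented here, not the packet format or queue layout defined by the HSA specifications. It shows the general idea that a producer can publish work by writing to ordinary process memory and advancing an index, with no system call on the dispatch path.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical packet layout for illustration; NOT the HSA-defined format.
struct Packet {
    std::uint32_t kernel_id;  // made-up identifier of the work to run
    std::uint32_t grid_size;  // made-up number of work items
};

// Single-producer/single-consumer ring buffer in ordinary process memory.
template <std::size_t N>
class UserModeQueue {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");

public:
    // Producer side (application thread): only memory writes, no system call.
    bool enqueue(const Packet& p) {
        const std::uint64_t w = write_index_.load(std::memory_order_relaxed);
        const std::uint64_t r = read_index_.load(std::memory_order_acquire);
        if (w - r == N) return false;  // queue is full
        slots_[w & (N - 1)] = p;
        // Release: the packet contents become visible before the new index.
        write_index_.store(w + 1, std::memory_order_release);
        return true;
    }

    // Consumer side: stands in for the agent's packet processor.
    std::optional<Packet> dequeue() {
        const std::uint64_t r = read_index_.load(std::memory_order_relaxed);
        const std::uint64_t w = write_index_.load(std::memory_order_acquire);
        if (r == w) return std::nullopt;  // queue is empty
        const Packet p = slots_[r & (N - 1)];
        read_index_.store(r + 1, std::memory_order_release);
        return p;
    }

private:
    std::array<Packet, N> slots_{};
    std::atomic<std::uint64_t> write_index_{0};
    std::atomic<std::uint64_t> read_index_{0};
};
```

The sketch needs C++17 for `std::optional`. The acquire/release pairing on the indices is one common way to get the data visibility such a queue depends on; the actual HSA memory model requirements are defined by the specification, not by this example.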
5. WHAT DEFINES HSA PLATFORMS AND COMPONENTS?
• In short, an HSA compatible platform consists of "HSA agents" (hardware components that participate in the HSA memory model) adhering to the various system architecture requirements
• Each HSA agent adheres to the same queuing & dispatch mechanics, low-latency synchronization primitives, memory coherence and data visibility (memory model) requirements
  ‒ Defined mainly in the "(Software) System Architecture" specification
  ‒ The HSAIL and "Programmer's Reference Manual" specifications define the software execution model
  ‒ Architected mechanisms to enqueue and dispatch workloads from one HSA agent queue to another eliminate the need to use the host CPU for these purposes in many scenarios
  ‒ Architected infrastructure allows exchanging data with non-HSA compliant components in a platform
  ‒ Fundamental data types are naturally aligned
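"Naturally aligned" is commonly taken to mean that an object of size n bytes sits at an address that is a multiple of n. A minimal sketch of that check follows; the helper names `is_naturally_aligned` and `abi_aligns_naturally` are made up for this illustration and are not part of any HSA API.

```cpp
#include <cstddef>
#include <cstdint>

// True when address p would be a naturally aligned location for a T,
// i.e. the address is a multiple of sizeof(T).
template <typename T>
bool is_naturally_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % sizeof(T) == 0;
}

// Compile-time view: does the host ABI already place T on natural boundaries?
// This depends on the compiler and target, so treat it as a query, not a fact.
template <typename T>
constexpr bool abi_aligns_naturally = (alignof(T) == sizeof(T));
```

For example, a 4-byte integer at an address ending in 0x4 passes the check, while the same type at an address ending in 0x2 does not. The variable template needs C++14 or later.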
6. WHAT DEFINES HSA PLATFORMS AND COMPONENTS?
‒ There are two different machine models ("small" and "large") that target different functionality levels
  ‒ It takes into account different feature requirements for different platform environments
‒ In all cases, the same HSA application programming model is used to target HSA agents and provides the same power-efficient and low-latency dispatch mechanisms, synchronization primitives and SW programming model
‒ Applications written to target HSA small model machines will generally work on large model machines, too
  ‒ If the large model platform and host Operating System provide a 32bit process environment
Properties | Small Machine Model | Large Machine Model
Platform targets | Embedded or personal device space (controllers, smartphones, etc.) | PC, workstation, cloud server, etc. running more demanding workloads
Native pointer size | 32bit | 64bit (+ 32bit ptr if 32bit processes are supported)
Floating point size | Half (FP16*), Single (FP32) precision | Half (FP16*), Single (FP32), Double (FP64) precision
Atomic ops size | 32bit | 32bit, 64bit
* min. load and store on memory
7. THE SHARED PROCESS VIRTUAL ADDRESS SPACE REQUIREMENTS(1)
The basis of “ptr-is-ptr”
‒ Each HSA agent adheres to the same user process address space view as the host CPU
‒ The process address view is established by the hardware’s page table mappings
‒ HSA operates in a “flat” virtual address space, using 64bit & 32bit ptrs depending on application/machine model
‒ A pointer value references the same memory for every HSA agent
‒ An HSA agent can “walk” or update linked data structures directly without any assistance from a host CPU
‒ HSA agent virtual address range matches the host platform (e.g. 48bit, 32bit, …)
‒ HSA agents always operate at “user privilege” of the host CPU, policy enforced by system
‒ HSA agents observe the same memory page table attributes (cache, read, write, …) and page sizes of the host CPU, policy enforced by system
‒ HSA agents support page faults, allowing them to operate directly on pageable memory as provided by the Operating System environment
‒ For allocated pageable memory, System Software takes page faults, commits memory, loads contents from backup store and restarts execution like it does for any access from host CPU threads
‒ No tedious device buffer copy, explicit page lock or similar is needed for an HSA agent to access data in allocated memory directly!
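The “ptr-is-ptr” idea can be imitated inside one ordinary process. The following is a minimal sketch using Python's standard `ctypes` module, not HSA code: `agent_sum` is a hypothetical stand-in for an HSA agent, and it receives nothing but a raw address. Because it shares the address space of the code that built the list, it can follow the `next` pointers without any copy or translation step.

```python
import ctypes


class Node(ctypes.Structure):
    """A singly linked list node laid out like a C struct."""


# The struct refers to itself, so the fields are attached after the class exists.
Node._fields_ = [("value", ctypes.c_int32), ("next", ctypes.POINTER(Node))]


def build_list(values):
    """Host side: allocate nodes and link them with raw pointers."""
    nodes = [Node(value=v) for v in values]
    for current, following in zip(nodes, nodes[1:]):
        current.next = ctypes.pointer(following)
    return nodes  # the caller must keep this list alive while pointers are in use


def agent_sum(head_address):
    """Hypothetical agent: gets only an address and walks the list through it."""
    total = 0
    node_ptr = ctypes.cast(head_address, ctypes.POINTER(Node))
    while node_ptr:  # a NULL pointer evaluates as false
        total += node_ptr.contents.value
        node_ptr = node_ptr.contents.next
    return total


nodes = build_list([1, 2, 3])
head_address = ctypes.addressof(nodes[0])
host_total = sum(node.value for node in nodes)
agent_total = agent_sum(head_address)
```

On a platform without a shared virtual address space the list would first have to be flattened into a device buffer and its pointers rewritten, which is the step the slide describes as unnecessary.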
8. THE SHARED PROCESS VIRTUAL ADDRESS SPACE REQUIREMENTS(2)
The basis of “ptr-is-ptr”
‒ On AMD processor-based platforms, the IOMMUv2 device provides the HSAMMU translation services via standard PCI Express™ ATS/PRI protocols to HSA compliant hardware when accessing memory from the HSA agent
‒ IOMMUv2 integration into the OS memory manager provides the low-level infrastructure (e.g. in the Linux® kernel)
‒ Different host platform architectures may use different detail mechanisms here
‒ The HSAMMU functionality is provided in addition to IOMMU functionality used in device virtualization
  ‒ Separate translation levels are used (see block diagram)
‒ Implementation of shared virtual address space by other vendors on other host platforms may be different
  ‒ As long as it follows the HSA Sysarch requirements, it is ok
‒ The implementation detail is not relevant to the application and is dealt with within the system software (e.g. OS)
[Figure: HSA MMU data structures. Labels: HSA MMU (IOMMUv2 device) with Device Table, Command Buffer, Page Req Log and Event Log base registers and Event Counter registers; in system memory: Device Table, Command Buffer, Event Log, Page Service Request Log, Interrupt Remapping Table, I/O page tables (host translation), HSA MMU Translation Tables (per Process, PASID; guest & host translation); Perf Counters & RAS Info (opt.); Peripheral Page Requests (PPR) Service]
9. THE HSA MEMORY MODEL REQUIREMENTS
What are its key properties?
‒ A memory model defines how writes by one work item or agent become visible to other work items and agents, rules that need to be adhered to by compilers and application threads
  ‒ It defines visibility and ordering rules of write and read events across work items, HSA agents and interactions with non-HSA components in the system
  ‒ Important to define scope for performance optimizations in the compiler, to allow reordering of code in the Finalizer
‒ At its base, the HSA memory model is based on a “relaxed” load acquire/store release model
  ‒ Inherently maps to many CPU and device architectures very easily
  ‒ Efficient sequential consistency mechanisms supported to fit high-level language programming models
‒ Cache Coherency between HSA agents (& host CPU) is maintained by default
  ‒ Key feature of the HSA system & platform environment
‒ A consistent, full set of atomic operations is available
  ‒ Naturally aligned on size, small machine model supports 32bit, large machine model supports 32bit and 64bit
10. THE HSA QUEUEING ARCHITECTURE REQUIREMENTS(1)
The basis of the workload dispatch on HSA
‒ The queue dispatch occurs through architected queue packets (“Architected Queuing Language”, AQL) that reference the work items & parameters
‒ Dispatch to HW occurs directly in user mode, eliminating a notable source of latency overhead in traditional architectures!
‒ Two architected packet types exist at the moment, dispatch and barrier packets
‒ Each queue is defined by several architected parameters (type, base address, size, read index, write index, …) that allow targeting the queue from other HSA agents and the host CPU
‒ The design allows an HSA agent on the platform to build & dispatch jobs to a queue using HSA architected interfaces
‒ Applications and runtime can build different queuing models on top of the infrastructure
  ‒ Single-producer, Multi-producer queuing models, lock-free dispatch, … are all options SW can implement on top of the system architecture’s queue definition to fit the use model
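The queue parameters listed above (base address, size, read index, write index) describe a ring buffer. The sketch below is a single-threaded model of such a queue, written only to show how the indices interact. `Packet` and `UserModeQueue` are hypothetical names; the power-of-two size and the monotonically increasing indices are assumptions of this sketch, and the real AQL packet layout is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str               # "dispatch" or "barrier", the two packet types named above
    payload: object = None  # hypothetical stand-in for the work items and parameters


class UserModeQueue:
    """Ring buffer described by base, size, read index and write index."""

    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two (assumption of this sketch)")
        self.size = size
        self.base = [None] * size  # stands in for the memory at the base address
        self.read_index = 0        # next packet the consumer will process
        self.write_index = 0       # next free slot for a producer

    def enqueue(self, packet):
        """Producer side: returns False when the ring is full."""
        if self.write_index - self.read_index == self.size:
            return False
        self.base[self.write_index % self.size] = packet
        self.write_index += 1      # advancing the index publishes the packet
        return True

    def dequeue(self):
        """Consumer side: returns None when the ring is empty."""
        if self.read_index == self.write_index:
            return None
        packet = self.base[self.read_index % self.size]
        self.read_index += 1
        return packet


queue = UserModeQueue(2)
accepted = [queue.enqueue(Packet("dispatch", name)) for name in ("a", "b", "c")]
first = queue.dequeue()
accepted_after_drain = queue.enqueue(Packet("barrier"))
```

A multi-producer or lock-free variant would replace the plain index updates with atomic operations; that part is left out because it depends on the memory model rather than on the queue layout.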
11. THE HSA QUEUEING ARCHITECTURE REQUIREMENTS(2)
The basis of the workload dispatch on HSA
‒ The HSA System Architecture defines a user mode queue based dispatch mechanism
‒ Each queue is only valid within that process context and represents a virtual entity that is scheduled to hardware
‒ The job execution occurs at “user privilege” like the rest of the application code, enforced by system architecture
‒ Each HSA agent allows for multiple queues per application process
‒ HSA defines in-order dispatch semantics of work items within queues for efficient HW implementation
‒ HW may execute dispatch packets “out-of-order”, if no dependencies exist and in-order semantics are followed externally
‒ “Out of order” execution applies between queues, with explicit, memory based synchronization mechanisms between them as needed
‒ It is “cheap” to create queues in HSA, so applications can have one queue per HSA agent for each application thread, or leverage multiple HSA user queues per thread if needed
  ‒ This gives applications a lot of flexibility to structure the queue layout to match the problem instead of trying to fit the problem to work with one or a few queues only
12. OTHER REQUIREMENTS SET BY THE HSA SYSTEM ARCHITECTURE
Miscellaneous mention, but nevertheless important to make it work well…
‒ HSA Memory based signaling and synchronization primitives
  ‒ Defines memory based semantics to synchronize with work items processed by HSA agents
  ‒ e.g. 32bit or 64bit value, content update, wait on value by HSA agents and AQL packets
  ‒ Allows one-to-one and one-to-many signaling
  ‒ The signaling semantics follow atomicity requirements defined in the memory model
  ‒ Hardware-assisted, power-efficient & low-latency way to synchronize execution of work items between threads
  ‒ Runtime & application SW can use the infrastructure to build mutexes, semaphores and other synchronization primitives
‒ HSA Cache Coherency Domains
  ‒ Defines the scope of HSA cache coherency and how it relates to other non-HSA system resource operations
  ‒ Associated with the memory model requirements
  ‒ Architected way to interact with non-HSA platform infrastructure (e.g. graphics)
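The signaling bullets above describe a value in memory that can be updated and waited on. The sketch below imitates that behaviour with Python's `threading.Condition`. It is only an analogy: the `Signal` class and its method names are hypothetical, and an OS-level lock is used where HSA specifies a hardware-assisted mechanism.

```python
import threading


class Signal:
    """A shared value that waiters can block on until it satisfies a condition."""

    def __init__(self, initial=0):
        self._value = initial
        self._cond = threading.Condition()

    def load(self):
        with self._cond:
            return self._value

    def store(self, value):
        with self._cond:
            self._value = value
            self._cond.notify_all()  # one-to-many: every waiter re-checks the value

    def add(self, delta):
        with self._cond:
            self._value += delta
            self._cond.notify_all()
            return self._value

    def wait_until(self, predicate, timeout=None):
        """Block until predicate(value) holds; return that value, or None on timeout."""
        with self._cond:
            if self._cond.wait_for(lambda: predicate(self._value), timeout):
                return self._value
            return None


# Usage: a completion signal starts at 1 and the worker decrements it when done.
done = Signal(1)
results = []


def worker():
    results.append(sum(range(10)))  # stands in for the dispatched work
    done.add(-1)


thread = threading.Thread(target=worker)
thread.start()
observed = done.wait_until(lambda value: value == 0, timeout=5.0)
thread.join()
```

A counting semaphore or a mutex can be layered on the same three operations (update, read, wait on value), which is the kind of construction the last signaling bullet refers to.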
13. OTHER REQUIREMENTS SET BY THE HSA SYSTEM ARCHITECTURE
Miscellaneous mention, but nevertheless important
‒ HSA system timestamp requirements
  ‒ Defines a low-overhead mechanism to “determine the passing of time” on an HSA platform
  ‒ The value can be queried by HSAIL or HSA runtime
  ‒ Represented by a 64bit timestamp value that does not roll over and is incremented at a constant rate in HW
  ‒ Applications and tools are able to build a consistent timeline across all HSA agents
‒ HSA Topology requirements
  ‒ Defines system topology and properties of HSA agents discoverable on an HSA platform by an application to take advantage of platform properties
  ‒ Examples are number of compute units, max. work item dimensions, work group size, work item size, queue properties, …
  ‒ APIs like OpenCL™ and others can leverage HSA system topology data to discover memory layout, compute unit properties and other properties and consistently report the system topology for applications to leverage
[Figure: “HSA Platform - Simple”. Labels: HSA APU, CPU (cores), GPU (H-CUs), HSA MMU, Mem, System Memory]
[Figure: “HSA Platform”. Labels: HSA APU, CPU (cores), GPU (H-CUs), HSA MMU, System Memory, System Firmware, IOBUS, Add-In GPU (optional), HSA GPU (H-CUs), Mem, Firmware, Device Local Memory]
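The timestamp requirement (a 64bit counter incremented at a constant rate) lends itself to a short worked calculation. The helper names and the 1 GHz and 1 MHz frequencies below are assumptions chosen for illustration; the slides do not state an actual frequency.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def ticks_to_seconds(start_ticks, end_ticks, frequency_hz):
    """Elapsed time between two timestamp reads of a constant-rate counter."""
    return (end_ticks - start_ticks) / frequency_hz


def years_until_rollover(frequency_hz, bits=64):
    """How long a counter of the given width runs before it would wrap around."""
    return (1 << bits) / frequency_hz / SECONDS_PER_YEAR


# Two reads 2000 ticks apart on a hypothetical 1 MHz counter.
elapsed = ticks_to_seconds(1_000, 3_000, 1_000_000)

# A hypothetical 1 GHz counter needs several centuries to exhaust 64 bits.
rollover_years = years_until_rollover(1_000_000_000)
```

With these assumed numbers the wrap-around point lies centuries away, which is one way to read the “does not roll over” wording for practical purposes.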
14. WHERE TO FIND FURTHER INFORMATION ON SYSTEM ARCHITECTURE?
‒ HSA Foundation Website: http://www.hsafoundation.com
  ‒ The main location for specs, developer info, tools, publications and many things more
‒ HSA Programmer’s Reference Manual v 0.95 has been published
‒ HSA Platform Software Systems Architecture Specification is quickly nearing the 0.95 state
  ‒ Will be published after ratification by the HSA Foundation Board of Directors
‒ Stay Tuned
15. ANY QUESTIONS?
Of course there are, so go ahead ☺