A look at how new Direct3D advancements enhance efficiency and enable fully-threaded building of command buffers, in this presentation from the 2014 Game Developers Conference in San Francisco, March 17-21. Also view this and other presentations on our developer website at http://developer.amd.com/resources/documentation-articles/conference-presentations/
Direct3D and the Future of Graphics APIs - AMD at GDC14
1. DIRECT3D AND THE FUTURE OF GRAPHICS APIS
Dave Oldcorn, AMD
Dan Baker, Oxide Games
Johan Andersson, EA / DICE
2. 2 | AMD Direct3D Futures | March 20th, 2014
NITROUS AND DX12
Dan Baker
Partner, Oxide Games
3. HAVEN’T WE BEEN HERE BEFORE?
Goal of DX9
–Remember State blocks?
Goal of DX10
–Large state groups
Goal of DX11
–Deferred contexts
Are we actually getting faster, or are CPUs just faster?
–Quite possibly no perf improvements due to API features in 10 years
Maybe adding features isn’t the answer…
4. DEEPLY ROOTED PROBLEM
Coding design philosophies clash with real world
OOP, data hiding, polymorphic design clashes with task-driven, data parallel
Evident in language trends, striking disconnect between what is considered good code, and what is fast
Gap has always been there, but has grown in recent years
– 15 years ago, processors often bound by computation
– Now, usually bound by cache misses, serialization, pipeline stalls, etc.
– Multi-Core CPUs are ineffectively utilized
“Heavy Iron”, e.g. Big Object, Opaque memory is a dead end for performance
The revolt is beginning in high performance graphics APIs, but will spread
5. BUT… HOW MUCH FASTER?
Biggest problem with industry today: Acceptance
Only 1 secret in API design: That it can be done.
–And isn’t that hard
–And our code isn’t that ugly
Star Swarm already demonstrating what is possible on a PC
6. D3D12 FEATURES THAT NITROUS USES
True de-coupled multi-core rendering
– Expecting near linear thread scheduling
Manual Hazard tracking
– Hazards have been resolved already
Memory Heaps
– Bigger chunks of memory pool grouping make management simpler
Descriptor Tables
– Table exposure allows a cheaper way of binding textures
– Allows texture bindings to be shared between non-adjacent batches
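A minimal C++ sketch of the de-coupled multi-core recording idea listed above. This is not D3D12 code: each worker thread records into its own command list with no shared state, and the lists are then submitted in a fixed order. Command, CommandList, RecordRange and BuildFrame are hypothetical names invented for illustration, and workerCount is assumed to be at least 1.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for one recorded GPU command (not a D3D12 type).
struct Command {
    std::string op;      // e.g. "draw"
    int         object;  // index of the object being drawn
};

// Hypothetical command list: owned by exactly one thread while recording.
using CommandList = std::vector<Command>;

// Record draws for objects [first, last) into a private list.
// No locks are needed because no other thread touches `out`.
void RecordRange(CommandList& out, int first, int last) {
    for (int i = first; i < last; ++i) {
        out.push_back({"draw", i});
    }
}

// Split objectCount draws across workerCount threads, then "submit"
// the per-thread lists in list-index order.
std::vector<Command> BuildFrame(int objectCount, int workerCount) {
    std::vector<CommandList> lists(workerCount);  // sized up front, never resized
    std::vector<std::thread> workers;
    const int perWorker = (objectCount + workerCount - 1) / workerCount;

    for (int w = 0; w < workerCount; ++w) {
        const int first = std::min(w * perWorker, objectCount);
        const int last  = std::min(first + perWorker, objectCount);
        workers.emplace_back(RecordRange, std::ref(lists[w]), first, last);
    }
    for (auto& t : workers) {
        t.join();
    }

    // Submission order is decided here, not by which thread finished first.
    std::vector<Command> submitted;
    for (const auto& list : lists) {
        submitted.insert(submitted.end(), list.begin(), list.end());
    }
    return submitted;
}
```

For example, BuildFrame(10, 4) records ten draws on four threads and returns them in object order 0 through 9, because the final order depends only on list index.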
7. WHAT’S DIFFERENT NOW?
Then: Spec written → Spec reviewed → API implemented → Released to public → First engine use → Analysis done
8. WHAT’S DIFFERENT NOW?
Now (iterative cycle): Create spec → Implement spec → Prototype on actual engines → Analyze → Discuss with IHVs, ISVs
Diagram labels: “Start here” and “If ready, exit here to prep for release”
9. 9 | AMD Direct3D Futures | March 20th, 2014
IN THE SPIRIT OF CONTRIBUTING
Oxide proud to announce that we have a prototype of Nitrous running on D3D12
*PR DISCLAIMER* This is not an official announcement regarding D3D12 support
Porting from other modern APIs is much simpler than porting from D3D11 to D3D12
10. 10 | AMD Direct3D Futures | March 20th, 2014
EXPECTED RESULTS
CPU Driver overhead largely put to rest
Huge increases in driver reliability
Huge decreases in frame latency, expecting median frame latency to be 1.5 frames
– Increased perceptual responsiveness
Never a dropped frame or stall due to driver API issues
– *Other OS events could cause stalls
Driver should be far smaller and simpler to implement; IHVs can spend more time on optimizations
11. DIRECT3D12 AND THE FUTURE OF GRAPHICS APIS
Dave Oldcorn, Direct3D12 Driver Architect, AMD
12. 12 | AMD Direct3D Futures | March 20th, 2014
THE PROBLEM
13. 13 | AMD Direct3D Futures | March 20th, 2014
THE PROBLEM
Mismatch between existing Direct3D and hardware capabilities
– Lots of CPU cores, but only one stream of data
– State communication in small chunks
– “Hidden” work
Hard to predict from any one given call what the overhead might be
Implicit memory management
– Hardware evolving away from classical register programming
14. 14 | AMD Direct3D Futures | March 20th, 2014
API LANDSCAPE
Gap between PC "raw" 3D APIs and the hardware has opened up
Very high level APIs now ubiquitous; easy to access even for casual developers, plenty of choice
Where the PC APIs are is a middle ground
[Figure: API landscape. Axis: capability, ease of use, distance from 3D engine. Labels: Application; Game Engines (Frostbite, Unity, Unreal, CryEngine, BlitzTech); Flash / Silverlight; D3D7/8; D3D9; OpenGL; D3D11; Opportunity; Console APIs; Metal (register level access)]
15. 15 | AMD Direct3D Futures | March 20th, 2014
WHAT ARE THE CONSEQUENCES?
WHAT ARE THE SOLUTIONS?
16. 16 | AMD Direct3D Futures | March 20th, 2014
SEQUENTIAL API
Sequential API: state for a given draw comes from an arbitrary previous time
Some states must be reconciled on the CPU ("delayed validation")
– All contributing state needs to be visible
GPU isn't like this, uses command buffers
– Must save and restore state at start and end
[Figure: API input stream (..., Draw, Set PS CB, Draw x 5, Set VS CB, Draw x 3, Set Blend, Set PS, Set RT state, Draw, Set VS VB, Draw, ...) beside the state contributing to one draw: PS CB, VS CB, blend state, PS, RT state, plus more from earlier]
17. 17 | AMD Direct3D Futures | March 20th, 2014
THREADING A SEQUENTIAL API
Sequential API threading
– Simple producer / consumer model
Extra latency
Buffering has a cost
More threading would mean dividing tasks on finer grain
– Bottlenecked on application or driver thread
Difficult to extract parallelism (Amdahl's Law)
[Figure: sequential API threading. Labels: Application simulation; Prebuild Thread 0; Prebuild Thread 1; Application Render Thread; Application; Runtime / Driver; Driver Thread; GPU Execution Queue with Queued Buffer 0, Queued Buffer 1, Queued Buffer 2, ...]
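The Amdahl's Law point can be made concrete with a few lines of arithmetic. This is a minimal sketch, not something from the talk: the function name and the example fractions are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>

// Amdahl's Law: upper bound on speedup when only part of the work can be
// spread over several threads.
//   parallel_fraction: share of single-threaded time that parallelises (0..1)
//   threads:           number of threads available (>= 1)
double amdahl_speedup(double parallel_fraction, int threads) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(threads >= 1);
    const double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / threads);
}
```

If, say, half of the CPU frame time sits on a single render or driver thread, `amdahl_speedup(0.5, n)` stays below 2 however many prebuild threads are added, which is why finer-grained task splitting alone does not help a sequential API.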
18. 18 | AMD Direct3D Futures | March 20th, 2014
COMMAND BUFFER API
GPUs only listen to command buffers
Let the app build them
– Command Lists, at the API level
Solves sequential API CPU issues
[Figure: command buffer API threading. Labels: Application simulation; Thread 0 and Thread 1, each with Build Cmd Buffer; Application; Runtime / Driver; GPU Execution Queue with Queued Buffer 0, Queued Buffer 1, ...]
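The "let the app build them" model can be sketched without any graphics API at all. Everything below is hypothetical: `CommandList`, `ExecutionQueue` and `build_and_submit` are stand-in names for illustration, not Direct3D 12 types or calls.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <system_error>
#include <thread>
#include <vector>

// Hypothetical stand-in for an API-level command list: just recorded commands.
struct CommandList {
    std::vector<std::string> commands;
};

// Hypothetical stand-in for the GPU execution queue.
struct ExecutionQueue {
    std::vector<std::string> executed;
    void submit(const CommandList& list) {
        executed.insert(executed.end(), list.commands.begin(), list.commands.end());
    }
};

// Record `draws_per_list` draws into each of `list_count` command lists,
// one builder thread per list, then submit the lists in index order.
ExecutionQueue build_and_submit(std::size_t list_count, std::size_t draws_per_list) {
    std::vector<CommandList> lists(list_count);
    std::vector<std::thread> builders;
    builders.reserve(list_count);
    for (std::size_t i = 0; i < list_count; ++i) {
        // Each builder writes only to its own list, so recording needs no locks.
        auto record = [&lists, i, draws_per_list] {
            for (std::size_t d = 0; d < draws_per_list; ++d) {
                lists[i].commands.push_back(
                    "list" + std::to_string(i) + ":draw" + std::to_string(d));
            }
        };
        try {
            builders.emplace_back(record);
        } catch (const std::system_error&) {
            record();  // No thread available: record inline instead.
        }
    }
    for (std::thread& t : builders) t.join();

    // Submission order is chosen by the application, not by thread timing.
    ExecutionQueue queue;
    for (const CommandList& list : lists) queue.submit(list);
    return queue;
}
```

The design point is that recording touches only thread-local data, and the single ordered step is the cheap submission at the end.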
19. 19 | AMD Direct3D Futures | March 20th, 2014
BETTER SCHEDULING
App has much more control over scheduling work
– Both CPU side and GPU
Threads don't really share many resources
Many more options for streaming assets
[Figure: thread timelines showing render work and create work. D3D11: CB building threads tend to interfere (driver thread, create thread). D3D12: CB building threads more independent (create thread, build threads); GPU load still added but only after queuing. Bottom row: GPU executes]
20. 20 | AMD Direct3D Futures | March 20th, 2014
PIPELINE OBJECTS
Pipeline objects get rid of JIT (just-in-time) shader compiles and enable LTCG (link-time code generation) for GPUs
Decouple interface and implementation
We're aware that this is a hairpin bend for many graphics engines to negotiate.
– Many engines don't think in terms of predicting state up front
– The benefits are worth it
[Figure: simplified dataflow through pipeline. Stage labels: VS; PS; Index Process; Primitive Generation; Rasteriser; Rendertarget Output; three links between stages are marked "?"]
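One way an engine might adapt to "predicting state up front" is to create every pipeline object at load time and do nothing but lookups while drawing. The sketch below assumes a hypothetical `PipelineDesc` / `PipelineCache` pair; the field names and the integer `id` standing in for a driver handle are illustrative only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical description of everything a pipeline object bakes in.
struct PipelineDesc {
    std::string vertex_shader;
    std::string pixel_shader;
    std::uint32_t blend_mode;
    bool operator<(const PipelineDesc& o) const {
        return std::tie(vertex_shader, pixel_shader, blend_mode) <
               std::tie(o.vertex_shader, o.pixel_shader, o.blend_mode);
    }
};

// Hypothetical compiled pipeline; `id` stands in for a driver handle.
struct PipelineObject {
    int id;
};

class PipelineCache {
public:
    // Load time: the expensive compile/link step happens here, once per desc.
    const PipelineObject& create(const PipelineDesc& desc) {
        auto it = objects_.find(desc);
        if (it == objects_.end()) {
            ++compiles_;
            it = objects_.emplace(desc, PipelineObject{compiles_}).first;
        }
        return it->second;
    }

    // Draw time: lookup only, never compiles. Returns nullptr if the engine
    // failed to predict this state combination up front.
    const PipelineObject* find(const PipelineDesc& desc) const {
        auto it = objects_.find(desc);
        return it == objects_.end() ? nullptr : &it->second;
    }

    int compiles() const { return compiles_; }

private:
    std::map<PipelineDesc, PipelineObject> objects_;
    int compiles_ = 0;
};
```

A `nullptr` from `find` during rendering is the signal that a state combination escaped prediction, which is exactly the case that forces a JIT-style hitch in the older model.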
21. 21 | AMD Direct3D Futures | March 20th, 2014
RENDER OBJECT BINDING MISMATCH
Hardware uses tables in video memory
BUT still programmed like a register solution
– So one bind becomes:
Allocate a new chunk of video memory
Create a new copy of the entire table
Update the one entry
Write the register with the new table base address
[Figure: on-chip root table (1 per stage) with SR and CB entries, each a pointer to a table (here, textures; constant buffers). Each SRD table in GPU memory holds pointers to (+ params of) resources in GPU memory]
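The four steps that one bind turns into can be modelled in a few lines to show where the cost comes from. This is a toy model under stated assumptions: `StageBindings`, `Descriptor` and the counters are hypothetical, and old tables are simply kept around on the assumption that earlier draws may still reference them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Descriptor = std::uint64_t;  // hypothetical stand-in for one SRD

// Hypothetical model of one shader stage's binding state.
struct StageBindings {
    std::vector<std::vector<Descriptor>> video_memory;  // every table allocated so far
    std::size_t table_base = 0;          // "register": index of the live table
    std::size_t descriptors_copied = 0;  // running cost counter

    explicit StageBindings(std::size_t slots) {
        video_memory.push_back(std::vector<Descriptor>(slots));
    }

    // One register-style bind, following the four steps listed above.
    void bind(std::size_t slot, Descriptor d) {
        // Steps 1 and 2: allocate a new chunk and copy the entire table into it.
        std::vector<Descriptor> copy = video_memory[table_base];
        descriptors_copied += copy.size();
        // Step 3: update the one entry that actually changed.
        copy.at(slot) = d;
        // Step 4: write the register with the new table base.
        video_memory.push_back(std::move(copy));
        table_base = video_memory.size() - 1;
    }

    Descriptor live(std::size_t slot) const {
        return video_memory[table_base].at(slot);
    }
};
```

With a 16-entry table, two single-slot binds copy 32 descriptors and leave three tables allocated, even though only two entries changed.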
22. 22 | AMD Direct3D Futures | March 20th, 2014
DESCRIPTOR TABLES
Several tables of each type of resource
– Easy to divide up by frequency
Tables can be of arbitrary size; dynamically indexed to provide bindless textures
Changing a table pointer is cheap
Updating a descriptor in a table is not
[Figure: on-chip table with entries SR.T[0] to SR.T[3], UAV, CB.T[0], CB.T[1] and Samp. Each entry is a pointer to an SRD table in GPU memory, e.g. textures table 0 holding SR.T[0][0] to SR.T[0][2] and constbuf table 1 holding CB.T[1][0] and CB.T[1][1]]
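The "changing a table pointer is cheap" idea can be sketched as follows. `RootTable` and its members are hypothetical names for illustration, not the Direct3D 12 interface; the tables are built once and only the root pointers change between batches.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Descriptor = std::uint64_t;  // hypothetical stand-in for one SRD
using DescriptorTable = std::vector<Descriptor>;

// Hypothetical on-chip root state: a few table pointers per resource type.
struct RootTable {
    const DescriptorTable* textures[4] = {nullptr, nullptr, nullptr, nullptr};  // SR.T[0..3]
    std::size_t pointer_writes = 0;

    // Cheap path: repoint one slot at a table that already exists in memory.
    void set_texture_table(std::size_t slot, const DescriptorTable* table) {
        assert(slot < 4);
        textures[slot] = table;
        ++pointer_writes;
    }

    // Shader-side view of SR.T[slot][index]; a dynamic index gives
    // bindless-style access into a table of arbitrary size.
    Descriptor fetch(std::size_t slot, std::size_t index) const {
        assert(slot < 4 && textures[slot] != nullptr);
        return textures[slot]->at(index);
    }
};
```

Dividing by frequency then means, for example, putting per-frame textures in slot 0 and per-material textures in slot 1, so a material change is one pointer write and no table contents are touched.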
23. 23 | AMD Direct3D Futures | March 20th, 2014
KEY INNOVATIONS
Innovation | CPU-side win | GPU-side win
Command buffers
Build on many threads
Control of scheduling
Lower latency
Simplified state tracking
Pipeline state objects
Link at create time
No JIT shader compiles
Efficient batched updates
Cheaper state updates
Enables LTCG
Bind objects in groups | Cheap to change group | Cheap to change group; Fits hardware paradigm
Move work to Create | Predictability | Enables optimisations
24. 24 | AMD Direct3D Futures | March 20th, 2014
KEY INNOVATIONS
Innovation | CPU-side win | GPU-side win
Explicit Synchronisation
Efficiency
Required for bindless textures
Less overhead
Explicit Memory Management
Efficiency
Predictability
Application flexibility
Zero copy
Control over placement
Do less
Predictability, Efficiency
Enables aggressive schedule
FEWER BUGS
25. 25 | AMD Direct3D Futures | March 20th, 2014
NEW PROBLEMS
(AND TIPS TO SOLVE THEM)
26. 26 | AMD Direct3D Futures | March 20th, 2014
NEW VISIBLE LIMITS
More draws in does not automatically mean more triangles out
– You will not see full rendering rates with triangles averaging 1 pixel each.
– Wireframe mode should look different to filled rendering
27. 27 | AMD Direct3D Futures | March 20th, 2014
NEW VISIBLE LIMITS
Feeding the GPU much more efficiently means exploring interesting new limits that weren't visible before
10k/frame of anything is ~1µs per thing.
GPU pipeline depth is likely to be 1-10µs (1k-10k cycles).
Specific limit: context registers
– Shader tables are NOT in the context
– Compute doesn't bottleneck on context
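The per-item budget quoted above is plain division, and it is worth checking against a real frame rate. The helper names below are made up for illustration; the 100 fps and 60 fps figures and the 1 GHz clock are assumptions (the slide's "1-10µs (1k-10k cycles)" implies a clock of roughly 1 GHz).

```cpp
#include <cassert>
#include <cmath>

// Time available per item, in microseconds, if `items_per_frame` items
// must all fit inside one frame at `frames_per_second`.
double microseconds_per_item(double frames_per_second, double items_per_frame) {
    assert(frames_per_second > 0.0 && items_per_frame > 0.0);
    const double frame_us = 1.0e6 / frames_per_second;
    return frame_us / items_per_frame;
}

// Convert a duration in microseconds to clock cycles at `clock_ghz`.
double cycles(double microseconds, double clock_ghz) {
    return microseconds * clock_ghz * 1000.0;
}
```

At 100 frames per second, 10k items get exactly 1µs each; at 60 frames per second the budget is about 1.67µs. Either way the per-item budget is comparable to the GPU pipeline depth, so anything that drains the pipeline once per item can dominate the frame.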
28. 28 | AMD Direct3D Futures | March 20th, 2014
APPLICATION IN CHARGE
Application is arbiter of correct rendering
– This is a serious responsibility
– The benefits of D3D12 aren't readily available without this condition
Applications must be warning-free on the debug layer
Different opportunities for driver intervention
29. 29 | AMD Direct3D Futures | March 20th, 2014
APPLICATION IN CHARGE
No driver thread in play
– App can target much lower latency
– BUT implies app has to be ready with new GPU work
[Figure: App Render, Driver and GPU timelines over frames 1 to 3. D3D11: first work sent to driver, driver buffers Present; no dead GPU time after 1st frame (but extra latency), no future dead time. No buffered Present reveals dead time on GPU]
30. 30 | AMD Direct3D Futures | March 20th, 2014
USE COMMAND BUFFERS SPARINGLY
Each API command list maps to a single hardware command buffer
Starting / ending a command list has an overhead
– Writes full 3D state, may flush caches or idle GPU
We think a good rule of thumb will be to target around 100 command buffers/frame
– Use the multiple submission API where possible
[Figure: multiple applications running on system. Application 0 queue (CB0, CB1, CB2) and Application 1 queue (CB0) interleave as CB0, CB1, CB0, CB2 when the GPU executes]
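A simple way to respect the "around 100 command buffers/frame" rule of thumb is to decide the partition before recording starts. The function below is an illustrative sketch; `plan_command_lists` and its default of 100 are assumptions taken from the rule of thumb, not an API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Split `draw_count` draws into at most `target_lists` contiguous ranges of
// near-equal size, one range per command list. Returns the size of each range.
std::vector<std::size_t> plan_command_lists(std::size_t draw_count,
                                            std::size_t target_lists = 100) {
    assert(target_lists > 0);
    const std::size_t lists = std::min(draw_count, target_lists);
    std::vector<std::size_t> sizes;
    sizes.reserve(lists);
    for (std::size_t i = 0; i < lists; ++i) {
        // Spread the remainder over the first (draw_count % lists) ranges.
        sizes.push_back(draw_count / lists + (i < draw_count % lists ? 1 : 0));
    }
    return sizes;
}
```

Each range can then be recorded on whichever thread is free, and the resulting lists handed over in as few submissions as possible, so the per-list start and end overhead is paid about 100 times per frame rather than once per draw.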
32. 32 | AMD Direct3D Futures | March 20th, 2014
ALL-NEW
There's a learning curve here for all of us
In the main it's a shallow one
– Compared at least to the general problem of multithreaded rendering
Multithreading is always hard.
– Simpler design means fewer bugs and more predictable performance
33. 33 | AMD Direct3D Futures | March 20th, 2014
WHAT AMD PLAN TO DELIVER
An early preview driver “soon”
Release driver for Direct3D12 launch
Continuous engagement
– With Microsoft
– With ISVs
Bring your opinions to us and to Microsoft.
34. 34 | AMD Direct3D Futures | March 20th, 2014
DX12 AND FROSTBITE
Johan Andersson
Technical Director
35. 35 | AMD Direct3D Futures | March 20th, 2014
DX12 AND FROSTBITE
PC is very important for EA and we've been pushing hard to improve graphics capabilities on Windows
Excited to be working with Microsoft and the IHVs on Direct3D again!
Good & very healthy collaboration between Microsoft, the IHVs and us game/engine developers
DX12 is a really big step forward from DX11 or GL4
36. 36 | AMD Direct3D Futures | March 20th, 2014
DX12 FEATURES AND FROSTBITE
Key DX12 features that are a great fit for Frostbite:
– Efficient parallel command buffers
– Descriptor tables
– Pipeline objects
– Explicit resource synchronization
– Explicit memory management
DX12 is still in development, so we are actively working with Microsoft & the IHVs to help make sure all of it fits together and is efficient
37. 37 | AMD Direct3D Futures | March 20th, 2014
DX12 PLATFORMS
DX12 support on Windows 7 & most existing PC hardware is critical for us
– Huge user base still on Windows 7
– Gamers would see major benefits without upgrading
DX12 support on Xbox One is critical for us
– Will lead to improved performance & quality for future Xbox One titles
– Almost all of our games are cross platform Gen4/PC
– Easier development – renderer is shared between Windows & Xbox One
Looking forward to DX12 on mobile/tablets
– Power efficiency & low overhead is really key
– Need larger user base to target on Windows for mobile
38. 38 | AMD Direct3D Futures | March 20th, 2014
DX12 AND FROSTBITE
We are building a DX12 renderer for Frostbite!
– Will work on GPUs from all vendors – benefits a wide set of gamers
Expected benefits over DX11:
– More stable and consistent performance
– Higher overall performance
– Move our design target – richer & more detailed game worlds
– Thinner drivers – easier to work with / less of a black box
– More control for us developers – new techniques & optimizations
Really happy that the full Windows & Xbox ecosystems are moving to a low-level graphics API!