"Programming Techniques for Implementing Inference Software Efficiently," a Presentation from Codeplay Software

This presentation discusses considerations for building embedded deep learning systems and running neural networks on accelerators. It covers choosing hardware that provides the acceleration needed for performance, standards such as OpenCL and SYCL for programming accelerators, and Codeplay's layered approach built on widely adopted standards, which makes the AI acceleration stack easier to adapt. It also addresses kernel fusion techniques and the performance impact of algorithms and of component-wise and reduction operations.

Building embedded deep learning systems
How do you choose?
• Hardware acceleration is usually needed to get performance
• It’s a fast-moving field so you may want rapid time-to-market

Who am I?
CEO of Codeplay
• Edinburgh, Scotland-based pioneer in GPU acceleration
• Chaired the SYCL and HSA Software specifications
• We build GPU compilers for semiconductor companies
• Now working to make AI acceleration safe for automotive

Impact of algorithms on performance (inference)

Impact of algorithms on performance (training)

Component-wise & reduction operations performance
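
For context, a component-wise operation computes each output element from the matching input elements alone, while a reduction combines many elements into a single value. The sketch below is a hypothetical illustration in plain Python, not code from the presentation; the function names are assumptions, and the pairwise combining order only mimics how parallel hardware commonly merges partial results.

```python
def component_wise_add(a, b):
    """Component-wise operation: output[i] depends only on a[i] and b[i]."""
    return [x + y for x, y in zip(a, b)]


def tree_reduce_sum(values):
    """Reduction: combine all elements into one value, pairing neighbours level by level."""
    level = list(values)
    if not level:
        return 0
    while len(level) > 1:
        # Each pair could be summed independently, so one level maps to one parallel step.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An unpaired trailing element is carried to the next level unchanged.
            paired.append(level[-1])
        level = paired
    return level[0]
```

Every element of the component-wise result can be computed independently, whereas the reduction needs roughly log2(n) dependent steps, which is one reason the two classes of operation tend to be measured separately.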

For Codeplay, these are our layer choices
We have chosen a layer of standards, based on current market adoption
The actual choice of standards may change based on market dynamics; but, by choosing widely adopted standards and a layering approach, it is easy to adapt

• OpenCL https://www.khronos.org/opencl/
• OpenVX https://www.khronos.org/openvx/
• HSA http://www.hsafoundation.com/
• NNEF https://www.khronos.org/nnef
• SYCL http://sycl.tech
• OpenCV http://opencv.org/
• Halide http://halide-lang.org/
• VisionCpp https://github.com/codeplaysoftware/visioncpp
• SYCL-BLAS https://github.com/codeplaysoftware/sycl-blas
• TensorFlow-SYCL https://github.com/codeplaysoftware/tensorflow
• Eigen http://eigen.tuxfamily.org

5. Performance trends

6. Questions to ask

7. Running a network on an accelerator

14. Kernel fusion in graph programming
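
As a rough illustration of the idea named in this slide title, kernel fusion merges consecutive operations in a graph into a single pass so that intermediate results never have to be written out and read back. The sketch below is a minimal, hypothetical example in plain Python, not code from the presentation; the `scale` and `relu` operations and all function names are assumptions chosen for illustration.

```python
def scale(xs, factor):
    """Kernel 1: multiply every element by a constant."""
    return [x * factor for x in xs]


def relu(xs):
    """Kernel 2: clamp negative elements to zero."""
    return [max(x, 0.0) for x in xs]


def scale_relu_unfused(xs, factor):
    # Two separate passes. The intermediate list stands in for a buffer
    # that an accelerator would write to memory and then read back.
    intermediate = scale(xs, factor)
    return relu(intermediate)


def scale_relu_fused(xs, factor):
    # One pass. The scaled value lives only in a temporary, so no
    # intermediate buffer is materialised between the two operations.
    return [max(x * factor, 0.0) for x in xs]
```

Both versions return the same values; the fused form trades two traversals and one intermediate buffer for a single traversal, which is typically where the saving comes from on bandwidth-limited devices.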

22. Matrix multiply performance
