SlideShare a Scribd company logo
1 of 33
Boosting HTML Apps with WebCL

Janakiram Raghumandala
December 10, 2013

DevCon 2013: Dec 9th–11th 2013,
Monte Carlo
What is WebCL?
• WebCL is an open API to achieve heterogeneous parallel
computing in HTML5 web browsers.
• It is a JavaScript binding to OpenCL
• Generally used with Canvas element, SIMD computing,
particle simulation
• WebCL enables
of computeintensive web applications.
Source Name Placement
2

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

3

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

4

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation
• New mobile apps with high-computing demands
– Augmented Reality
– Video Processing
– 3D Games – lots of calculations and physics
– Computational photography

5

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation…
• GPU delivers
improved FLOPS
• 100 times faster

6

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation…
• Higher FLOPS per WATT
•
•
•
•
•

RED – CPUs
ORANGE – APUs
Light BLUE – ARM
GREEN – Grid Processors
YELLO – GPUs

• Top-left is the preferred zone

7

| © 2013 Cisco and/or its affiliates. All rights reserved.
Webcentric Applications

Webcentric Applications
8

| © 2013 Cisco and/or its affiliates. All rights reserved.
Motivation… growing web centric applications

9

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

10

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL
•
•
•
•

11

C-based cross-platform programming interface
Kernels
Run-time or build-time compilation
Rich set of built-in functions, cross, dot, sin, cos, log …

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL…
• Host contains one or
more OpenCL Devices
(aka Compute Devices)
• A Computer Device
contains one or more
Compute Units ~ Cores

12

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL…
• A Compute Unit
contains one or more
processing elements
~ Intel’s hyper threads
• Processing elements
execute code as SIMD

13

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Work Items & Work Groups

14

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Execution Model
• Kernel
– Basic unit of executable code, similar to a C function
• Program
– Collection of kernels and functions called by kernels
– Analogous to a dynamic library (run-time linking)
• Command Queue
– Applications queue kernels and data transfers
– Performed in-order or out-of-order
• Work-item ~ Processing Element ~ Thread
– An execution of a kernel by a processing element (~ thread)
• Work-group ~ Compute Unit ~ Core
– A collection of related work-items that execute on a single compute unit (~ core)

15

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Execution Model
• Kernel
– Basic program element, similar to a C function
• Program
– Collection of kernels and functions called by kernels
– Analogous to a dynamic library (run-time linking)
• Command Queue
– Applications queue kernels and data transfers
– Performed in-order or out-of-order
• Work-item ~ Processing Element ~ Thread
– An execution of a kernel by a processing element (~ thread)
• Work-group ~ Compute Unit ~ Core
– A collection of related work-items that execute on a single compute unit (~ core)

16

| © 2013 Cisco and/or its affiliates. All rights reserved.
OpenCL… Memory Model
•
•
•
•

17

| © 2013 Cisco and/or its affiliates. All rights reserved.

Private Memory per PE
Local Memory per Compute Unit
Global Memory per Device
Host Memory per Host System
OpenCL:Kernels
OpenCL: Kernels
• OpenCL kernel
– Defined on an N-dimensional computation domain
– Execute a kernel at each point of the computation domain
Traditional Loop
void vectorMult(
const float* a,
const float* b,
float* c,
const unsigned int count)
{
for(int i=0; i<count; i++)
c[i] = a[i] * b[i];
}

18

Data Parallel OpenCL Kernel
kernel void vectorMult(
global const float* a,
global const float* b,
global float* c)

| © 2013 Cisco and/or its affiliates. All rights reserved.

{
int id = get_global_id(0);
c[id] = a[id] * b[id];
}
OpenCL:Execution of Programs - Steps
1. Query host for OpenCL devices.
2. Create a context to associate OpenCL devices.
3. Create programs for execution on one or more
associated devices.
4. From the programs, select kernels to execute.
5. Create memory objects accessible from the host
and/or the device.
6. Copy memory data to the device as needed.
7. Provide kernels to the command queue for
execution.
8. Copy results from the device to the host.

19

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

20

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL
• WebCL concepts were deliberately made similar to the
OpenCL model
– Programs
– Kernels
– Linking
– Memory Transfers

21

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Programming
•
•
•
•

Initialization
Create kernel
Run kernel
WebCL image object creation
– From Uint8Array
– From <img>, <canvas>, or <video>
– From WebGL vertex buffer
– From WebGL texture
• WebGL vertex animation
• WebGL texture animation
22

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Initialization
<script>
 var platforms = WebCL.getPlatforms();
var devices = platforms[0].getDevices(WebCL.DEVICE_TYPE_GPU);
var context = WebCL.createContext({ WebCLDevice: devices[0] } );
var queue = context.createCommandQueue();
</script>

23

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Creating a kernel
<script id="squareProgram" type="x-kernel">
__kernel square(
__global float* input,
__global float* output,
const unsigned int count)
{
int i = get_global_id(0);
if(i < count)
output[i] = input[i] * input[i];
}
</script>
<script>
 var programSource = getProgramSource("squareProgram");
var program = context.createProgram(programSource);
program.build();
var kernel = program.createKernel("square");
</script>
24

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Running WebCL Kernes
<script> ...

var inputBuf context.createBuffer(WebCL.MEM_READ_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
var outputBuf =
context.createBuffer(WebCL.MEM_WRITE_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
var data = new Float32Array(count); // populate data ...
queue.enqueueWriteBuffer(inputBuf, data, true); //Last arg indicates blocking API
kernel.setKernelArg(0, inputBuf);

kernel.setKernelArg(1, outputBuf);
kernel.setKernelArg(2, count, WebCL.KERNEL_ARG_INT);
var workGroupSize = kernel.getWorkGroupInfo(devices[0], WebCL.KERNEL_WORK_GROUP_SIZE);
queue.enqueueNDRangeKernel(kernel, [count], [workGroupSize]);
queue.finish(); //This API blocks
queue.enqueueReadBuffer(outputBuf, data, true); //Last arg indicates blocking API

</script>
25

| © 2013 Cisco and/or its affiliates. All rights reserved.
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

26

| © 2013 Cisco and/or its affiliates. All rights reserved.
DevCon 2013: Dec 9th–11th 2013,
Monte Carlo
Contents
– Motivation
– OpenCL, The Underlying Concept
– WebCL
– Performance Comparison & Demos
– Summary, current status and References

28

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Summary
JavaScript binding for OpenCL
Provides high performance parallel processing on multicore CPU & GPGPU
– Portable and efficient access to heterogeneous multicore devices
– Provides a single coherent standard across desktop and mobile devices
WebCL HW and SW requirements
– Need a modified browser with WebCL support
– Hardware, driver, and runtime support for OpenCL
WebCL stays close to the OpenCL standard
– Preserves developer familiarity and facilitates adoption
– Allows developers to translate their OpenCL knowledge to the web environment
– Easier to keep OpenCL and WebCL in sync, as the two evolve
Intended to be an interface above OpenCL
Facilitates layering of higher level abstractions on top of WebCL API
29

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL:Goals
• Compute intensive web applications
– High Performance
– Platform independent – Standards-compliant
– Ease of application development

30

| © 2013 Cisco and/or its affiliates. All rights reserved.
WebCL Standardization – Current Status
• Work in progress
• Working-draft is available

31

| © 2013 Cisco and/or its affiliates. All rights reserved.
References
•
•
•
•

32

OpenCL 2.0 Tutorial
https://cvs.khronos.org/svn/repos/registry/trunk/public/webcl/spec/latest/index.html - WebCL Working draft
http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf- OpenCL 1.1 Specification
http://streamcomputing.eu/blog/2012-08-27/processors-that-can-do-20-gflops-watt/

| © 2013 Cisco and/or its affiliates. All rights reserved.
Thank you.

More Related Content

What's hot

Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...Vadym Kazulkin
 
The Eclipse Transformer Project
The Eclipse Transformer Project The Eclipse Transformer Project
The Eclipse Transformer Project Jakarta_EE
 
IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018Trayan Iliev
 
JavaOne 2015: 12 Factor App
JavaOne 2015: 12 Factor AppJavaOne 2015: 12 Factor App
JavaOne 2015: 12 Factor AppJoe Kutner
 
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...VMware Tanzu
 
Shorten All URLs
Shorten All URLsShorten All URLs
Shorten All URLsJakarta_EE
 
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...Christoforos Kachris
 
MySQL 8 High Availability with InnoDB Clusters
MySQL 8 High Availability with InnoDB ClustersMySQL 8 High Availability with InnoDB Clusters
MySQL 8 High Availability with InnoDB ClustersMiguel Araújo
 
Polygot Java EE on the GraalVM
Polygot Java EE on the GraalVMPolygot Java EE on the GraalVM
Polygot Java EE on the GraalVMRyan Cuprak
 
Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)Ryan Cuprak
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?DataWorks Summit
 
DEVNET-1152 OpenDaylight YANG Model Overview and Tools
DEVNET-1152	OpenDaylight YANG Model Overview and ToolsDEVNET-1152	OpenDaylight YANG Model Overview and Tools
DEVNET-1152 OpenDaylight YANG Model Overview and ToolsCisco DevNet
 
Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...mfrancis
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLNordic APIs
 
ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)Logico
 
MuleSoft Meetup Warsaw Group DataWeave 2.0
MuleSoft Meetup Warsaw Group DataWeave 2.0MuleSoft Meetup Warsaw Group DataWeave 2.0
MuleSoft Meetup Warsaw Group DataWeave 2.0Patryk Bandurski
 
Building a Brain with Raspberry Pi and Zulu Embedded JVM
Building a Brain with Raspberry Pi and Zulu Embedded JVMBuilding a Brain with Raspberry Pi and Zulu Embedded JVM
Building a Brain with Raspberry Pi and Zulu Embedded JVMSimon Ritter
 
Calling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBayCalling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBayTony Ng
 

What's hot (19)

Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
Projects Valhalla, Loom and GraalVM at virtual JavaFest 2020 in Kiev, Ukraine...
 
The Eclipse Transformer Project
The Eclipse Transformer Project The Eclipse Transformer Project
The Eclipse Transformer Project
 
IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018IPT Reactive Java IoT Demo - BGOUG 2018
IPT Reactive Java IoT Demo - BGOUG 2018
 
JavaOne 2015: 12 Factor App
JavaOne 2015: 12 Factor AppJavaOne 2015: 12 Factor App
JavaOne 2015: 12 Factor App
 
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
The Making of the Oracle R2DBC Driver and How to Take Your Code from Synchron...
 
Shorten All URLs
Shorten All URLsShorten All URLs
Shorten All URLs
 
Java 9 and Project Jigsaw
Java 9 and Project JigsawJava 9 and Project Jigsaw
Java 9 and Project Jigsaw
 
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
Instant scaling and deployment of Vitis Libraries on Alveo clusters using InA...
 
MySQL 8 High Availability with InnoDB Clusters
MySQL 8 High Availability with InnoDB ClustersMySQL 8 High Availability with InnoDB Clusters
MySQL 8 High Availability with InnoDB Clusters
 
Polygot Java EE on the GraalVM
Polygot Java EE on the GraalVMPolygot Java EE on the GraalVM
Polygot Java EE on the GraalVM
 
Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)Exploring Java Heap Dumps (Oracle Code One 2018)
Exploring Java Heap Dumps (Oracle Code One 2018)
 
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
 
DEVNET-1152 OpenDaylight YANG Model Overview and Tools
DEVNET-1152	OpenDaylight YANG Model Overview and ToolsDEVNET-1152	OpenDaylight YANG Model Overview and Tools
DEVNET-1152 OpenDaylight YANG Model Overview and Tools
 
Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...Native OSGi, Modular Software Development in a Native World - Alexander Broek...
Native OSGi, Modular Software Development in a Native World - Alexander Broek...
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
 
ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)ADBA (Asynchronous Database Access)
ADBA (Asynchronous Database Access)
 
MuleSoft Meetup Warsaw Group DataWeave 2.0
MuleSoft Meetup Warsaw Group DataWeave 2.0MuleSoft Meetup Warsaw Group DataWeave 2.0
MuleSoft Meetup Warsaw Group DataWeave 2.0
 
Building a Brain with Raspberry Pi and Zulu Embedded JVM
Building a Brain with Raspberry Pi and Zulu Embedded JVMBuilding a Brain with Raspberry Pi and Zulu Embedded JVM
Building a Brain with Raspberry Pi and Zulu Embedded JVM
 
Calling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBayCalling All Modularity Solutions: A Comparative Study from eBay
Calling All Modularity Solutions: A Comparative Study from eBay
 

Similar to Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL

WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...AMD Developer Central
 
Database As A Service: OEM + ODA (OOW 15 Presentation)
Database As A Service: OEM + ODA (OOW 15 Presentation)Database As A Service: OEM + ODA (OOW 15 Presentation)
Database As A Service: OEM + ODA (OOW 15 Presentation)Bobby Curtis
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAMD Developer Central
 
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...chiportal
 
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowKostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowIT Arena
 
TECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialTECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialOpenNebula Project
 
OpenStack for devops environment
OpenStack for devops environment OpenStack for devops environment
OpenStack for devops environment Orgad Kimchi
 
Show and Tell: Building Applications on Cisco Open SDN Controller
Show and Tell: Building Applications on Cisco Open SDN Controller Show and Tell: Building Applications on Cisco Open SDN Controller
Show and Tell: Building Applications on Cisco Open SDN Controller Cisco DevNet
 
Automatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmapAutomatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmapManolis Vavalis
 
4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)NAIM Networks, Inc.
 
Red Hat Java Update and Quarkus Introduction
Red Hat Java Update and Quarkus IntroductionRed Hat Java Update and Quarkus Introduction
Red Hat Java Update and Quarkus IntroductionJohn Archer
 
Development with JavaFX 9 in JDK 9.0.1
Development with JavaFX 9 in JDK 9.0.1Development with JavaFX 9 in JDK 9.0.1
Development with JavaFX 9 in JDK 9.0.1Wolfgang Weigend
 
Current & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylightCurrent & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylightabhijit2511
 
DEVNET-1006 Getting Started with OpenDayLight
DEVNET-1006	Getting Started with OpenDayLightDEVNET-1006	Getting Started with OpenDayLight
DEVNET-1006 Getting Started with OpenDayLightCisco DevNet
 
Cisco Cloupia uic product overview and demo presentation
Cisco Cloupia uic product overview and demo presentationCisco Cloupia uic product overview and demo presentation
Cisco Cloupia uic product overview and demo presentationxKinAnx
 
Warsaw MuleSoft Meetup - Runtime Fabric
Warsaw MuleSoft Meetup - Runtime FabricWarsaw MuleSoft Meetup - Runtime Fabric
Warsaw MuleSoft Meetup - Runtime FabricPatryk Bandurski
 
Akka.Net and .Net Core - The Developer Conference 2018 Florianopolis
Akka.Net and .Net Core - The Developer Conference 2018 FlorianopolisAkka.Net and .Net Core - The Developer Conference 2018 Florianopolis
Akka.Net and .Net Core - The Developer Conference 2018 FlorianopolisAlexandre Brandão Lustosa
 
Speeding up Programs with OpenACC in GCC
Speeding up Programs with OpenACC in GCCSpeeding up Programs with OpenACC in GCC
Speeding up Programs with OpenACC in GCCinside-BigData.com
 

Similar to Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL (20)

WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
 
Database As A Service: OEM + ODA (OOW 15 Presentation)
Database As A Service: OEM + ODA (OOW 15 Presentation)Database As A Service: OEM + ODA (OOW 15 Presentation)
Database As A Service: OEM + ODA (OOW 15 Presentation)
 
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware WebinarAn Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
An Introduction to OpenCL™ Programming with AMD GPUs - AMD & Acceleware Webinar
 
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
TRACK F: OpenCL for ALTERA FPGAs, Accelerating performance and design product...
 
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and KubeflowKostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
Kostiantyn Bokhan, N-iX. CD4ML based on Azure and Kubeflow
 
TECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula TutorialTECNIRIS@: OpenNebula Tutorial
TECNIRIS@: OpenNebula Tutorial
 
OpenStack for devops environment
OpenStack for devops environment OpenStack for devops environment
OpenStack for devops environment
 
Show and Tell: Building Applications on Cisco Open SDN Controller
Show and Tell: Building Applications on Cisco Open SDN Controller Show and Tell: Building Applications on Cisco Open SDN Controller
Show and Tell: Building Applications on Cisco Open SDN Controller
 
Automatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmapAutomatic generation of platform architectures using open cl and fpga roadmap
Automatic generation of platform architectures using open cl and fpga roadmap
 
4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)4th SDN Interest Group Seminar-Session 2-2(130313)
4th SDN Interest Group Seminar-Session 2-2(130313)
 
Red Hat Java Update and Quarkus Introduction
Red Hat Java Update and Quarkus IntroductionRed Hat Java Update and Quarkus Introduction
Red Hat Java Update and Quarkus Introduction
 
Development with JavaFX 9 in JDK 9.0.1
Development with JavaFX 9 in JDK 9.0.1Development with JavaFX 9 in JDK 9.0.1
Development with JavaFX 9 in JDK 9.0.1
 
Current & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylightCurrent & Future Use-Cases of OpenDaylight
Current & Future Use-Cases of OpenDaylight
 
DEVNET-1006 Getting Started with OpenDayLight
DEVNET-1006	Getting Started with OpenDayLightDEVNET-1006	Getting Started with OpenDayLight
DEVNET-1006 Getting Started with OpenDayLight
 
OpenStack with OpenDaylight
OpenStack with OpenDaylightOpenStack with OpenDaylight
OpenStack with OpenDaylight
 
Cisco Cloupia uic product overview and demo presentation
Cisco Cloupia uic product overview and demo presentationCisco Cloupia uic product overview and demo presentation
Cisco Cloupia uic product overview and demo presentation
 
5 cisco open_stack
5 cisco open_stack5 cisco open_stack
5 cisco open_stack
 
Warsaw MuleSoft Meetup - Runtime Fabric
Warsaw MuleSoft Meetup - Runtime FabricWarsaw MuleSoft Meetup - Runtime Fabric
Warsaw MuleSoft Meetup - Runtime Fabric
 
Akka.Net and .Net Core - The Developer Conference 2018 Florianopolis
Akka.Net and .Net Core - The Developer Conference 2018 FlorianopolisAkka.Net and .Net Core - The Developer Conference 2018 Florianopolis
Akka.Net and .Net Core - The Developer Conference 2018 Florianopolis
 
Speeding up Programs with OpenACC in GCC
Speeding up Programs with OpenACC in GCCSpeeding up Programs with OpenACC in GCC
Speeding up Programs with OpenACC in GCC
 

Recently uploaded

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Recently uploaded (20)

SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL

  • 1. Boosting HTML Apps with WebCL Janakiram Raghumandala December 10, 2013 DevCon 2013: Dec 9th–11th 2013, Monte Carlo
  • 2. What is WebCL? • WebCL is an open API to achieve heterogeneous parallel computing in HTML5 web browsers. • It is a JavaScript binding to OpenCL • Generally used with Canvas element, SIMD computing, particle simulation • WebCL enables of computeintensive web applications. Source Name Placement 2 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 3. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 3 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 4. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 4 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 5. Motivation • New mobile apps with high-computing demands – Augmented Reality – Video Processing – 3D Games – lots of calculations and physics – Computational photography 5 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 6. Motivation… • GPU delivers improved FLOPS • 100 times faster 6 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 7. Motivation… • Higher FLOPS per WATT • • • • • RED – CPUs ORANGE – APUs Light BLUE – ARM GREEN – Grid Processors YELLO – GPUs • Top-left is the preferred zone 7 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 8. Webcentric Applications Webcentric Applications 8 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 9. Motivation… growing web centric applications 9 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 10. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 10 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 11. OpenCL • • • • 11 C-based cross-platform programming interface Kernels Run-time or build-time compilation Rich set of built-in functions, cross, dot, sin, cos, log … | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 12. OpenCL… • Host contains one or more OpenCL Devices (aka Compute Devices) • A Computer Device contains one or more Compute Units ~ Cores 12 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 13. OpenCL… • A Compute Unit contains one or more processing elements ~ Intel’s hyper threads • Processing elements execute code as SIMD 13 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 14. OpenCL… Work Items & Work Groups 14 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 15. OpenCL… Execution Model • Kernel – Basic unit of executable code, similar to a C function • Program – Collection of kernels and functions called by kernels – Analogous to a dynamic library (run-time linking) • Command Queue – Applications queue kernels and data transfers – Performed in-order or out-of-order • Work-item ~ Processing Element ~ Thread – An execution of a kernel by a processing element (~ thread) • Work-group ~ Compute Unit ~ Core – A collection of related work-items that execute on a single compute unit (~ core) 15 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 16. OpenCL… Execution Model • Kernel – Basic program element, similar to a C function • Program – Collection of kernels and functions called by kernels – Analogous to a dynamic library (run-time linking) • Command Queue – Applications queue kernels and data transfers – Performed in-order or out-of-order • Work-item ~ Processing Element ~ Thread – An execution of a kernel by a processing element (~ thread) • Work-group ~ Compute Unit ~ Core – A collection of related work-items that execute on a single compute unit (~ core) 16 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 17. OpenCL… Memory Model • • • • 17 | © 2013 Cisco and/or its affiliates. All rights reserved. Private Memory per PE Local Memory per Compute Unit Global Memory per Device Host Memory per Host System
  • 18. OpenCL:Kernels OpenCL: Kernels • OpenCL kernel – Defined on an N-dimensional computation domain – Execute a kernel at each point of the computation domain Traditional Loop void vectorMult( const float* a, const float* b, float* c, const unsigned int count) { for(int i=0; i<count; i++) c[i] = a[i] * b[i]; } 18 Data Parallel OpenCL Kernel kernel void vectorMult( global const float* a, global const float* b, global float* c) | © 2013 Cisco and/or its affiliates. All rights reserved. { int id = get_global_id(0); c[id] = a[id] * b[id]; }
  • 19. OpenCL:Execution of Programs - Steps 1. Query host for OpenCL devices. 2. Create a context to associate OpenCL devices. 3. Create programs for execution on one or more associated devices. 4. From the programs, select kernels to execute. 5. Create memory objects accessible from the host and/or the device. 6. Copy memory data to the device as needed. 7. Provide kernels to the command queue for execution. 8. Copy results from the device to the host. 19 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 20. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 20 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 21. WebCL • WebCL concepts were deliberately made similar to the OpenCL model – Programs – Kernels – Linking – Memory Transfers 21 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 22. WebCL:Programming • • • • Initialization Create kernel Run kernel WebCL image object creation – From Uint8Array – From <img>, <canvas>, or <video> – From WebGL vertex buffer – From WebGL texture • WebGL vertex animation • WebGL texture animation 22 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 23. WebCL:Initialization <script>
 var platforms = WebCL.getPlatforms(); var devices = platforms[0].getDevices(WebCL.DEVICE_TYPE_GPU); var context = WebCL.createContext({ WebCLDevice: devices[0] } ); var queue = context.createCommandQueue(); </script> 23 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 24. WebCL:Creating a kernel <script id="squareProgram" type="x-kernel"> __kernel square( __global float* input, __global float* output, const unsigned int count) { int i = get_global_id(0); if(i < count) output[i] = input[i] * input[i]; } </script> <script>
 var programSource = getProgramSource("squareProgram"); var program = context.createProgram(programSource); program.build(); var kernel = program.createKernel("square"); </script> 24 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 25. WebCL:Running WebCL Kernes <script> ... var inputBuf context.createBuffer(WebCL.MEM_READ_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
var outputBuf = context.createBuffer(WebCL.MEM_WRITE_ONLY, Float32Array.BYTES_PER_ELEMENT * count); var data = new Float32Array(count); // populate data ... queue.enqueueWriteBuffer(inputBuf, data, true); //Last arg indicates blocking API kernel.setKernelArg(0, inputBuf); kernel.setKernelArg(1, outputBuf); kernel.setKernelArg(2, count, WebCL.KERNEL_ARG_INT); var workGroupSize = kernel.getWorkGroupInfo(devices[0], WebCL.KERNEL_WORK_GROUP_SIZE); queue.enqueueNDRangeKernel(kernel, [count], [workGroupSize]); queue.finish(); //This API blocks queue.enqueueReadBuffer(outputBuf, data, true); //Last arg indicates blocking API </script> 25 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 26. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 26 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 27. DevCon 2013: Dec 9th–11th 2013, Monte Carlo
  • 28. Contents – Motivation – OpenCL, The Underlying Concept – WebCL – Performance Comparison & Demos – Summary, current status and References 28 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 29. WebCL:Summary JavaScript binding for OpenCL Provides high performance parallel processing on multicore CPU & GPGPU – Portable and efficient access to heterogeneous multicore devices – Provides a single coherent standard across desktop and mobile devices WebCL HW and SW requirements – Need a modified browser with WebCL support – Hardware, driver, and runtime support for OpenCL WebCL stays close to the OpenCL standard – Preserves developer familiarity and facilitates adoption – Allows developers to translate their OpenCL knowledge to the web environment – Easier to keep OpenCL and WebCL in sync, as the two evolve Intended to be an interface above OpenCL Facilitates layering of higher level abstractions on top of WebCL API 29 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 30. WebCL:Goals • Compute intensive web applications – High Performance – Platform independent – Standards-compliant – Ease of application development 30 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 31. WebCL Standardization – Current Status • Work in progress • Working-draft is available 31 | © 2013 Cisco and/or its affiliates. All rights reserved.
  • 32. References • • • • 32 OpenCL 2.0 Tutorial https://cvs.khronos.org/svn/repos/registry/trunk/public/webcl/spec/latest/index.html - WebCL Working draft http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf- OpenCL 1.1 Specification http://streamcomputing.eu/blog/2012-08-27/processors-that-can-do-20-gflops-watt/ | © 2013 Cisco and/or its affiliates. All rights reserved.

Editor's Notes

  1. Typically there will be hundres of Processing elements
  2. Square function is anOpenCL C based on C99 (ie., ISO/IEC 9899:1999)
  3. ----- Meeting Notes (29/11/13 16:11) -----Slow - - lively interactive, explain challenges,show that you are confident,