Boosting HTML Apps with WebCL

Janakiram Raghumandala
December 10, 2013

DevCon 2013: Dec 9th–11th 2013,
Monte Carlo

Boosting your HTML Apps – Overview of OpenCL and Hello World of WebCL

WebCL lets you boost the performance of selected HTML applications that involve heavy computation, for example fluid simulation, image manipulation, and video manipulation.
Since OpenCL is the underlying platform, it is introduced first, followed by WebCL.

Contents:
Motivation
Introduction to OpenCL
Introduction to WebCL
Hello World Program of WebCL

Published in: Technology
  • Typically there will be hundreds of processing elements
  • The square function is written in OpenCL C, which is based on C99 (i.e., ISO/IEC 9899:1999)
  • Meeting notes (29/11/13 16:11): slow; lively and interactive; explain challenges; show that you are confident

    1. Boosting HTML Apps with WebCL. Janakiram Raghumandala, December 10, 2013. DevCon 2013: Dec 9th–11th 2013, Monte Carlo.
    2. What is WebCL?
       • WebCL is an open API for heterogeneous parallel computing in HTML5 web browsers.
       • It is a JavaScript binding to OpenCL.
       • It is generally used with the Canvas element, SIMD computing, and particle simulation.
       • WebCL enables compute-intensive web applications.
       © 2013 Cisco and/or its affiliates. All rights reserved.
    3. Contents
       – Motivation
       – OpenCL, The Underlying Concept
       – WebCL
       – Performance Comparison & Demos
       – Summary, current status and References
    5. Motivation
       • New mobile apps with high computing demands
         – Augmented reality
         – Video processing
         – 3D games (lots of calculations and physics)
         – Computational photography
    6. Motivation…
       • GPUs deliver improved FLOPS
       • 100 times faster
    7. Motivation…
       • Higher FLOPS per watt
       • Chart legend: red = CPUs, orange = APUs, light blue = ARM, green = grid processors, yellow = GPUs
       • Top-left is the preferred zone
    8. Webcentric Applications
    9. Motivation… growing web-centric applications
    11. OpenCL
        • C-based cross-platform programming interface
        • Kernels
        • Run-time or build-time compilation
        • Rich set of built-in functions: cross, dot, sin, cos, log, …
    12. OpenCL…
        • A host contains one or more OpenCL devices (aka compute devices)
        • A compute device contains one or more compute units (~ cores)
    13. OpenCL…
        • A compute unit contains one or more processing elements (~ Intel’s hyper-threads)
        • Processing elements execute code as SIMD
    14. OpenCL… Work Items & Work Groups
    15. OpenCL… Execution Model
        • Kernel
          – Basic unit of executable code, similar to a C function
        • Program
          – Collection of kernels and functions called by kernels
          – Analogous to a dynamic library (run-time linking)
        • Command Queue
          – Applications queue kernels and data transfers
          – Performed in-order or out-of-order
        • Work-item ~ Processing Element ~ Thread
          – An execution of a kernel by a processing element (~ thread)
        • Work-group ~ Compute Unit ~ Core
          – A collection of related work-items that execute on a single compute unit (~ core)
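The relationship between work-items and work-groups can be made concrete with a little index arithmetic. The following is a minimal sketch in plain JavaScript, not WebCL code; the function name `decomposeGlobalId` and the launch sizes are made up for illustration. It assumes a one-dimensional launch with a zero global offset and a global size that is a multiple of the work-group size.

```javascript
// Sketch: how a 1-D global work-item ID splits into a work-group ID and a
// local ID, assuming a zero global offset and a global size that is a
// multiple of the work-group size.
function decomposeGlobalId(globalId, localSize) {
  return {
    groupId: Math.floor(globalId / localSize), // what get_group_id(0) would report
    localId: globalId % localSize              // what get_local_id(0) would report
  };
}

// Hypothetical launch: 8 work-items in work-groups of 4.
var layout = [];
for (var gid = 0; gid < 8; gid++) {
  layout.push(decomposeGlobalId(gid, 4));
}
```

With these numbers, work-items 0 to 3 fall into work-group 0 and work-items 4 to 7 into work-group 1, each group running on a single compute unit.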
    17. OpenCL… Memory Model
        • Private memory per processing element (PE)
        • Local memory per compute unit
        • Global memory per device
        • Host memory per host system
    18. OpenCL: Kernels
        • OpenCL kernel
          – Defined on an N-dimensional computation domain
          – Executes at each point of the computation domain

        Traditional loop:

            void vectorMult(const float* a,
                            const float* b,
                            float* c,
                            const unsigned int count)
            {
                /* unsigned index avoids a signed/unsigned comparison with count */
                for (unsigned int i = 0; i < count; i++)
                    c[i] = a[i] * b[i];
            }

        Data-parallel OpenCL kernel:

            kernel void vectorMult(global const float* a,
                                   global const float* b,
                                   global float* c)
            {
                /* each work-item handles exactly one element */
                int id = get_global_id(0);
                c[id] = a[id] * b[id];
            }
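To see how the kernel form drops the loop, the launch itself can be imitated in plain JavaScript. This is only a sketch of the indexing, not of real parallel execution; `runNDRange1D` is a hypothetical helper introduced here and is not part of OpenCL or WebCL.

```javascript
// Plain-JavaScript stand-in for an OpenCL NDRange launch: call the kernel once
// per point of a 1-D domain. A real device runs these calls in parallel; this
// loop only mimics the indexing.
function runNDRange1D(kernel, globalSize) {
  for (var id = 0; id < globalSize; id++) {
    kernel(id); // id plays the role of get_global_id(0)
  }
}

var a = new Float32Array([1, 2, 3, 4]);
var b = new Float32Array([10, 20, 30, 40]);
var c = new Float32Array(a.length);

// Counterpart of the vectorMult kernel: no loop, one element per invocation.
function vectorMult(id) {
  c[id] = a[id] * b[id];
}

runNDRange1D(vectorMult, c.length);
// c now holds [10, 40, 90, 160]
```

The loop lives in the runtime (here, `runNDRange1D`) rather than in the kernel, which is what lets a device distribute the invocations across its processing elements.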
    19. OpenCL: Execution of Programs - Steps
        1. Query host for OpenCL devices.
        2. Create a context to associate OpenCL devices.
        3. Create programs for execution on one or more associated devices.
        4. From the programs, select kernels to execute.
        5. Create memory objects accessible from the host and/or the device.
        6. Copy memory data to the device as needed.
        7. Provide kernels to the command queue for execution.
        8. Copy results from the device to the host.
    21. WebCL
        • WebCL concepts were deliberately made similar to the OpenCL model
          – Programs
          – Kernels
          – Linking
          – Memory transfers
    22. WebCL: Programming
        • Initialization
        • Create kernel
        • Run kernel
        • WebCL image object creation
          – From Uint8Array
          – From <img>, <canvas>, or <video>
          – From WebGL vertex buffer
          – From WebGL texture
        • WebGL vertex animation
        • WebGL texture animation
    23. WebCL: Initialization
        <script>
          var platforms = WebCL.getPlatforms();
          var devices = platforms[0].getDevices(WebCL.DEVICE_TYPE_GPU);
          var context = WebCL.createContext({ WebCLDevice: devices[0] });
          var queue = context.createCommandQueue();
        </script>
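Because WebCL needs a browser with WebCL support, a page would normally check for the API before running the initialization above. A minimal sketch follows; `findWebCL` is a hypothetical helper, and both property names it probes are assumptions: the slides use `WebCL`, while later drafts of the specification appear to expose a lowercase `webcl` object.

```javascript
// Sketch: look for a WebCL entry point on a global object. Both property names
// are assumptions: the slides use `WebCL`, while later drafts of the
// specification expose a lowercase `webcl` object.
function findWebCL(globalObject) {
  if (globalObject && typeof globalObject.WebCL !== "undefined") {
    return globalObject.WebCL;
  }
  if (globalObject && typeof globalObject.webcl !== "undefined") {
    return globalObject.webcl;
  }
  return null; // no WebCL: the caller should fall back to plain JavaScript
}

// In a browser one would call findWebCL(window).
var cl = findWebCL(typeof window !== "undefined" ? window : {});
var useWebCL = cl !== null;
```

When `useWebCL` is false, the application can keep working with an ordinary JavaScript code path instead of failing on the first `getPlatforms()` call.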
    24. WebCL: Creating a kernel
        <script id="squareProgram" type="x-kernel">
          __kernel void square(
              __global float* input,
              __global float* output,
              const unsigned int count)
          {
              int i = get_global_id(0);
              if (i < count)
                  output[i] = input[i] * input[i];
          }
        </script>
        <script>
          // getProgramSource() is a page helper that is not shown on the slide;
          // presumably it returns the text of the squareProgram script element.
          var programSource = getProgramSource("squareProgram");
          var program = context.createProgram(programSource);
          program.build();
          var kernel = program.createKernel("square");
        </script>
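The slide calls `getProgramSource("squareProgram")` without showing it. One plausible way to write such a helper is sketched below; this is an assumption about what the helper does, not the author's code. The extra `doc` parameter is an addition so the function can be tried outside a browser.

```javascript
// Hypothetical implementation of the getProgramSource() helper used on the
// slide. The `doc` parameter is an addition so the function can be exercised
// without a browser; in a page it falls back to the global `document`.
function getProgramSource(id, doc) {
  var root = doc || document;
  var element = root.getElementById(id);
  if (!element) {
    throw new Error("No element with id '" + id + "'");
  }
  // Browsers do not execute a <script> whose type they do not recognize
  // (such as "x-kernel"), so its content is available as plain text.
  return element.textContent;
}
```

Keeping the kernel in a `<script>` element with a non-JavaScript type is a convenient way to embed OpenCL C in a page without building the source out of string literals.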
    25. WebCL: Running WebCL Kernels
        <script>
          ...
          var inputBuf = context.createBuffer(WebCL.MEM_READ_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
          var outputBuf = context.createBuffer(WebCL.MEM_WRITE_ONLY, Float32Array.BYTES_PER_ELEMENT * count);
          var data = new Float32Array(count);
          // populate data ...
          queue.enqueueWriteBuffer(inputBuf, data, true); // last arg indicates blocking API
          kernel.setKernelArg(0, inputBuf);
          kernel.setKernelArg(1, outputBuf);
          kernel.setKernelArg(2, count, WebCL.KERNEL_ARG_INT);
          var workGroupSize = kernel.getWorkGroupInfo(devices[0], WebCL.KERNEL_WORK_GROUP_SIZE);
          queue.enqueueNDRangeKernel(kernel, [count], [workGroupSize]);
          queue.finish(); // this API blocks
          queue.enqueueReadBuffer(outputBuf, data, true); // last arg indicates blocking API
        </script>
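Two small plain-JavaScript helpers can make the run step above more robust. This is a sketch under stated assumptions: `roundUpToMultiple` and `squareOnCpu` are hypothetical names, and the sizes are invented. It assumes WebCL inherits the OpenCL 1.x rule that the global size must be evenly divisible by the local size, which is the situation the `if (i < count)` guard in the square kernel is written for.

```javascript
// Round the global size up to the next multiple of the work-group size.
// OpenCL 1.x rejects an NDRange whose global size is not evenly divisible by
// the local size, so `count` itself is only safe when it already is a multiple.
function roundUpToMultiple(count, workGroupSize) {
  return Math.ceil(count / workGroupSize) * workGroupSize;
}

// CPU reference for the square kernel, useful for checking device results.
function squareOnCpu(input) {
  var output = new Float32Array(input.length);
  for (var i = 0; i < input.length; i++) {
    output[i] = input[i] * input[i];
  }
  return output;
}

// Hypothetical numbers: 1000 elements with a work-group size of 256.
var globalSize = roundUpToMultiple(1000, 256); // 1024
var reference = squareOnCpu(new Float32Array([1, 2, 3])); // [1, 4, 9]
```

The rounded value would be passed as the global size in place of `[count]`; the 24 surplus work-items then fail the `i < count` test in the kernel and write nothing. Comparing the buffer read back from the device with `squareOnCpu` output is a cheap correctness check before measuring performance.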
    29. WebCL: Summary
        • JavaScript binding for OpenCL
        • Provides high-performance parallel processing on multicore CPU & GPGPU
          – Portable and efficient access to heterogeneous multicore devices
          – Provides a single coherent standard across desktop and mobile devices
        • WebCL HW and SW requirements
          – Need a modified browser with WebCL support
          – Hardware, driver, and runtime support for OpenCL
        • WebCL stays close to the OpenCL standard
          – Preserves developer familiarity and facilitates adoption
          – Allows developers to translate their OpenCL knowledge to the web environment
          – Easier to keep OpenCL and WebCL in sync, as the two evolve
        • Intended to be an interface above OpenCL
        • Facilitates layering of higher-level abstractions on top of the WebCL API
    30. WebCL: Goals
        • Compute-intensive web applications
          – High performance
          – Platform independent
          – Standards-compliant
          – Ease of application development
    31. WebCL Standardization – Current Status
        • Work in progress
        • Working draft is available
    32. References
        • OpenCL 2.0 Tutorial
        • WebCL working draft: https://cvs.khronos.org/svn/repos/registry/trunk/public/webcl/spec/latest/index.html
        • OpenCL 1.1 Specification: http://www.khronos.org/registry/cl/specs/opencl-1.1.pdf
        • http://streamcomputing.eu/blog/2012-08-27/processors-that-can-do-20-gflops-watt/
    33. Thank you.
