
Graphics Programming for the Web with WebCL

Annotated slides of Siggraph 2012 course "Graphics Programming on the Web with WebCL"
The full course is available at http://www.khronos.org/webgl/wiki/Presentations#SIGGRAPH_2012_Course_.22Graphics_Programming_for_the_Web.22

Published in: Technology, Business

  • This demonstration does not run in a browser; it uses OpenCL to speed up the physics computations for the positions of all the planks. Our goal with WebCL is to be able, one day, to perform such computations in your web browser.
  • While CPUs tend to have 2 to 32 cores, GPUs have many more.
  • Historically, when GPUs became programmable, people tried to use vertex and fragment shader programs to perform more general computations than rendering vector graphics.
  • The scatter & gather operations are fundamental operations for GPGPU. Typically, scatter is difficult because in a graphics pipeline the fragment shader is called to write one output value. One can still perform scatter by using vertex shaders cleverly. Newer versions of graphics APIs & drivers provide specific methods for scatter. Gather is a no-brainer since it can be achieved by reading textures.
  • To understand work-groups and work-items, suppose you have a matrix or an image to process. The image can be decomposed into tiles, and each tile can be processed independently. A tile would be a work-group. Inside this work-group, each pixel would be processed by a work-item. Unlike typical CPU multithreading, all work-items (or threads) execute synchronously, thanks to the SIMD nature of GPUs. This has an important consequence: if each thread executes the same number of operations, then they all complete at the same time. But if one thread takes longer than the others, e.g. due to branch divergence (like an if…else clause), then the other threads wait until it finishes its operations, which reduces the effective computational throughput.
  • Developers must manage memory explicitly. For the best performance, move data closer to the cores. However, be aware that the closer you get to the cores, the smaller the memory available.
  • For 1-dimensional problems, in a sequential language like JavaScript, one would use a for loop to iterate across the 1D array. With CL, we tell the device to iterate over a 1D domain and only provide the body of the loop. When CL calls the kernel, it provides methods to query which index (i.e. thread) is executing the kernel. By extension, for 2D problems, in JavaScript, we would have 2 nested for loops. CL’s work-items iterate over the 2D domain, and the (x,y) index of the thread calling a kernel is provided by get_global_id(dimension), with dimension = 0 (1st dimension) or 1 (2nd dimension).
  • The WebCL object cannot be constructed with the new operator. It is like the Math object of JavaScript.
  • Meeting Notes (8/2/12 16:54): kernels can come from anywhere
  • While we explain how to set up a simple vector multiplication kernel, this would apply to matrices too. Matrix multiplication is probably what I would call the best “Hello World” example for compute languages.
  • To optimize computations, recall the work-group/work-item analogy we explained earlier with an image. We said that work-groups are tiles onto which work-items operate. Here we do exactly that with P work-items (or threads) per work-group. Use WebCLKernel.getWorkGroupInfo(WebCL.KERNEL_PREFERRED_WORKGROUP_SIZE) to find out what this number is for a device. It is typically a power of 2 like 16, 32, 64.
  • However, because that CPU is hyperthreaded, it is seen as 8 cores rather than 4. On this machine, the preferred workgroup size multiple is 1 for CPU, 64 for GPU. The maximum workgroup size is 128 for CPU, 256 for GPU. So we set the local workgroup size to 128x1 (=128) for CPU and 16x16 (=256 and divisible by 64) for GPU. As you can see, the performance of CL on CPU is pretty good and even better than GPU for small matrix sizes, less than 512x512. As the matrix size grows, the CPU performance remains constant but the GPU performance grows exponentially, as expected. Note: the OpenMP code uses the same tiling optimization as for GPU, with 8 threads. If you recall the video at the beginning of this course, the physics engine is essentially doing matrix/vector multiplication for 32k items. With these results, a tremendous speedup can be achieved compared to a CPU approach.
  • Meeting Notes (8/2/12 16:54): Vertex buffer objects
  • This example comes from Nvidia CUDA/OpenCL SDK. A sphere is rendered by GL but the vertices’ positions are modified by CL with some noise to create this cool effect.
  • This is the recommended way to synchronize the GL and CL queues. There is a more efficient way that uses GL and CL events rather than flushing the queues, but synchronization with events is an advanced subject we don’t have time to discuss in this course; you can find presentations on it online.
  • The Khronos web site has a wiki with links to all these WebCL implementation prototypes. On this web site, you will also find links to this presentation, course notes, and updates. All examples in this course were done with node-webcl from Motorola and rendered with node-webgl; both are freely available on github. While this is not an implementation within a web browser, it uses the same JavaScript engine as the Chrome/Chromium browsers, i.e. the Google v8 engine. We use nodejs for server-side processing, and the same code is being ported to the Chrome browser. Using nodejs, we can prototype new features quickly before adding them to browsers.
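The work-group/work-item notes above can be made concrete without any WebCL runtime. The following is only a sketch in plain JavaScript: `runNDRange` and the `ids` object are hypothetical helpers invented for this illustration (they are not part of WebCL or OpenCL), and the loops run sequentially, whereas a real device would execute the work-items of a group in parallel. It shows how a 1D domain is split into work-groups, how a global index comparable to get_global_id(0) could be derived, and why the kernel needs its bounds check when the global size is padded.

```javascript
// Plain-JavaScript emulation of a 1D NDRange launch (no WebCL required).
// runNDRange and the ids object are hypothetical helpers for illustration only.
function runNDRange(kernel, globalSize, localSize, args) {
  if (globalSize % localSize !== 0) {
    throw new Error("global size must be a multiple of local size");
  }
  var numGroups = globalSize / localSize;
  for (var group = 0; group < numGroups; ++group) {    // one iteration per work-group (tile)
    for (var local = 0; local < localSize; ++local) {  // one iteration per work-item
      var ids = {
        groupId: group,
        localId: local,
        globalId: group * localSize + local            // plays the role of get_global_id(0)
      };
      kernel.apply(null, [ids].concat(args));
    }
  }
  return numGroups;
}

// JavaScript counterpart of the multiply kernel: only the loop body remains.
function multiplyKernel(ids, a, b, c, n) {
  var tid = ids.globalId;
  if (tid >= n) return;        // padded work-items must not touch the buffers
  c[tid] = a[tid] * b[tid];
}

var n = 10;
var A = new Float32Array(n);
var B = new Float32Array(n);
var C = new Float32Array(n);
for (var i = 0; i < n; ++i) { A[i] = i; B[i] = 2; }

// Pad the global size up to the next multiple of the work-group size.
var localSize = 4;
var globalSize = Math.ceil(n / localSize) * localSize;  // 12 for n = 10
var groups = runNDRange(multiplyKernel, globalSize, localSize, [A, B, C, n]);
console.log("work-groups:", groups, "result:", Array.prototype.slice.call(C));
```

With n = 10 and a work-group size of 4, the domain is padded to 12 work-items in 3 groups, and the last two work-items return early through the bounds check.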
Transcript of "Graphics Programming for the Web with WebCL"

    2. 2. Graphics Programming on the Web with WebCL. Mikaël Bourges-Sévenier, Motorola Mobility. August 9, 2012
    3. 3. Over 32000 planks ;-) Blender/Bullet/SmallLuxGPU OpenCL. By Alain Ducharme “Phymec”. http://www.youtube.com/watch?v=143k1fqPukk
    4. 4. Motivation
        • For compute intensive web applications
            • Games: physics, special effects
            • Computational photography
            • Scientific simulations
            • Augmented reality
            • …
        • Use many devices for general computations
            • CPU, GPU, DSP, FPGA…
    5. 5. Motivation. GPUs provide exponential GFLOPS growth every year vs. CPUs. [Chart from the NVidia CUDA/OpenCL C programming guide, Chapter 1. Introduction]
    6. 6. Content
        • Motivation and Goals
        • General-Purpose computations on GPU (GPGPU)
            • From to
            • The need for more general data-parallel computations
        • WebCL overview
            • A JavaScript API over OpenCL
            • OpenCL concepts
            • WebCL API
        • WebCL programming
            • Pure computations
            • WebGL interoperability
    8. 8. WebGL pipeline
        • Programmable vertex & fragment shaders
        [Diagram: Application → vertices (3D) → Vertex processing (Vertex Shader) → vertices (screen) → Rasterizer → fragments → Fragment processing (Fragment Shader, with Textures and Samplers) → pixels → Frame Buffer; the processing stages run on the GPU]
    9. 9. General Purpose computations on GPU
        • With clever mapping of algorithms to GL pipeline
            • Textures as data buffers
            • Texture coordinates as computational domain
            • Vertex coordinates as computational range
        • Vertex shaders: scatter (write values)
            • to start computations
            • scatter operations
        • Fragment shaders: gather (read values)
            • for algorithms steps
            • gather operations
    10. 10. GPGPU with GL limitations
        • Hard to map algorithms to graphics pipeline
        • Hard to do scatter operations
        • Shader instances can NOT directly communicate with one another
        • GPGPU with GL is hack-ish …
        • CL is made for GPGPU, not graphics
    12. 12. WebCL overview
        • WebCL brings parallel computing to the Web through a secure JavaScript binding to OpenCL 1.1 (2011)
            • Open standard, royalty-free
            • Platform independent
            • Device independent
        • being standardized by Khronos
            • First public working draft April 2012
            • http://www.khronos.org/webcl/
    13. 13. OpenCL overview
        Features
        • C-based cross-platform API
        • Kernels use a subset of C99 and extensions
            • Vector extensions (<type>N)
            • No recursion, no function pointers
            • No dynamic memory (malloc, free…), no standard libc methods (memcpy…)
        • Well-defined numerical accuracy both for integers and floats
        • Rich set of built-in functions (e.g. as GLSL and more)
            • But no random method
        • Close to the hardware
            • Control over memory use
            • Control over thread scheduling
    14. 14. OpenCL Device Model
        • A host is connected to one or more Compute devices
        • Compute device
            • A collection of one or more compute units (~ cores)
            • A compute unit is composed of one or more processing elements (~ threads)
            • Processing elements execute code as SIMD or SPMD
        [Diagram: Host (PC) connected to Compute Devices (GPU, CPU, DSP, FPGA…); each device contains Compute Units (Cores), each made of Processing Elements (Threads)]
    15. 15. OpenCL Execution Model
        • Kernel
            • Basic unit of executable code (~ DLL entry point)
            • Data-parallel or task-parallel
        • Program
            • Collection of kernels and functions called by kernels
            • Analogous to a dynamic library (DLL)
        • Command Queue
            • Control operations on OpenCL objects (memory transfers, kernels execution, synchronization)
            • Commands queued in order
            • Execution in-order or out-of-order
            • Applications may use multiple command-queues per device
        • Work-item
            • An execution of a kernel by a processing element (~ thread)
        • Work-group
            • A collection of work-items that execute on a single compute unit (~ core)
        [Diagram: a Context with one Queue for the GPU and one Queue for the CPU]
    16. 16. OpenCL Work-group 2D analogy
        • # work-items = # pixels
        • # work-groups = # tiles
        • Work-group size = tileW * tileH
        • All threads in a workgroup run synchronously
        [Diagram: an image (Global) split into tiles (Local)]
    17. 17. OpenCL Memory Model
        • On Host
            • CPU RAM
        • On Compute Device
            • Global memory = GPU RAM
            • Constant memory = cached global memory
            • Texture memory = cached global memory optimized for streaming reads
            • Local memory = high-speed memory shared among work-items of a work-group (~ L1 cache)
            • Private memory = registers of a work-item, very fast memory
        • Memory management is explicit
            • App must move data host ➞ global ➞ local and back
        [Diagram: a Compute Device holds Workgroups 1 to N, each with its own Local Memory and Work-Items 1 to M with Private Memory; all share Global Memory and the Constant and Texture Caches; the Host and Host Memory connect to the device through command queues and API calls]
    18. 18. OpenCL Kernel
        • Defined on an N-dimensional computation domain
        • A kernel is executed at each point of the computation domain

        // In JavaScript
        function multiple(a, b, n) {
          var c = [];
          for (var i=0; i<n; ++i)
            c[i] = a[i] * b[i];
          return c;
        }

        // In OpenCL C99
        /**
         * @param a, b, c are buffers in global memory
         * @param n number of elements in a, b, and c
         */
        __kernel
        void multiply(__global const float *a,
                      __global const float *b,
                      __global float *c,
                      unsigned int n)
        {
          unsigned int tid = get_global_id(0); // thread number
          if (tid >= n) return; // make sure we don't pass buffer area
          c[tid] = a[tid] * b[tid];
        }
    19. 19. WebCL API
        • Same OO model as OpenCL with JS classes
        • WebCL is global object
        [Class diagram grouped into Platform layer, Compiler layer and Runtime layer: WebCL, WebCLPlatform, WebCLDevice, WebCLExtension, WebCLContext; many (*) WebCLProgram, WebCLMemoryObject {abstract}, CommandQueue, Event, Sampler; WebCLKernel; WebCLBuffer, WebCLImage]
    21. 21. WebCL sequence (host side)
        Overview: Create context → Compile kernels → Setup command-queues → Setup kernels arguments → Execute commands → Read results
        Detailed steps:
        • Platform layer
            • Select Platform
            • Select Device
            • Create Context
        • Compiler
            • Load and compile kernels on devices
        • Runtime layer
            • Create buffers to store data on devices
            • Create command queues for each device
            • Update kernels arguments
            • Send data to devices using their command queues
            • Send commands to devices using their command queues
            • Get data from devices using their command queues
            • Release resources
    22. 22. WebCL sequence (host side)

        try {
          // create the OpenCL context
          clContext = WebCL.createContext({
            deviceType: WebCL.DEVICE_TYPE_GPU
          });
        }
        catch(err) {
          throw "Error: Failed to create context! "+err;
        }
        var devices = clContext.getInfo(WebCL.CONTEXT_DEVICES);
        if (!devices) {
          throw "Error: Failed to retrieve compute devices for context!";
        }
    23. 23. WebCL sequence (host side)

        <script id="multiply_script" type="x-webcl">
        __kernel
        void multiply(__global const float *a,
                      __global const float *b,
                      __global float *c,
                      unsigned int n)
        {
          unsigned int tid = get_global_id(0); // thread number
          if (tid >= n) return; // make sure we don't pass buffer area
          c[tid] = a[tid] * b[tid];
        }
        </script>

        // Create the compute program from the source buffer (text)
        clProgram = clContext.createProgram(getSource("multiply_script"));
        // Build the program executable
        try {
          clProgram.build(clDevice, '-cl-fast-relaxed-math -DDEBUG=1');
        } catch (err) {
          throw "Error: Failed to build program executable!\n"
            + clProgram.getBuildInfo(clDevice, WebCL.PROGRAM_BUILD_LOG);
        }
        clKernel = clProgram.createKernel("multiply");
    24. 24. WebCL sequence (host side)

        BUFFER_SIZE=10;
        var A=new Uint32Array(BUFFER_SIZE);
        var B=new Uint32Array(BUFFER_SIZE);
        // store data in A and B
        …
        var size=BUFFER_SIZE*Uint32Array.BYTES_PER_ELEMENT; // size in bytes
        // Create buffer for A and B and copy host contents
        var aBuffer = clContext.createBuffer(WebCL.MEM_READ_ONLY, size);
        var bBuffer = clContext.createBuffer(WebCL.MEM_READ_ONLY, size);
        // Create buffer for C to read results
        var cBuffer = clContext.createBuffer(WebCL.MEM_WRITE_ONLY, size);
    25. 25. WebCL sequence (host side)

        // Create command queue
        clQueue=clContext.createCommandQueue(devices[0]);
        // enqueue buffers
        clQueue.enqueueWriteBuffer(aBuffer, false, 0, size, A);
        clQueue.enqueueWriteBuffer(bBuffer, false, 0, size, B);
        // Set kernel args
        clKernel.setArg(0, aBuffer);
        clKernel.setArg(1, bBuffer);
        clKernel.setArg(2, cBuffer);
        clKernel.setArg(3, BUFFER_SIZE, WebCL.type.UINT);

        Kernel signature, for reference:
        __kernel
        void multiply(__global const float *a,
                      __global const float *b,
                      __global float *c,
                      unsigned int n);
    26. 26. WebCL sequence (host side)

        // Execute (enqueue) kernel
        clQueue.enqueueNDRangeKernel(clKernel,
          null, // global work offset
          [BUFFER_SIZE], // global work size
          [2]); // local work size

        Note: Use local work size = [] or null (default) to let the driver choose the best values.
    27. 27. WebCL edit Master title styleClick to sequence (host side) This subtitle is 20 points Select Platform Create buffers to store data on devices/ / Bulletst are bluewhi l e get t i ng t hem get r esul s and bl ock Select Device Create command queues for each devicecl Queue. enqueueReadBuf f er ( lineerspacing, 2 points before & after They have 110% cBuf f , var C=new Ui nt 32Ar r ay( BUFFER_SI ZE) ; Create Context Longer bullets in 0,r ue, ze, / / bl of a paragraph are harder to read if t Update kernels the form ocki ng cal l si Load and compile arguments C) ; there is insufficient line spacing. This is the maximum kernels on devices Send data to devices using their command queues recommended number of lines per slide (seven). Send commands to  Sub bullets look like this devices using their command queues Get data from devices using their command queues Release resources 27
    28. Example: Matrix multiplication
        • “Hello World of CL”
        • C = A x B
        • N x N matrices
        [Figure: matrices A, B and C]
    29. Example: Matrix multiplication
        Optimization
        • N x N matrices
        • C divided into m x m tiles
        • With
            • m = N/P
            • P = # threads per workgroup (16)
        [Figure: matrices A, B and C]
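To make the tiling idea concrete, here is a plain JavaScript sketch (CPU only, not WebCL) that computes C = A x B one tile at a time and compares naturally against a naive version. It assumes the slide means that C is split into an m x m grid of P x P tiles with m = N/P, and that N is a multiple of P; the function names are made up for this illustration. In an OpenCL kernel each tile would typically map to one work-group, with the matching slices of A and B staged in local memory, which this sketch does not model.

```javascript
// Reference: naive N x N multiply on row-major Float32Arrays.
function multiplyNaive(A, B, N) {
  const C = new Float32Array(N * N);
  for (let i = 0; i < N; i++) {
    for (let j = 0; j < N; j++) {
      let sum = 0;
      for (let k = 0; k < N; k++) sum += A[i * N + k] * B[k * N + j];
      C[i * N + j] = sum;
    }
  }
  return C;
}

// Tiled version: C is processed as an m x m grid of P x P tiles (m = N / P).
// Each (ti, tj) tile accumulates partial products from m tile pairs of A and B.
function multiplyTiled(A, B, N, P) {
  if (N % P !== 0) throw new Error("N must be a multiple of P");
  const m = N / P;
  const C = new Float32Array(N * N);
  for (let ti = 0; ti < m; ti++) {
    for (let tj = 0; tj < m; tj++) {
      for (let tk = 0; tk < m; tk++) {
        for (let i = ti * P; i < (ti + 1) * P; i++) {
          for (let j = tj * P; j < (tj + 1) * P; j++) {
            let sum = C[i * N + j];
            for (let k = tk * P; k < (tk + 1) * P; k++) {
              sum += A[i * N + k] * B[k * N + j];
            }
            C[i * N + j] = sum;
          }
        }
      }
    }
  }
  return C;
}

// Usage: multiply a small matrix by the identity, so the result should equal A.
const N = 4, P = 2;
const A = Float32Array.from({ length: N * N }, (_, i) => i);
const B = Float32Array.from({ length: N * N }, (_, i) => (i % N === Math.floor(i / N) ? 1 : 0));
const C = multiplyTiled(A, B, N, P);
console.log(Array.from(C).join(","));
```

The tiled loop order touches only a P x P block of each matrix at a time, which is the property that lets a GPU work-group keep its working set in fast local memory.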
    30. Example: Comparison with sequential
        MacBook Pro (early 2011), OSX 10.8
        • CPU: Intel Core i7, 2.2GHz, 4 cores
        • GPU: AMD Radeon HD 6750M, 1 GB, 480 SPU, 600 MHz, 576 GFLOPS
        [Chart: speedup factor (axis 0 to 250) at sizes 128, 256, 512, 1024, 2048; series: OpenMP, CL (CPU), CL (GPU), CL (GPU opt)]
    31. WebCL / WebGL interop
        • WebCL context created from WebGL context
        • Configure shared CL objects from GL counterparts
        • Sync GL and CL
            • Flush GL, acquire GL object
            • Execute CL
            • Release CL object, flush CL
        • Vertex arrays, textures, render-buffers can be shared with CL
        Initialization: Initialize WebGL; Initialize WebCL; Configure shared CL-GL data
        Rendering loop (per frame): Set kernels args; Enqueue commands; Execute kernels; Update Scene; Render scene
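The per-frame ordering on this slide can be sketched as a single function. The sketch uses only calls that appear elsewhere in this deck (`gl.flush`, `enqueueAcquireGLObjects`, `enqueueNDRangeKernel`, `enqueueReleaseGLObjects`, `clQueue.flush`). The function name `runComputeStep`, its parameters, and the recording stubs in the usage example are assumptions made for illustration, since a real WebCL implementation is needed to execute anything on a device.

```javascript
// Hypothetical wrapper: runs one compute step on a GL object shared with CL.
function runComputeStep(gl, clQueue, clKernel, clSharedObject, globalSize) {
  gl.flush();                                      // submit pending GL commands
  clQueue.enqueueAcquireGLObjects(clSharedObject); // hand the shared object to CL
  clQueue.enqueueNDRangeKernel(clKernel, null, globalSize, null); // null local size: driver chooses
  clQueue.enqueueReleaseGLObjects(clSharedObject); // hand the shared object back to GL
  clQueue.flush();                                 // submit CL commands before rendering
}

// Usage with recording stubs (stand-ins, not real WebGL/WebCL objects).
const calls = [];
const record = (name) => () => calls.push(name);
const glStub = { flush: record("gl.flush") };
const queueStub = {
  enqueueAcquireGLObjects: record("acquire"),
  enqueueNDRangeKernel: record("ndrange"),
  enqueueReleaseGLObjects: record("release"),
  flush: record("cl.flush")
};
runComputeStep(glStub, queueStub, {}, {}, [256, 256]);
console.log(calls.join(" -> "));
```

Keeping the acquire and release calls paired inside one function makes it harder to render from a GL object that CL still owns.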
    32. WebCL / WebGL interop
            // Create WebGL context
            var gl = canvas.getContext("experimental-webgl");
            // Init GL
            …
            // Create the OpenCL context
            try {
              clContext = WebCL.createContext({
                deviceType: WebCL.DEVICE_TYPE_GPU,
                shareGroup: gl });
            }
            catch (err) {
              throw "Error: Failed to create context! " + err;
            }
    33. WebCL / WebGL interop (texture)
            // Create OpenGL texture object
            gl.activeTexture(gl.TEXTURE0);
            glTexture = gl.createTexture();
            gl.bindTexture(gl.TEXTURE_2D, glTexture);
            gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);
            gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
            gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, TextureWidth, TextureHeight, 0,
                          gl.RGBA, gl.UNSIGNED_BYTE, null);
            gl.bindTexture(gl.TEXTURE_2D, null);
            // Create the compute program from the source buffer (text)
            clProgram = clContext.createProgram(getSource("multiply_script"));
            // Build the program executable
            try {
              clProgram.build(clDevice, "-cl-fast-relaxed-math -DDEBUG=1");
            } catch (err) {
              throw "Error: Failed to build program executable!\n"
                    + clProgram.getBuildInfo(clDevice, WebCL.PROGRAM_BUILD_LOG);
            }
            clKernel = clProgram.createKernel("multiply");
    34. Demo: GL Texture update with CL
        • Based on Evgeny Demidov 2D ink droplet
        • WebGL ~26 fps
        • WebCL ~124 fps
    35. WebCL / WebGL interop (vbo)
            // Create buffer object
            glVBO = gl.createBuffer();
            gl.bindBuffer(gl.ARRAY_BUFFER, glVBO);
            // Initialize buffer object
            var sizeInBytes = mesh_width * mesh_height * 4 * Float32Array.BYTES_PER_ELEMENT;
            gl.bufferData(gl.ARRAY_BUFFER, sizeInBytes, gl.DYNAMIC_DRAW);
            // Create OpenCL buffer from GL VBO
            clVBO = clContext.createFromGLBuffer(WebCL.MEM_WRITE_ONLY, glVBO);
            // Set kernel args values
            clKernel.setArg(0, clVBO);
            clKernel.setArg(1, mesh_width, WebCL.type.UINT);
            clKernel.setArg(2, mesh_height, WebCL.type.UINT);
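The buffer-size arithmetic on this slide can be checked in isolation. The small sketch below assumes the factor of 4 stands for four floats per mesh vertex (for example x, y, z, w); the slide does not say so explicitly, and `vboSizeInBytes` is a hypothetical helper name.

```javascript
// Assumption: each mesh vertex is stored as 4 floats (for example x, y, z, w).
function vboSizeInBytes(meshWidth, meshHeight, floatsPerVertex = 4) {
  return meshWidth * meshHeight * floatsPerVertex * Float32Array.BYTES_PER_ELEMENT;
}

// A 256 x 256 mesh: 256 * 256 vertices * 4 floats * 4 bytes per float.
console.log(vboSizeInBytes(256, 256));
```

The same byte count has to be used for the GL buffer and for any CL-side indexing, since the CL buffer created from the VBO shares its storage.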
    37. WebCL / WebGL interop (host side)
            // Sync GL and acquire buffer from GL
            gl.flush();
            clQueue.enqueueAcquireGLObjects(clTexture);
            // Set global and local work sizes for kernel
            var local = null;
            var global = [TextureWidth, TextureHeight];
            try {
              clQueue.enqueueNDRangeKernel(clKernel, null, global, local);
            } catch (err) {
              throw "Failed to enqueue kernel! " + err;
            }
            // Release GL texture
            clQueue.enqueueReleaseGLObjects(clTexture);
            clQueue.flush();
    39. Perspectives
        • WebCL enables GPGPU applications in Web browsers
            • Careful usage of architecture can lead to impressive speedup
            • With WebGL interoperability, rich graphics Web applications are now possible
        • DRAFT WebCL specification
            • Quite stable JavaScript API
            • Focusing on more security and robustness
    40. WebCL Open process and Resources
        • Khronos open process to engage Web community
            • Public specification drafts, mailing lists, forums
            • http://www.khronos.org/webcl/
            • webcl_public@khronos.org
        • Nokia open source prototype for Firefox in May 2011 (LGPL)
            • http://webcl.nokiaresearch.com
        • Samsung open source prototype for WebKit in July 2011 (BSD)
            • http://code.google.com/p/webcl/
        • Motorola open source prototype for NodeJS in March 2012 (BSD)
            • https://github.com/Motorola-Mobility/node-webcl
    41. Thank you!
    43. Start learning Now!
        • OpenCL Programming Guide - The “Red Book” of OpenCL
            • http://www.amazon.com/OpenCL-Programming-Guide-Aaftab-Munshi/dp/0321749642
        • OpenCL in Action
            • http://www.amazon.com/OpenCL-Action-Accelerate-Graphics-Computations/dp/1617290173/
        • Heterogeneous Computing with OpenCL
            • http://www.amazon.com/Heterogeneous-Computing-with-OpenCL-ebook/dp/B005JRHYUS
        • The OpenCL Programming Book
            • http://www.fixstars.com/en/opencl/book/