OpenCL Programming 101

OpenCL Programming Part 1
This lecture reviews host programming in OpenCL, a Khronos standard for parallel programming.

Published in: Technology


  1. 1. OpenCL Host Programming Fast Forward Your Development www.dsp-ip.com
  2. 2. OPENCL™ EXECUTION MODEL
  3. 3. OpenCL™ Execution Model
     • Kernel
       ▫ Basic unit of executable code, similar to a C function
       ▫ Data-parallel or task-parallel
       ▫ A whole application such as an H.264 encoder is not a kernel
       ▫ A kernel should be a small, separate function, e.g. SAD (sum of absolute differences)
     • Program
       ▫ Collection of kernels and other functions
       ▫ Analogous to a dynamic library
     • Applications queue kernel execution instances
       ▫ Queued in-order
       ▫ Executed in-order or out-of-order
  4. 4. Data-Parallelism in OpenCL™
     • Define an N-dimensional computation domain (N = 1, 2 or 3)
       ▫ Each independent element of execution in the N-D domain is called a work-item
       ▫ The N-D domain defines the total number of work-items that execute in parallel
     • Example: for a 1024 x 1024 image the problem dimensions are 1024 x 1024, with one kernel execution per pixel: 1,048,576 total executions
     • Scalar:
         void scalar_mul(int n,
                         const float *a,
                         const float *b,
                         float *result)
         {
             int i;
             for (i = 0; i < n; i++)
                 result[i] = a[i] * b[i];
         }
     • Data-parallel:
         kernel void dp_mul(global const float *a,
                            global const float *b,
                            global float *result)
         {
             int id = get_global_id(0);
             result[id] = a[id] * b[id];
         }
         // execute dp_mul over "n" work-items
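The relationship between the scalar loop and the per-work-item kernel on this slide can be sketched in plain C++. This is only an emulation, not OpenCL: `run_1d_range`, `dp_mul_work_item` and `flat_index` are hypothetical names introduced for illustration, and the index that a real kernel would obtain from `get_global_id(0)` is passed in as a parameter. A real device is free to run the work-items in parallel and in any order; the sequential loop here is just the simplest stand-in.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar version from the slide: one call loops over all n elements.
void scalar_mul(int n, const float* a, const float* b, float* result) {
    for (int i = 0; i < n; ++i) {
        result[i] = a[i] * b[i];
    }
}

// Body of the data-parallel kernel: handles exactly one element.
// In OpenCL C the index would come from get_global_id(0); here it is
// an explicit parameter because this is plain C++, not a real kernel.
void dp_mul_work_item(std::size_t id, const float* a, const float* b,
                      float* result) {
    result[id] = a[id] * b[id];
}

// Hypothetical stand-in for the runtime: invokes the kernel body once
// per work-item of a 1-D domain of size global_size.
void run_1d_range(std::size_t global_size, const float* a, const float* b,
                  float* result) {
    for (std::size_t id = 0; id < global_size; ++id) {
        dp_mul_work_item(id, a, b, result);
    }
}

// Maps the 2-D work-item (x, y) of a width-wide domain to a flat index,
// a common way for a per-pixel kernel to address a row-major image buffer.
std::size_t flat_index(std::size_t x, std::size_t y, std::size_t width) {
    return y * width + x;
}
```

For the 1024 x 1024 image, the last work-item (1023, 1023) maps to flat index 1,048,575, so the indices 0 through 1,048,575 account for the 1,048,576 executions quoted on the slide.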
  5. 5. Compiling Kernels
     • Create a program
       ▫ Input: string (source code) or precompiled binary
       ▫ Analogous to a dynamic library: a collection of kernels
     • Compile the program
       ▫ Specify the devices for which kernels should be compiled
       ▫ Pass in compiler flags
       ▫ Check for compilation/build errors
     • Create the kernels
       ▫ Returns a kernel object used to hold arguments for a given execution
  6. 6. EX-1: OPENCL "HELLO WORLD"
  8. 8. Basic program structure
     • Include
     • Get Platform Info
     • Create Context
     • Load & compile program
     • Create Queue
     • Load and Run Kernel
  9. 9. Includes
     • Make sure to include ALL the required OpenCL include files
         #include <cstdio>
         #include <cstdlib>
         #include <iostream>
         #include <SDKFile.hpp>
         #include <SDKCommon.hpp>
         #include <SDKApplication.hpp>
         #include <CL/cl.hpp>
  10. 10. GetPlatformInfo
      • Detects the OpenCL platforms installed in the system; each platform exposes its "devices": CPUs, GPUs & DSPs
          cl_int err;
          std::vector<cl::Platform> platforms;
          err = cl::Platform::get(&platforms);
          if (err != CL_SUCCESS) {
              std::cerr << "Platform::get() failed (" << err << ")" << std::endl;
              return SDK_FAILURE;
          }
          // Look for the AMD platform; i == platforms.end() if it is not found
          std::vector<cl::Platform>::iterator i;
          if (platforms.size() > 0) {
              for (i = platforms.begin(); i != platforms.end(); ++i) {
                  if (!strcmp((*i).getInfo<CL_PLATFORM_VENDOR>(&err).c_str(),
                              "Advanced Micro Devices, Inc.")) {
                      break;
                  }
              }
          }
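One thing the search loop on this slide does not handle is the case where no platform matches the vendor string: the iterator then ends at `platforms.end()`, and the next slide dereferences it. The selection logic can be sketched on plain vendor strings, without any OpenCL types. The function names and the "fall back to the first platform" policy are assumptions made for illustration; the original exercise may handle this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the index of the first entry equal to `wanted`,
// or vendors.size() when there is no match.
std::size_t find_vendor(const std::vector<std::string>& vendors,
                        const std::string& wanted) {
    for (std::size_t i = 0; i < vendors.size(); ++i) {
        if (vendors[i] == wanted) {
            return i;
        }
    }
    return vendors.size();
}

// Chooses the wanted vendor when present, otherwise the first platform.
// Returns false (and leaves `chosen` untouched) when the list is empty.
// The fallback policy is an assumption, not something the slide specifies.
bool choose_platform(const std::vector<std::string>& vendors,
                     const std::string& wanted, std::size_t& chosen) {
    if (vendors.empty()) {
        return false;
    }
    const std::size_t i = find_vendor(vendors, wanted);
    chosen = (i < vendors.size()) ? i : 0;
    return true;
}
```

In the host program, the vendor strings would come from `getInfo<CL_PLATFORM_VENDOR>()` on each platform, and a `false` result would be a reason to return an error before creating the context.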
  11. 11. Create Context
      • A context enables operation (queue) and memory sharing between devices
          cl_context_properties cps[3] = {
              CL_CONTEXT_PLATFORM, (cl_context_properties)(*i)(), 0
          };
          std::cout << "Creating a context AMD platform\n";
          cl::Context context(CL_DEVICE_TYPE_CPU, cps, NULL, NULL, &err);
          if (err != CL_SUCCESS) {
              std::cerr << "Context::Context() failed (" << err << ")\n";
              return SDK_FAILURE;
          }
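The `cps` array on this slide is a zero-terminated list of key/value pairs, which is why it has three entries for a single property. The layout can be sketched in plain C++ as below. `prop_t`, `kPlatformKey` and `find_property` are hypothetical names for illustration only; `prop_t` is assumed to be a pointer-sized integer like `cl_context_properties`, and real code should use `CL_CONTEXT_PLATFORM` from the OpenCL headers instead of the stand-in constant.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for cl_context_properties (assumed: a pointer-sized integer).
typedef std::intptr_t prop_t;

// Illustrative key only; real code takes CL_CONTEXT_PLATFORM from the headers.
const prop_t kPlatformKey = 0x1084;

// Walks a zero-terminated {key, value, key, value, ..., 0} list and
// stores the value paired with `key` in `value` when the key is present.
bool find_property(const prop_t* list, prop_t key, prop_t& value) {
    if (list == nullptr) {
        return false;
    }
    for (const prop_t* p = list; *p != 0; p += 2) {
        if (*p == key) {
            value = p[1];
            return true;
        }
    }
    return false;
}
```

The terminating 0 is what tells the consumer of the list where it ends, so adding a second property means growing the array to five entries rather than four.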
  12. 12. Load Program
      • Loads the kernel program (*.cl)
          std::cout << "Loading and compiling CL source\n";
          streamsdk::SDKFile file;
          if (!file.open("HelloCL_Kernels.cl")) {
              std::cerr << "We couldn't load CL source code\n";
              return SDK_FAILURE;
          }
          cl::Program::Sources sources(1,
              std::make_pair(file.source().data(), file.source().size()));
          cl::Program program = cl::Program(context, sources, &err);
          if (err != CL_SUCCESS) {
              std::cerr << "Program::Program() failed (" << err << ")\n";
              return SDK_FAILURE;
          }
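The slide relies on the SDK helper `streamsdk::SDKFile` to read the `.cl` file. Where that helper is not available, the same step can be done with the standard library alone. The sketch below is one possible replacement, not the SDK's implementation; `read_text_file` is a hypothetical name.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Reads the whole file at `path` into `out`.
// Returns false when the file cannot be opened; `out` is then unchanged.
bool read_text_file(const std::string& path, std::string& out) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        return false;
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copy the entire stream contents
    out = buffer.str();
    return true;
}
```

The resulting string's `data()` and `size()` can then be passed to `cl::Program::Sources` in the same way the slide passes `file.source().data()` and `file.source().size()`. The string has to stay alive until the program object has been created.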
  13. 13. Compile program
      • The host program compiles the kernel program per device.
      • Why compile at run time? As with Java, we don't know the target device until we run. We can also decide at run time, based on load balancing, which device to run on.
          err = program.build(devices);
          if (err != CL_SUCCESS) {
              if (err == CL_BUILD_PROGRAM_FAILURE) {
                  // Handle the build error
              }
              std::cerr << "Program::build() failed (" << err << ")\n";
              return SDK_FAILURE;
          }
  14. 14. Create Kernel with program
      • Associate a kernel object with our loaded and compiled program
          cl::Kernel kernel(program, "hello", &err);
          if (err != CL_SUCCESS) {
              std::cerr << "Kernel::Kernel() failed (" << err << ")\n";
              return SDK_FAILURE;
          }
          // This check is only meaningful after a kernel.setArg() call that updates err
          if (err != CL_SUCCESS) {
              std::cerr << "Kernel::setArg() failed (" << err << ")\n";
              return SDK_FAILURE;
          }
  15. 15. Create Queue per device & Run it
      • Queues the kernel for execution. Execution does not have to happen immediately
      • Attention: enqueue() is an asynchronous call, meaning that its return does not imply the kernel has executed or even started executing
          cl::CommandQueue queue(context, devices[0], 0, &err);
          std::cout << "Running CL program\n";
          err = queue.enqueueNDRangeKernel(…..);
          // finish() blocks until all queued commands have completed
          err = queue.finish();
          if (err != CL_SUCCESS) {
              std::cerr << "CommandQueue::finish() failed (" << err << ")\n";
          }
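The point about asynchronous enqueueing can be illustrated with a toy queue in plain C++. This is a deliberately simplified, single-threaded model and not the OpenCL API: `ToyQueue` is a hypothetical class in which `enqueue()` only records a command and `finish()` runs everything in submission order. A real runtime may start work any time after the enqueue call; the only guarantee the host can rely on is that the work is done once `finish()` returns.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>

// Hypothetical toy queue, NOT the OpenCL API. It mimics an in-order
// command queue whose enqueue call returns before the command has run.
class ToyQueue {
public:
    // Records the command and returns immediately; nothing runs here.
    void enqueue(std::function<void()> command) {
        pending_.push_back(command);
    }

    // Runs every pending command in submission order, then returns.
    void finish() {
        while (!pending_.empty()) {
            std::function<void()> command = pending_.front();
            pending_.pop_front();
            command();
        }
    }

    // Number of commands that were enqueued but have not run yet.
    std::size_t pending() const { return pending_.size(); }

private:
    std::deque<std::function<void()> > pending_;
};
```

Reading a result right after `enqueue()` in this model sees the old value, which is the same mistake as reading an output buffer on the host before `queue.finish()` (or an equivalent wait) has returned.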
  16. 16. And that's All Folks?
      • Naaaa… We still need to learn:
      • Writing kernel functions
      • Synchronizing kernel functions
      • Setting arguments to kernel functions
      • Passing data from/to the host
  17. 17. References
      • "OpenCL Hello World" is an ATI OpenCL SDK programming exercise
      • ATI OpenCL slides
  18. 18. DSP-IP Contact information
      • Download slides at: www.dsp-ip.com
      • Course materials & lecture requests: Yossi Cohen, info@dsp-ip.com, +972-9-8850956
      • Fax: +972-50-8962910