SlideShare a Scribd company logo
1 of 25
Download to read offline
INTRODUCTION
Hammad Ghulam Mustafa
Hafiz Muhammad Noman Zahid
Muhammad Abdullah Ijaz
Malik Waqas Bashir Abid
Muhammad Umar Arshad
Umair Javaid
Suleman Khan
Ali Islal
• Introduction
• Programming Basics
• OpenCL Execution Model
• “Hello World”
• Conclusion
• Standard for the development of data parallel applications
• Most used for the development of GPGPU applications
• General Purpose computing on Graphics Processing Units
• A GPU is comprised of hundreds of compute cores
• Specialized for massively data parallel computation
• GPGPU: Take advantage of GPU’s computing power to make massively parallel
applications
• Parallel applications with huge acceleration in Molecular Dynamics, Image
Processing, Evolutionary Computation,…
• All cases based on data parallelism:
each thread processes a subset of the data
For example, a vector addition:
• Furthermore, OpenCL provides portability:
same code can run on different architectures
• For Example:
• Provides the following abstraction: A compute device is composed by
compute units
• OpenCL platform: Host + Compute Devices
Each manufacturer provides an SDK:
• NVIDIA SDK for GPUs
• AMD APP for CPUs/GPU
• Intel for CPUs
• IBM for PowerPC and Cell B/E
• Kernel: function that defines the behavior of each thread
• For example, kernel for vector addition:
__kernel void sumKernel (
__global int* a, __global int* b, __global int* c)
{
int i = get_global_id(0);
c[i] = a[i] + b[i];
}
Written in OpenCL-C: ANSI-C + Set of kernel functions, e.g.:
• get_global_id: obtains thread index
• barrier: synchronizes threads
• An OpenCL applications consists of:
• Basic host application flow:
a. Load and Compilation of kernel
b. Data copy from host to device (e.g. from CPU to GPU)
c. Execution of kernel
d. Data copy from device to host
e. Release kernels and data from device memory
• Execution using command queue in each device
• Host code: programmed using OpenCL API
• API Calls, such as:
• clCreateProgramWithSource: Load kernel from char*
• clBuildProgram: Compile kernel
• clSetKernelArgs: Set kernel arguments for the device
• clEnqueueWriteBuffer/clEnqueueRead: Copy data vector to device
• clEnqueueNDRangerKernel: Launch kernel in device
• API Types, such as:
• cl_mem: Pointer to device memory objects
• cl_program: Kernel object
• cl_float / cl_int / cl_uint: Redefinition of C types
• Kernel
• Basic unit of executable code -similar to a C function
• Data-parallel or task-parallel
• H.264Encode is not a kernel
• Kernel should be a small separate function (SAD)
• Program
• Collection of kernels and other functions
• Analogous to a dynamic library
• Applications queue kernel execution instances
• Queued in-order
• Executed in-order or out-of-order
• Define N-dimensional computation domain (N = 1, 2 or 3)
• Each independent element of execution in N-D domain is called a work-item
• The N-D domain defines the total number of work-items that execute in parallel
• Create a program
• Input: String (source code) or precompiled binary
• Analogous to a dynamic library: A collection of kernels
• Compile the program
• Specify the devices for which kernels should be compiled
• Pass in compiler flags
• Check for compilation/build errors
• Create the kernels
• Returns a kernel object used to hold arguments for a given execution
• OpenCL does not provide performance portability
• Alternative to NVIDIA CUDA:
Programming paradigm for NVIDIA GPU cards
• Combinable with other parallel programming models:
OpenMP for SMPs / MPI for MPPs
• Huge ecosystems for OpenCL, e.g. OpenACC:
Develop GPGPU applications using directives
#pragma acc kernels
for(i = 0; i< N; i++)
c[i] = b[i] + a[i];
Introduction to OpenCL By Hammad Ghulam Mustafa

More Related Content

What's hot

OpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGO
OpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGOOpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGO
OpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGOOpenNebula Project
 
Hyperloglog Lightning Talk
Hyperloglog Lightning TalkHyperloglog Lightning Talk
Hyperloglog Lightning TalkSimon Prickett
 
OpenNebula and StorPool: Building Powerful Clouds
OpenNebula and StorPool: Building Powerful CloudsOpenNebula and StorPool: Building Powerful Clouds
OpenNebula and StorPool: Building Powerful CloudsOpenNebula Project
 
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...OpenNebula Project
 
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...OpenNebula Project
 
WSO2Con USA 2015: Deployment Patterns and Capacity Planning
WSO2Con USA 2015: Deployment Patterns and Capacity PlanningWSO2Con USA 2015: Deployment Patterns and Capacity Planning
WSO2Con USA 2015: Deployment Patterns and Capacity PlanningWSO2
 
Kubernetes Webinar - Using ConfigMaps & Secrets
Kubernetes Webinar - Using ConfigMaps & Secrets Kubernetes Webinar - Using ConfigMaps & Secrets
Kubernetes Webinar - Using ConfigMaps & Secrets Janakiram MSV
 
OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...
OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...
OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...NETWAYS
 
Concourse ci container based ci for the cloud
Concourse ci   container based ci for the cloudConcourse ci   container based ci for the cloud
Concourse ci container based ci for the cloudJohannes Rudolph
 
AWS guerrilla orchestration
AWS guerrilla orchestrationAWS guerrilla orchestration
AWS guerrilla orchestrationSlobodan Utvić
 
Whats new in Havana--Swift
Whats new in Havana--SwiftWhats new in Havana--Swift
Whats new in Havana--SwiftMirantis
 
OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...
OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...
OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...NETWAYS
 
Micheal Pershyn "Coljure 4 Big Data"
Micheal Pershyn "Coljure 4 Big Data"Micheal Pershyn "Coljure 4 Big Data"
Micheal Pershyn "Coljure 4 Big Data"Lviv Startup Club
 
Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!Krishna-Kumar
 
OSOM - Operations in the Cloud
OSOM - Operations in the CloudOSOM - Operations in the Cloud
OSOM - Operations in the CloudMarcela Oniga
 

What's hot (20)

OpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGO
OpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGOOpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGO
OpenNebulaConf2017EU: FairShare Scheduling by Valentina Zaccolo, INDIGO
 
Hyperloglog Lightning Talk
Hyperloglog Lightning TalkHyperloglog Lightning Talk
Hyperloglog Lightning Talk
 
OpenNebula and StorPool: Building Powerful Clouds
OpenNebula and StorPool: Building Powerful CloudsOpenNebula and StorPool: Building Powerful Clouds
OpenNebula and StorPool: Building Powerful Clouds
 
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
OpenNebulaConf2017EU: Transforming an Old Supercomputer into a Cloud Platform...
 
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
OpenNebulaconf2017EU: OpenNebula 5.4 and Beyond by Tino Vázquez and Ruben S. ...
 
WSO2Con USA 2015: Deployment Patterns and Capacity Planning
WSO2Con USA 2015: Deployment Patterns and Capacity PlanningWSO2Con USA 2015: Deployment Patterns and Capacity Planning
WSO2Con USA 2015: Deployment Patterns and Capacity Planning
 
Apache Gobblin
Apache GobblinApache Gobblin
Apache Gobblin
 
Kubernetes Webinar - Using ConfigMaps & Secrets
Kubernetes Webinar - Using ConfigMaps & Secrets Kubernetes Webinar - Using ConfigMaps & Secrets
Kubernetes Webinar - Using ConfigMaps & Secrets
 
OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...
OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...
OpenNebula Conf 2014 | Building Hybrid Cloud Federated Environments with Open...
 
Concourse ci container based ci for the cloud
Concourse ci   container based ci for the cloudConcourse ci   container based ci for the cloud
Concourse ci container based ci for the cloud
 
Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012 Workshop actualización SVG CESGA 2012
Workshop actualización SVG CESGA 2012
 
Airflow introduction
Airflow introductionAirflow introduction
Airflow introduction
 
Hadoop analytics provisioning based on a virtual infrastructure
Hadoop analytics provisioning based on a virtual infrastructureHadoop analytics provisioning based on a virtual infrastructure
Hadoop analytics provisioning based on a virtual infrastructure
 
AWS guerrilla orchestration
AWS guerrilla orchestrationAWS guerrilla orchestration
AWS guerrilla orchestration
 
Whats new in Havana--Swift
Whats new in Havana--SwiftWhats new in Havana--Swift
Whats new in Havana--Swift
 
OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...
OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...
OpenNebula Conf 2014 | Bootstrapping a virtual infrastructure using OpenNebul...
 
Intro to Kubernetes
Intro to KubernetesIntro to Kubernetes
Intro to Kubernetes
 
Micheal Pershyn "Coljure 4 Big Data"
Micheal Pershyn "Coljure 4 Big Data"Micheal Pershyn "Coljure 4 Big Data"
Micheal Pershyn "Coljure 4 Big Data"
 
Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!Kubernetes Application Deployment with Helm - A beginner Guide!
Kubernetes Application Deployment with Helm - A beginner Guide!
 
OSOM - Operations in the Cloud
OSOM - Operations in the CloudOSOM - Operations in the Cloud
OSOM - Operations in the Cloud
 

Similar to Introduction to OpenCL By Hammad Ghulam Mustafa

MattsonTutorialSC14.pptx
MattsonTutorialSC14.pptxMattsonTutorialSC14.pptx
MattsonTutorialSC14.pptxgopikahari7
 
OpenCL Programming 101
OpenCL Programming 101OpenCL Programming 101
OpenCL Programming 101Yoss Cohen
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...AMD Developer Central
 
Evaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerEvaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerGeorge Markomanolis
 
clWrap: Nonsense free control of your GPU
clWrap: Nonsense free control of your GPUclWrap: Nonsense free control of your GPU
clWrap: Nonsense free control of your GPUJohn Colvin
 
Server-side JS with NodeJS
Server-side JS with NodeJSServer-side JS with NodeJS
Server-side JS with NodeJSLilia Sfaxi
 
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob KaralusDistributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob KaralusJakob Karalus
 
Freedreno on Android – XDC 2023
Freedreno on Android          – XDC 2023Freedreno on Android          – XDC 2023
Freedreno on Android – XDC 2023Igalia
 
Utilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmapUtilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmapGeorge Markomanolis
 
Kubernetes: The Next Research Platform
Kubernetes: The Next Research PlatformKubernetes: The Next Research Platform
Kubernetes: The Next Research PlatformBob Killen
 
Energy profiler for android emulator
Energy profiler for android emulatorEnergy profiler for android emulator
Energy profiler for android emulatorDiego Ruggeri
 
Lec 10-linux-review
Lec 10-linux-reviewLec 10-linux-review
Lec 10-linux-reviewabinaya m
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8AbdullahMunir32
 

Similar to Introduction to OpenCL By Hammad Ghulam Mustafa (20)

Introduction to OpenCL
Introduction to OpenCLIntroduction to OpenCL
Introduction to OpenCL
 
MattsonTutorialSC14.pptx
MattsonTutorialSC14.pptxMattsonTutorialSC14.pptx
MattsonTutorialSC14.pptx
 
Hands on OpenCL
Hands on OpenCLHands on OpenCL
Hands on OpenCL
 
OpenCL Programming 101
OpenCL Programming 101OpenCL Programming 101
OpenCL Programming 101
 
MattsonTutorialSC14.pdf
MattsonTutorialSC14.pdfMattsonTutorialSC14.pdf
MattsonTutorialSC14.pdf
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
OpenCL Heterogeneous Parallel Computing
OpenCL Heterogeneous Parallel ComputingOpenCL Heterogeneous Parallel Computing
OpenCL Heterogeneous Parallel Computing
 
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by  Mikael ...
WT-4069, WebCL: Enabling OpenCL Acceleration of Web Applications, by Mikael ...
 
Evaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI SupercomputerEvaluating GPU programming Models for the LUMI Supercomputer
Evaluating GPU programming Models for the LUMI Supercomputer
 
Neptune @ SoCal
Neptune @ SoCalNeptune @ SoCal
Neptune @ SoCal
 
clWrap: Nonsense free control of your GPU
clWrap: Nonsense free control of your GPUclWrap: Nonsense free control of your GPU
clWrap: Nonsense free control of your GPU
 
Server-side JS with NodeJS
Server-side JS with NodeJSServer-side JS with NodeJS
Server-side JS with NodeJS
 
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob KaralusDistributed Tensorflow with Kubernetes - data2day - Jakob Karalus
Distributed Tensorflow with Kubernetes - data2day - Jakob Karalus
 
Freedreno on Android – XDC 2023
Freedreno on Android          – XDC 2023Freedreno on Android          – XDC 2023
Freedreno on Android – XDC 2023
 
Utilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmapUtilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmap
 
Kubernetes: The Next Research Platform
Kubernetes: The Next Research PlatformKubernetes: The Next Research Platform
Kubernetes: The Next Research Platform
 
Energy profiler for android emulator
Energy profiler for android emulatorEnergy profiler for android emulator
Energy profiler for android emulator
 
TechBeats #2
TechBeats #2TechBeats #2
TechBeats #2
 
Lec 10-linux-review
Lec 10-linux-reviewLec 10-linux-review
Lec 10-linux-review
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8
 

Recently uploaded

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024The Digital Insurer
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Recently uploaded (20)

Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Introduction to OpenCL By Hammad Ghulam Mustafa

  • 1. INTRODUCTION Hammad Ghulam Mustafa Hafiz Muhammad Noman Zahid Muhammad Abdullah Ijaz Malik Waqas Bashir Abid Muhammad Umar Arshad Umair Javaid Suleman Khan Ali Islal
  • 2. • Introduction • Programming Basics • OpenCL Execution Model • “Hello World” • Conclusion
  • 3. • Standard for the development of data parallel applications • Most used for the development of GPGPU applications • General Purpose computing on Graphics Processing Units • A GPU is comprised of hundreds of compute cores • Specialized for massively data parallel computation
  • 4. • GPGPU: Take advantage of GPU’s computing power to make massively parallel applications • Parallel applications with huge acceleration in Molecular Dynamics, Image Processing, Evolutionary Computation,… • All cases based on data parallelism: each thread processes a subset of the data For example, a vector addition:
  • 5. • Furthermore, OpenCL provides portability: same code can run on different architectures • For Example:
  • 6. • Provides the following abstraction: A compute device is composed by compute units • OpenCL platform: Host + Compute Devices Each manufacturer provides an SDK: • NVIDIA SDK for GPUs • AMD APP for CPUs/GPU • Intel for CPUs • IBM for PowerPC and Cell B/E
  • 7. • Kernel: function that defines the behavior of each thread • For example, kernel for vector addition: __kernel void sumKernel ( __global int* a, __global int* b, __global int* c) { int i = get_global_id(0); c[i] = a[i] + b[i]; } Written in OpenCL-C: ANSI-C + Set of kernel functions, e.g.: • get_global_id: obtains thread index • barrier: synchronizes threads
  • 8. • An OpenCL applications consists of: • Basic host application flow: a. Load and Compilation of kernel b. Data copy from host to device (e.g. from CPU to GPU) c. Execution of kernel d. Data copy from device to host e. Release kernels and data from device memory • Execution using command queue in each device
  • 9. • Host code: programmed using OpenCL API • API Calls, such as: • clCreateProgramWithSource: Load kernel from char* • clBuildProgram: Compile kernel • clSetKernelArgs: Set kernel arguments for the device • clEnqueueWriteBuffer/clEnqueueRead: Copy data vector to device • clEnqueueNDRangerKernel: Launch kernel in device • API Types, such as: • cl_mem: Pointer to device memory objects • cl_program: Kernel object • cl_float / cl_int / cl_uint: Redefinition of C types
  • 10.
  • 11. • Kernel • Basic unit of executable code -similar to a C function • Data-parallel or task-parallel • H.264Encode is not a kernel • Kernel should be a small separate function (SAD) • Program • Collection of kernels and other functions • Analogous to a dynamic library • Applications queue kernel execution instances • Queued in-order • Executed in-order or out-of-order
  • 12. • Define N-dimensional computation domain (N = 1, 2 or 3) • Each independent element of execution in N-D domain is called a work-item • The N-D domain defines the total number of work-items that execute in parallel
  • 13. • Create a program • Input: String (source code) or precompiled binary • Analogous to a dynamic library: A collection of kernels • Compile the program • Specify the devices for which kernels should be compiled • Pass in compiler flags • Check for compilation/build errors • Create the kernels • Returns a kernel object used to hold arguments for a given execution
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21.
  • 22.
  • 23.
  • 24. • OpenCL does not provide performance portability • Alternative to NVIDIA CUDA: Programming paradigm for NVIDIA GPU cards • Combinable with other parallel programming models: OpenMP for SMPs / MPI for MPPs • Huge ecosystems for OpenCL, e.g. OpenACC: Develop GPGPU applications using directives #pragma acc kernels for(i = 0; i< N; i++) c[i] = b[i] + a[i];