CUDA: A Basic Introduction
GPU overview

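GPU stands for Graphics Processing Unit (图形处理器 in Chinese). The GPU is defined by analogy with the CPU: it is the "heart" of the graphics card, playing the role that the CPU plays in the computer, and it determines the card's tier and most of its performance.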
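Advantages of the GPU
   Strong processing power: a single GPU approaches 1 Tflops
   High bandwidth: 140 GB/s
   Low cost: higher Gflops/$ and Gflops/W than a CPU

   The current entry threshold for the TOP500 supercomputer list is 12 Tflops
   A three-node cluster with 4 GPUs per node exceeds 12 Tflops in total processing power; built with GTX 280 cards it costs only about 100,000 yuan, and even with dedicated Tesla cards only about 200,000 yuan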
GPU vs. CPU: compute performance
GPU vs. CPU: memory bandwidth
GPU vs. CPU: transistor usage
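GPU history

   1993: Jen-Hsun Huang founds NVIDIA together with Curtis Priem (C.P) and Chris Malachowsky (C.M).
   2007: CUDA is officially released, sparking a revolution in general-purpose GPU computing.
   2009: The GeForce GTS 250 is released.
   March 24, 2011: The GeForce GTX 590 is released; at the time of this deck it is NVIDIA's newest graphics card.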
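CUDA
   CUDA (Compute Unified Device Architecture) is a computing platform introduced by the graphics card vendor NVIDIA.
   CUDA is a general-purpose parallel computing architecture that lets the GPU solve complex computational problems. Developers can write programs for the CUDA architecture in C.
   CUDA hardware architecture
   CUDA software architecture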
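Architecture

   The architecture (GT200) consists of two parts: the streaming processor array (SPA) and the memory system.
   The GPU's enormous computing power comes from the large number of compute units in the SPA. The SPA is organized in two levels: TPCs (thread processing clusters) and SMs (streaming multiprocessors).
   The memory system consists of several parts: the memory controllers (MMC), the fixed-function raster operation units (ROP), and the L2 texture cache.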
GT200 architecture
TPC

•3 SM
•Instruction and constant cache
•Texture
•Load/store
SM
ROP

•Accesses DRAM
•TEXTURE mechanism
•Atomic operations on global memory
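CUDA execution model
   The CPU acts as the host and the GPU as a coprocessor, or device, so that the GPU runs the parts of a program that can be highly threaded.
   In this model the CPU and GPU work together: the CPU handles logic-heavy transaction processing and serial computation, while the GPU concentrates on highly threaded parallel work.
   A complete CUDA program consists of a series of parallel steps, run as kernel functions on the device, together with serial steps run on the host.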
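
The alternation of serial host steps and parallel device steps can be sketched as follows. This is a minimal illustration, not code from the deck: the kernel names `squareAll` and `subtractAll` and the helper `runAlternatingSteps` are hypothetical, and the sketch assumes a CUDA 6 or later toolkit so that `cudaMallocManaged` can stand in for explicit host/device copies.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Parallel step 1 (device): square every element.
__global__ void squareAll(float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * x[i];
}

// Parallel step 2 (device): subtract a value that the host computed serially.
__global__ void subtractAll(float *x, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] -= value;
}

// Returns 0 on success, 1 on failure.
int runAlternatingSteps()
{
    const int n = 1024;
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    float *x = NULL;
    // Assumption: managed memory keeps the sketch short; it is visible to host and device.
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess) return 1;

    for (int i = 0; i < n; ++i) x[i] = 2.0f;          // serial step on the host

    squareAll<<<blocks, threads>>>(x, n);             // parallel step on the device
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(x); return 1; }

    float mean = 0.0f;                                // serial step on the host
    for (int i = 0; i < n; ++i) mean += x[i];
    mean /= n;

    subtractAll<<<blocks, threads>>>(x, n, mean);     // parallel step on the device
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(x); return 1; }

    int ok = 1;                                       // every element started equal, so each
    for (int i = 0; i < n; ++i)                       // one minus the mean should now be zero
        if (x[i] != 0.0f) ok = 0;

    cudaFree(x);
    return ok ? 0 : 1;
}

int main()
{
    int status = runAlternatingSteps();
    printf("%s\n", status == 0 ? "ok" : "failed");
    return status;
}
```

The host waits with `cudaDeviceSynchronize()` before each serial step because kernel launches return to the host immediately.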
CUDA execution model (continued)

   A grid runs on the SPA
   A block runs on an SM
   A thread runs on an SP
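Grid, block, thread
   A kernel is not a complete program, only one key parallel computation step within it.
   A kernel executes as a grid; each grid consists of a number of thread blocks, and each block consists of a number of threads.
   A grid can contain at most 65535 × 65535 blocks
   A block can contain at most 512 threads in total; the maximum sizes in its three dimensions are 512, 512, and 64 (these limits apply to GT200-class devices, compute capability 1.x)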
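
As a rough sketch of how these pieces fit together, the program below asks the runtime for the limits of device 0 instead of hard-coding them, then launches a one-dimensional grid in which each thread derives its global index from `blockIdx`, `blockDim`, and `threadIdx`. The names `writeGlobalIndex` and `runIndexDemo`, and the sizes 1000 and 256, are illustrative assumptions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread stores its own global index; threads past the end of the array do nothing.
__global__ void writeGlobalIndex(int *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = i;
}

// Returns 0 on success, 1 on failure.
int runIndexDemo()
{
    // Query the actual limits; newer devices report larger values than GT200.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) return 1;
    printf("max threads per block: %d\n", prop.maxThreadsPerBlock);
    printf("max block dimensions:  %d x %d x %d\n",
           prop.maxThreadsDim[0], prop.maxThreadsDim[1], prop.maxThreadsDim[2]);
    printf("max grid dimensions:   %d x %d x %d\n",
           prop.maxGridSize[0], prop.maxGridSize[1], prop.maxGridSize[2]);

    const int n = 1000;
    const int threads = 256;                          // within the 512-thread limit quoted above
    const int blocks = (n + threads - 1) / threads;   // round up so every element is covered

    int *d_out = NULL;
    if (cudaMalloc((void **)&d_out, n * sizeof(int)) != cudaSuccess) return 1;

    writeGlobalIndex<<<blocks, threads>>>(d_out, n);

    int h_out[n];
    cudaError_t err = cudaMemcpy(h_out, d_out, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    if (err != cudaSuccess) return 1;

    for (int i = 0; i < n; ++i)
        if (h_out[i] != i) return 1;
    return 0;
}

int main()
{
    int status = runIndexDemo();
    printf("%s\n", status == 0 ? "ok" : "failed");
    return status;
}
```

Because 1000 is not a multiple of 256, the last block contains idle threads, which is why the kernel guards the write with `i < n`.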
Memory model
   Register
   Local
   Shared
   Global
   Constant
   Texture
   Host memory
   Pinned host memory
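
The sketch below touches most of these memory spaces in one small program: registers, shared, global, constant, and pinned host memory. Texture memory is left out, and local memory appears only in a comment. The kernel `scaleAndReverseTiles`, the symbol `scale`, and the tile size are illustrative assumptions, and the input length is assumed to be a multiple of the tile size.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define TILE 256

// Error paths return early without cleanup to keep the sketch short.
#define CHECK(call) do { if ((call) != cudaSuccess) return 1; } while (0)

__constant__ float scale;   // constant memory: read-only inside kernels, written by the host

// in and out point to global memory.
__global__ void scaleAndReverseTiles(const float *in, float *out)
{
    __shared__ float tile[TILE];          // shared memory: one copy per block
    int t = threadIdx.x;                  // scalars such as t and base normally sit in registers;
    int base = blockIdx.x * blockDim.x;   // per-thread data that does not fit may go to local memory

    tile[t] = in[base + t];               // global -> shared
    __syncthreads();                      // wait until the whole tile is loaded
    out[base + t] = scale * tile[blockDim.x - 1 - t];   // shared -> global, reversed within the tile
}

// Returns 0 on success, 1 on failure.
int runMemorySpacesDemo()
{
    const int n = 4 * TILE;
    const size_t bytes = n * sizeof(float);
    float *h_in = NULL, *h_out = NULL, *d_in = NULL, *d_out = NULL;

    CHECK(cudaMallocHost((void **)&h_in, bytes));    // pinned (page-locked) host memory
    CHECK(cudaMallocHost((void **)&h_out, bytes));
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    const float h_scale = 2.0f;
    CHECK(cudaMemcpyToSymbol(scale, &h_scale, sizeof(float)));

    CHECK(cudaMalloc((void **)&d_in, bytes));        // global memory
    CHECK(cudaMalloc((void **)&d_out, bytes));
    CHECK(cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice));

    scaleAndReverseTiles<<<n / TILE, TILE>>>(d_in, d_out);
    CHECK(cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost));

    int ok = 1;
    for (int b = 0; b < n / TILE; ++b)
        for (int t = 0; t < TILE; ++t)
            if (h_out[b * TILE + t] != 2.0f * h_in[b * TILE + (TILE - 1 - t)]) ok = 0;

    cudaFree(d_in);
    cudaFree(d_out);
    cudaFreeHost(h_in);
    cudaFreeHost(h_out);
    return ok ? 0 : 1;
}

int main()
{
    int status = runMemorySpacesDemo();
    printf("%s\n", status == 0 ? "ok" : "failed");
    return status;
}
```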
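The CUDA C language
   Compiled by NVIDIA's CUDA compiler (nvcc)
   CUDA C is not standard C but a variant formed by extending C
   Adds function type qualifiers: __device__, __host__, and __global__
   Adds variable qualifiers: __device__, __shared__, and __constant__
   Adds built-in vector types: char1, dim3, double2, and others
   Adds built-in variables: blockIdx, threadIdx, gridDim, blockDim, and warpSize
   Adds the <<<>>> operator, which specifies the execution configuration of a kernel launch
   Adds a number of functions: synchronization functions, atomic functions, texture functions, and others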
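
A small program can exercise several of these extensions at once. The sketch below is one possible illustration, with hypothetical names (`total`, `lengthSquared`, `sumLengthSquared`, `runLanguageDemo`): it uses function and variable qualifiers, the `int2` and `dim3` vector types, the built-in index variables, `__syncthreads()`, `atomicAdd()`, and a `<<<>>>` launch. Texture functions are not shown.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ int total;   // variable qualifier: a variable in device global memory

// Function qualifiers: compiled for both host and device.
__host__ __device__ int lengthSquared(int2 p)
{
    return p.x * p.x + p.y * p.y;
}

// __global__ marks a kernel: called from the host, executed on the device.
__global__ void sumLengthSquared(const int2 *points, int n)
{
    __shared__ int blockSum;                       // one accumulator per block
    if (threadIdx.x == 0) blockSum = 0;
    __syncthreads();                               // synchronization function

    int i = blockIdx.x * blockDim.x + threadIdx.x; // built-in variables
    if (i < n) atomicAdd(&blockSum, lengthSquared(points[i]));   // atomic function
    __syncthreads();

    if (threadIdx.x == 0) atomicAdd(&total, blockSum);
}

// Returns 0 on success, 1 on failure.
int runLanguageDemo()
{
    const int n = 1000;
    int2 h_points[n];                              // built-in vector type
    int expected = 0;
    for (int i = 0; i < n; ++i) {
        h_points[i] = make_int2(i, 1);
        expected += lengthSquared(h_points[i]);    // same function, called on the host
    }

    int2 *d_points = NULL;
    if (cudaMalloc((void **)&d_points, n * sizeof(int2)) != cudaSuccess) return 1;
    cudaMemcpy(d_points, h_points, n * sizeof(int2), cudaMemcpyHostToDevice);

    int zero = 0;
    cudaMemcpyToSymbol(total, &zero, sizeof(int));

    dim3 block(256);                               // unspecified dimensions default to 1
    dim3 grid((n + block.x - 1) / block.x);
    sumLengthSquared<<<grid, block>>>(d_points, n);   // execution configuration

    int result = -1;
    cudaError_t err = cudaMemcpyFromSymbol(&result, total, sizeof(int));
    cudaFree(d_points);

    printf("device sum = %d, host sum = %d\n", result, expected);
    return (err == cudaSuccess && result == expected) ? 0 : 1;
}

int main()
{
    return runLanguageDemo();
}
```

A file like this is typically compiled with `nvcc`, for example `nvcc demo.cu -o demo`, where the file name is arbitrary.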
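Main tasks of the host-side code:
   Initialize CUDA
   Allocate host memory for the input data
   Initialize the input data
   Allocate device memory on the GPU for the input data
   Copy the input data from host memory to device memory
   Allocate device memory on the GPU for the output data
   Launch the kernel on the device
   Allocate host memory on the CPU for the output data
   Copy the results from device memory to host memory
   Do any further processing on the CPU
   Free host and device memory
   Shut down CUDA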
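
The steps above map onto CUDA runtime calls roughly as follows. This is a sketch under stated assumptions: the kernel `addOne`, the helper `runHostSteps`, and the array size are made up for illustration, and `cudaSetDevice(0)` and `cudaDeviceReset()` are used here as the closest explicit equivalents of "initialize" and "shut down", since the runtime otherwise initializes itself on first use.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Stop at the first failing CUDA call; cleanup on error paths is omitted for brevity.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_)); \
            return 1;                                                      \
        }                                                                  \
    } while (0)

__global__ void addOne(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] + 1.0f;
}

// Returns 0 on success, 1 on failure.
int runHostSteps()
{
    const int n = 1 << 16;
    const size_t bytes = n * sizeof(float);

    CHECK(cudaSetDevice(0));                                   // initialize CUDA

    float *h_in = (float *)malloc(bytes);                      // host memory for the input
    if (h_in == NULL) return 1;
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;            // initialize the input

    float *d_in = NULL, *d_out = NULL;
    CHECK(cudaMalloc((void **)&d_in, bytes));                  // device memory for the input
    CHECK(cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice));   // host -> device
    CHECK(cudaMalloc((void **)&d_out, bytes));                 // device memory for the output

    const int threads = 256;
    addOne<<<(n + threads - 1) / threads, threads>>>(d_in, d_out, n);   // launch the kernel
    CHECK(cudaGetLastError());                                 // catch launch errors

    float *h_out = (float *)malloc(bytes);                     // host memory for the output
    if (h_out == NULL) return 1;
    CHECK(cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost)); // device -> host

    int mismatches = 0;                                        // further processing on the CPU
    for (int i = 0; i < n; ++i)
        if (h_out[i] != h_in[i] + 1.0f) ++mismatches;

    free(h_in);                                                // free host and device memory
    free(h_out);
    CHECK(cudaFree(d_in));
    CHECK(cudaFree(d_out));

    CHECK(cudaDeviceReset());                                  // shut down CUDA
    return mismatches == 0 ? 0 : 1;
}

int main()
{
    int status = runHostSteps();
    printf("%s\n", status == 0 ? "ok" : "failed");
    return status;
}
```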
Main tasks of the device-side code

   Read data from device memory into the GPU
   Process the data
   Write the processed data back to device memory
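
Inside a kernel, these three steps can be as small as one load, one arithmetic operation, and one store. The sketch below labels each step; the kernel `squareInPlace` and the harness `runDeviceSteps` are hypothetical, and the harness assumes a CUDA 6 or later toolkit so that `cudaMallocManaged` keeps the host side out of the way.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void squareInPlace(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];   // 1. read from device memory into a register
        x = x * x;           // 2. process the data
        data[i] = x;         // 3. write the result back to device memory
    }
}

// Returns 0 on success, 1 on failure.
int runDeviceSteps()
{
    const int n = 1024;
    float *data = NULL;
    // Assumption: managed memory, accessible from both host and device.
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) return 1;
    for (int i = 0; i < n; ++i) data[i] = (float)i;

    const int threads = 256;
    squareInPlace<<<(n + threads - 1) / threads, threads>>>(data, n);
    if (cudaDeviceSynchronize() != cudaSuccess) { cudaFree(data); return 1; }

    int ok = 1;
    for (int i = 0; i < n; ++i)
        if (data[i] != (float)i * (float)i) ok = 0;

    cudaFree(data);
    return ok ? 0 : 1;
}

int main()
{
    int status = runDeviceSteps();
    printf("%s\n", status == 0 ? "ok" : "failed");
    return status;
}
```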
