IAP09 CUDA@MIT 6.963 - Lecture 01: Logistics (Nicolas Pinto, MIT)

This document provides an overview of a course on programming massively parallel processors using CUDA. The course goals are to teach students how to achieve high performance, functionality, maintainability and scalability when programming parallel processors. Students will learn principles and patterns of parallel programming, processor architecture and programming tools and techniques. The course will be taught at MIT by Nicolas Pinto and feature guest lectures from NVIDIA researchers on GPU computing and scientific applications. Students will complete assignments, work in teams on projects and have access to over 30 GPUs and other hardware resources.

IAP09 CUDA@MIT / 6.963

Supercomputing on your desktop:
Programming the next generation of cheap
and massively parallel hardware using CUDA

Lecture 01
Nicolas Pinto (MIT)

Kick-off session

Still doing your
computations the old way?

Fresh New
Technology
Available NOW!

6.963 (IAP09)

Course Goals
• Learn how to program massively parallel
processors and achieve
–high performance
–functionality and maintainability
–scalability across future generations
• Acquire technical knowledge required to
achieve the above goals
–principles and patterns of parallel programming
–processor architecture features and constraints
–programming API, tools and techniques
© David Kirk/NVIDIA and Wen-mei W. Hwu, 2007
ECE 498AL1, University of Illinois, Urbana-Champaign
(adapted for 6.963)

Class logistics
Teaching Staff (MIT)

GPU Computing with CUDA
David Luebke (NVIDIA)
CUDA Demos
Marc Adams (NVIDIA)

High-Throughput Scientific Computing
Hanspeter Pfister (Harvard)

Teaching Staff

Faculty: Prof. Steven G. Johnson

TAs: Justin Riley and Nicolas Poilvert

Instructor: Nicolas Pinto
Contact: pinto@mit.edu

Schedule
Lectures: M/W/F 10-12 (#32-155)
Hands-on: M/W/F 2-5 (#32-141)

Project Hours: T/R 2-5 (#3-370)

• CUDA Basics
• CUDA Advanced
• Theory
• Case Studies
• Projects

Hardware

$70,000+
from NVIDIA, Rowland/Harvard and MIT
(OEIT, DiCarlo Lab, Graphics CSAIL, EECS)

The Project

Project Presentations
@the_end_of_the_course
MIT
6.963

TO DO
1) Discussion Group
2) Team Project
3) Assignments
4) Enjoy!
Contact: pinto@mit.edu
