Triston Cao, for Alluxio Meetup on May 23, 2024
PERSPECTIVE ON DEEP LEARNING FRAMEWORK
COMPUTATION GRAPH AND GRADIENT DESCENT
Image credit to Deniz Yuret's Homepage: Alec Radford's animations for optimization algorithms
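A minimal sketch of the idea on this slide, assuming a scalar-only graph: each operation records its inputs and local derivatives, `backward()` walks the graph in reverse, and gradient descent uses the resulting gradient. The `Value` class and `fit` function are hypothetical helpers for illustration, not any framework's API.

```python
class Value:
    """A scalar node in a computation graph (illustrative helper only)."""

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents      # nodes this value was computed from
        self.grad_fns = grad_fns    # maps upstream grad to each parent's grad

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def __sub__(self, other):
        return Value(self.data - other.data, (self, other),
                     (lambda g: g, lambda g: -g))

    def backward(self):
        # Build a topological order, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(node):
            if id(node) in seen:
                return
            seen.add(id(node))
            for parent in node.parents:
                visit(parent)
            order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, fn in zip(node.parents, node.grad_fns):
                parent.grad += fn(node.grad)


def fit(x, y, lr=0.1, steps=50):
    """Minimize (w * x - y)^2 over w with plain gradient descent."""
    w = 0.0
    for _ in range(steps):
        w_node = Value(w)
        err = w_node * Value(x) - Value(y)
        loss = err * err
        loss.backward()
        w -= lr * w_node.grad       # step against the gradient
    return w


w_fit = fit(2.0, 6.0)               # the minimizer is w = y / x
```

Real frameworks apply the same pattern to tensors rather than scalars and dispatch each operation to optimized kernels.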
OPEN-SOURCE FRAMEWORKS
[Figure: timeline of open-source frameworks, 2014 to 2024, marking AlexNet, ResNet, Transformer, and ChatGPT]
WHAT DOES A FRAMEWORK LOOK LIKE
A Hybrid Programming Language Environment
NVIDIA NGC CONTAINERS
OPS, TENSORS, AND PARALLEL EXECUTION
System Level Optimization
https://mxnet.apache.org/versions/1.9.1/api/architecture/note_engine
https://www.oreilly.com/library/view/elegant-scipy/9781491922927/ch01.html
CONVOLUTIONS
https://paperswithcode.com/methods/category/convolutional-neural-networks https://cv.gluon.ai/contents.html
https://epynn.net/Convolution.html
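As a reference point for what the optimized kernels compute, here is a naive sketch of a single-channel 2D convolution, assuming stride 1 and no padding. Like most deep learning frameworks, it computes cross-correlation (the kernel is not flipped). The function name `conv2d` is illustrative only.

```python
def conv2d(image, kernel):
    """Naive single-channel 2D convolution (cross-correlation), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            # Slide the kernel window over the image and accumulate products.
            out[i][j] = sum(image[i + di][j + dj] * kernel[di][dj]
                            for di in range(kh)
                            for dj in range(kw))
    return out


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d(image, kernel)
```

Libraries such as cuDNN replace these nested loops with algorithms tuned for the hardware; the loop form is only meant to pin down the arithmetic.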
CUDNN 10TH ANNIVERSARY
April 2014 – April 2024
[Figure: software stack with Deep Learning Frameworks, CPU Libraries, TensorRT, NCCL, DALI, cuDNN, cuBLAS, …, and CUDA]
SYMBOLIC VS EAGER (IMPERATIVE)
Performance vs Ease of Use
Imperative
Static graph
Eager mode JIT
Hybrid
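A toy sketch of the two styles, assuming only scalar add and multiply. In eager mode each operation runs immediately; in symbolic mode the same calls only record a graph, which is executed later once inputs are fed. `Node`, `placeholder`, `add`, `mul`, and `run` are hypothetical names, not a framework API.

```python
# Eager (imperative): every operation executes as soon as it is called.
def eager(a, b, c):
    t = a + b
    return t * c


# Symbolic (static graph): operations only record nodes; nothing runs yet.
class Node:
    def __init__(self, op, inputs=(), name=None):
        self.op, self.inputs, self.name = op, inputs, name


def placeholder(name):
    return Node("input", name=name)


def add(x, y):
    return Node("add", (x, y))


def mul(x, y):
    return Node("mul", (x, y))


def run(node, feed):
    """Evaluate the recorded graph once concrete inputs are supplied."""
    if node.op == "input":
        return feed[node.name]
    left, right = (run(i, feed) for i in node.inputs)
    return left + right if node.op == "add" else left * right


a, b, c = placeholder("a"), placeholder("b"), placeholder("c")
graph = mul(add(a, b), c)           # builds the graph, computes nothing
eager_result = eager(1.0, 2.0, 3.0)
graph_result = run(graph, {"a": 1.0, "b": 2.0, "c": 3.0})
```

Because the whole graph is visible before execution, a symbolic system can optimize it as a unit, while the eager version is easier to debug line by line. JIT and hybrid modes aim to get some of both.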
TENSOR CORES AND MIXED PRECISION
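A small sketch of why mixed precision training needs care, assuming the usual recipe of loss scaling plus higher-precision master weights (the slide text itself does not spell this out). It uses only the standard `struct` module to round values to IEEE half precision; Python floats stand in for the higher-precision copy.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


# A small gradient falls below FP16's smallest subnormal and becomes zero.
tiny_grad = 1e-8
underflowed = to_fp16(tiny_grad)

# Loss scaling: scale up before the FP16 cast, unscale in higher precision.
loss_scale = 1024.0
scaled = to_fp16(tiny_grad * loss_scale)
recovered = scaled / loss_scale

# A small update is lost when added to a weight stored in FP16,
# which is why a higher-precision master copy of the weights is kept.
stalled_weight = to_fp16(1.0 + 1e-4)
```

Here `underflowed` is zero while `recovered` stays close to the original gradient, and `stalled_weight` is still exactly 1.0 because the update is smaller than FP16's spacing near 1.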
MORE TENSOR CORES
DATA LAYOUT MATTERS TOO
Reference: Convolutional Layers User's Guide - NVIDIA Docs
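To make the layout difference concrete, here is a sketch of how the same logical element (n, c, h, w) maps to a flat memory offset under NCHW and NHWC. The function names are illustrative, and the shapes are arbitrary assumptions.

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat offset in NCHW: the width index varies fastest."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, c, h, w, C, H, W):
    """Flat offset in NHWC: the channel index varies fastest."""
    return ((n * H + h) * W + w) * C + c


C, H, W = 3, 4, 5

# Distance in memory between channel 0 and channel 1 of the same pixel.
channel_stride_nchw = offset_nchw(0, 1, 0, 0, C, H, W) - offset_nchw(0, 0, 0, 0, C, H, W)
channel_stride_nhwc = offset_nhwc(0, 1, 0, 0, C, H, W) - offset_nhwc(0, 0, 0, 0, C, H, W)
```

In NHWC all channels of one pixel are contiguous (stride 1), whereas in NCHW they are H * W elements apart. Kernels that reduce over channels read memory very differently in the two cases, which is why the layout choice can affect performance.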
NVIDIA DALI
NCCL FOR MULTI-NODE TRAINING
https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/usage/operations.html
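A pure-Python simulation of the all-reduce collective used to sum gradients across workers, written as a ring schedule (reduce-scatter followed by all-gather). This is a sketch of one well-known scheme, not the NCCL API, and it assumes each rank's buffer has exactly one element per rank to keep the chunking trivial.

```python
def ring_all_reduce(buffers):
    """Simulate a ring all-reduce (sum). Assumes len(buffer) == number of ranks."""
    n = len(buffers)
    data = [list(b) for b in buffers]

    # Reduce-scatter: after n - 1 steps, rank r holds the full sum of chunk (r + 1) % n.
    for step in range(n - 1):
        # Collect all messages first so every rank sends "simultaneously".
        msgs = [(r, (r - step) % n, data[r][(r - step) % n]) for r in range(n)]
        for r, chunk, value in msgs:
            data[(r + 1) % n][chunk] += value

    # All-gather: circulate the finished chunks around the ring.
    for step in range(n - 1):
        msgs = [(r, (r + 1 - step) % n, data[r][(r + 1 - step) % n]) for r in range(n)]
        for r, chunk, value in msgs:
            data[(r + 1) % n][chunk] = value

    return data


grads = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
reduced = ring_all_reduce(grads)
```

After the call every rank holds the same element-wise sum. In data-parallel training each worker would then divide by the number of ranks to get the averaged gradient.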
https://www.nvidia.com/en-us/data-center/resources/mlperf-benchmarks/
TRAINING VS INFERENCE
FRAMEWORK + TENSORRT FOR INFERENCE
INFERENCE WITH INT8
Ref: Achieving FP32 Accuracy for INT8 Inference Using Quantization Aware Training with NVIDIA TensorRT | NVIDIA Technical Blog
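A minimal sketch of the arithmetic behind INT8 inference, assuming symmetric per-tensor quantization with a scale taken from the largest absolute value. Quantize-then-dequantize, as done here, is also the "fake quantization" step that quantization aware training inserts into the forward pass. The function names are illustrative, not TensorRT's API.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to the range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map INT8 codes back to approximate real values."""
    return [x * scale for x in q]


weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
codes, scale = quantize_int8(weights)
approx = dequantize(codes, scale)
max_error = max(abs(w - a) for w, a in zip(weights, approx))
```

The rounding error per value is bounded by half the scale, so tensors with a few large outliers lose more precision. That is one reason calibration or quantization aware training is used to pick better ranges.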
COMPILER BASED FRAMEWORK
https://tvm.apache.org/docs/tutorial/relay_quick_start.html
https://www.linkedin.com/pulse/exploring-jax-googles-high-performance-python-library-nagilla-hwauc/
Thunder can optimize PyTorch modules with
• torch.compile
• nvFuser
• cuDNN
• Apex
• TransformerEngine
• PyTorch eager
• Custom CUDA kernels through PyCUDA, Numba, CuPy
• Custom kernels written in OpenAI Triton
https://github.com/Lightning-AI/lightning-thunder
TAKEAWAYS
• Deep learning frameworks are large software projects
• NVIDIA keeps building libraries to serve deep learning frameworks with GPU acceleration
• Training and inference have different challenges
• Frameworks are more stable now but still evolving fast
• Compiler technology is becoming more integrated into frameworks
AI/ML Infra Meetup | Perspective on Deep Learning Framework