The Universal GPU system architecture combines the latest technologies supporting multiple GPU form factors, CPU choices, and storage and networking options. Together, these components are optimized to deliver high performance in a balanced, highly scalable architecture. Systems can be optimized for each customer's specific Artificial Intelligence (AI), Machine Learning (ML), or High Performance Computing (HPC) applications. Organizations worldwide are demanding new options for their future computing environments, with the thermal headroom for the next generation of CPUs and GPUs.
Join this webinar to learn how to leverage Supermicro's Universal GPU system to simplify customer deployments and deliver ultimate modularity and customization options for environments ranging from AI to Omniverse.
4. Confidential
AI Market Projection
• $13 trillion overall market size, according to McKinsey
• The AI market (USA) will expand at a Compound Annual Growth Rate (CAGR) of 40.2% from 2021 to 2028.
• 83% of companies report that having access to AI is a top priority in their business plans.
9.
Summary: Universal GPU Server
• The most optimized and flexible GPU server platform available today
o CPU motherboard support
• AMD H12 (Milan)
• Intel X12 (Ice Lake)
o GPU support
• NVIDIA Redstone with GPU-to-GPU NVLink
• AMD Instinct MI250 with Infinity Fabric (xGMI)
• Traditional PCIe form factor GPUs
• Modular design for flexibility
• Improved thermal capability
o Supports up to 500W/700W GPUs, 280W AMD CPUs, and 350W/400W Intel CPUs
• 1U expansion module available for all 4U servers
[Diagram: supported GPU form factors — UBB/OAM (Intel PVC, NVIDIA Redstone, AMD MI250) and PCIe]
11.
Universal Design and AMD Instinct MI250 OAM
• Significant HPC performance increase over the competition
• Also well suited to AI/ML workloads
• 128GB HBM2e ECC memory per OAM
• GPU-to-GPU xGMI Infinity Fabric, 2.5TB/s
15.
Driving Innovation and Discovery with AMD Instinct™ Accelerators on the ROCm™ Stack
Martin Huarte, Ph.D.
Developer Relations Manager, martin.huarte@amd.com
16.
[AMD Official Use Only]
[Diagram: the ROCm open software ecosystem — Open APIs, Open Libraries, Compilers, Developer Tools, Kernel/Runtime, HPC Frameworks, ML Frameworks, ISV Apps, Open-Source Codes, Operating Systems, Deployment Tools, Management Tools]
17.
The ROCm stack, by layer:
• Drivers/Runtimes: RedHat, CentOS, SLES & Ubuntu device drivers and run-time
• Programming models: OpenMP API, HIP API, OpenCL™
• Libraries: BLAS, FFT, RAND, SPARSE, SOLVER, TENSILE, ALUTION, THRUST, PRIM, MIOpen, MIVisionX, RCCL, MIGraphX
• Compilers & Tools: Compiler, Debugger, Profiler, Tracer, hipify
• Deployment Tools: ROCm Validation Suite, ROCm Data Center Tool, ROCm SMI
18.
AMD Infinity Hub: Containerized HPC Apps and ML Frameworks
• Purpose-built accelerators for HPC and AI workloads
• Full range of leading OEMs/ODMs supplying AMD accelerated systems to HPC and AI market segments
• Open software platform for developers to build HPC applications on AMD accelerators
• Single location for researchers and data scientists to download containerized HPC apps and ML frameworks
• Compilers, libraries, dev tools, APIs, kernels/runtimes
• Validated, optimized systems & platforms
19.
DRIVING MAINSTREAM ADOPTION & ECOSYSTEM ENABLEMENT
• EXPANDED: support for AMD Instinct™ MI200 & AMD Radeon™ PRO W6800 GPUs
• OPTIMIZED: compiler & library optimizations for HPC & AI/ML
• ENABLING: new ROCm documentation portal & improved debug tools
20.
Re-architected ROCm Documentation
• Support guides
• Installation & deployment guides
• API / SDK documentation
• Access to the ROCm Learning Center: GPU programming tutorials, videos, and labs
https://docs.amd.com/
Canned questions:
Is HIP a drop-in replacement for CUDA?
No. HIP provides porting tools that do most of the work of converting CUDA code into portable C++ code that uses the HIP APIs. Most developers port their code from CUDA to HIP once and then maintain the HIP version. HIP code provides the same performance as native CUDA code, plus the benefit of running on AMD platforms.
What APIs and features does HIP support?
HIP provides the following:
• Devices (hipSetDevice(), hipGetDeviceProperties())
• Memory management (hipMalloc(), hipMemcpy(), hipFree())
• Streams (hipStreamCreate(), hipStreamSynchronize(), hipStreamWaitEvent())
• Events (hipEventRecord(), hipEventElapsedTime())
• Kernel launching (hipLaunchKernel is a standard C/C++ function that replaces <<< >>>)
• HIP Module API to control when and how code is loaded
• CUDA-style kernel coordinate functions (threadIdx, blockIdx, blockDim, gridDim)
• Cross-lane instructions including shfl, ballot, any, all
• Most device-side math built-ins
• Error reporting (hipGetLastError(), hipGetErrorString())
The HIP API documentation describes each API and its limitations, if any, compared with the equivalent CUDA API.
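Taken together, those APIs are enough for a complete program. A minimal vector-add sketch using the hipLaunchKernelGGL launch macro — illustrative only, since it requires the ROCm/HIP toolchain (compile with hipcc) and an AMD or NVIDIA GPU:

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    // CUDA-style kernel coordinate functions work unchanged in HIP.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1024;
    float ha[n], hb[n], hc[n];
    for (int i = 0; i < n; ++i) { ha[i] = float(i); hb[i] = 2.0f * i; }

    // Memory management: hipMalloc / hipMemcpy / hipFree
    float *da, *db, *dc;
    hipMalloc(&da, n * sizeof(float));
    hipMalloc(&db, n * sizeof(float));
    hipMalloc(&dc, n * sizeof(float));
    hipMemcpy(da, ha, n * sizeof(float), hipMemcpyHostToDevice);
    hipMemcpy(db, hb, n * sizeof(float), hipMemcpyHostToDevice);

    // Launch: 256 threads per block, enough blocks to cover n elements.
    hipLaunchKernelGGL(vector_add, dim3((n + 255) / 256), dim3(256), 0, 0,
                       da, db, dc, n);

    // Error reporting: hipGetLastError / hipGetErrorString
    printf("launch status: %s\n", hipGetErrorString(hipGetLastError()));

    hipMemcpy(hc, dc, n * sizeof(float), hipMemcpyDeviceToHost);
    hipFree(da); hipFree(db); hipFree(dc);
    return 0;
}
```

The same source also builds for NVIDIA GPUs, where the HIP calls compile down to the corresponding CUDA runtime calls.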
https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-FAQ.html#what-apis-and-features-does-hip-support