Jeremy Main, Senior Solution Architect, GRID
GRID Technical Session
vGPU Top10 PoC Survival Tips
Most Common Mistakes During POCs
Not defining PoC success criteria with stakeholders
Define measurable metrics
Use actual applications and data
Don’t use GPU-centric benchmarks to simulate multiple users
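One way to make success criteria measurable is to record each one as a named threshold and check measured values against it. The sketch below is illustrative only: the metric names and target values are hypothetical and would be agreed with stakeholders, using real applications and data.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str               # hypothetical metric name
    target: float           # threshold agreed with stakeholders
    higher_is_better: bool  # True for e.g. frame rate, False for e.g. load time

    def passed(self, measured: float) -> bool:
        if self.higher_is_better:
            return measured >= self.target
        return measured <= self.target


def evaluate(criteria, measurements):
    """Return {metric name: passed}; a missing measurement counts as a failure."""
    results = {}
    for criterion in criteria:
        value = measurements.get(criterion.name)
        results[criterion.name] = value is not None and criterion.passed(value)
    return results


# Hypothetical criteria for a CAD workload
criteria = [
    Criterion("model_open_seconds", 30.0, higher_is_better=False),
    Criterion("viewport_fps", 25.0, higher_is_better=True),
]
```

Calling `evaluate(criteria, measured_values)` at the end of each test run gives a pass/fail record per metric that can be shared with stakeholders.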
Most Common Mistakes During POCs
Attempting to add the PoC to existing IT infrastructure
Use an isolated and controlled environment
Retain the PoC environment for tuning and troubleshooting after deployment
Set up a gateway for license server access if required
Most Common Mistakes During POCs
Not understanding application resource requirements
During typical user workloads, what is the performance-limiting factor?
Application is CPU or memory-bound?
GPU frame buffer or rendering-bound?
Perfmon on existing workstations : “NVIDIA_GPU” counters
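Once utilization counters have been logged on an existing workstation, the limiting factor can be estimated by comparing average utilization per resource. The following is a minimal sketch, assuming the counters have already been exported as percentage samples; the resource names and the 85% threshold are assumptions, not actual Perfmon counter names.

```python
def limiting_factor(samples, threshold=85.0):
    """Return the resource with the highest average utilization (in percent),
    or None if no resource reaches the threshold.

    samples maps a resource name to a list of utilization samples.
    """
    averages = {
        name: sum(values) / len(values)
        for name, values in samples.items()
        if values  # skip resources with no samples
    }
    if not averages:
        return None
    name, value = max(averages.items(), key=lambda item: item[1])
    return name if value >= threshold else None


# Hypothetical samples captured during a typical user workload
workload = {
    "cpu": [95.0, 97.0, 99.0],
    "gpu": [40.0, 35.0, 45.0],
    "frame_buffer": [60.0, 62.0, 61.0],
}
```

With these sample values the CPU averages 97%, so the workload would be treated as CPU-bound rather than GPU-bound.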
Most Common Mistakes During POCs
Not using all available sources of information
NVIDIA deployment guides, application sizing guides
Citrix and VMware reviewer guides and best practices
Most Common Mistakes During POCs
Attempting to use non-GRID-certified servers
There are many versions of GRID / Tesla cards
Not every card works in every server
NVIDIA GRID™ Certified Platforms
UCS C240 M3, M4
UCS C460 M4
PowerEdge R720, R730, T620, T630
PowerEdge C4130, VRTX
PowerEdge C8220X GPU Sled
Precision R7610, Rack 9710
Celsius C620, R940, M740
Primergy CX400M1, RX2540M1, RX350S8, TX300S8
ProLiant WS460c Gen8
ProLiant DL380p Gen8 and Gen9, DL580 Gen8
ProLiant SL250s Gen8, SL270s Gen8 SE
iDataPlex dx360 M4
NeXtScale nx360 M4, M5
Flex System
ThinkStation D30
System x3650M4/M5, x3850X6, x3950X6
For more information on GRID-enabled servers, visit www.nvidia.com/buygrid
Most Common Mistakes During POCs
Optimal CPUs for the workload are not used
Most CAD applications are largely single-threaded
Focus on higher CPU frequency, not number of cores
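Amdahl's law shows why extra cores help little when an application is mostly single-threaded. The sketch below is a worked illustration: the parallel fraction of 10% is an assumed figure, not a measurement of any particular CAD application.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup over one core when only parallel_fraction
    of the work can be spread across the given number of cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumption: 10% of the workload is parallelizable.
speedup_8_cores = amdahl_speedup(0.10, 8)    # about 1.10x
speedup_16_cores = amdahl_speedup(0.10, 16)  # about 1.10x, almost no further gain
```

Under this assumption, going from 8 to 16 cores changes the speedup only in the second decimal place, while a higher clock speed benefits the serial 90% of the work directly.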
Most Common Mistakes During POCs
BIOS power profile is set incorrectly
Set power profile to “Maximum Performance”
Ensure CPUs can reach their highest clock speeds
Most Common Mistakes During POCs
Servers don’t have enough memory
Memory overcommit does not work with vGPU
4GB : Power User, Entry Level Engineering
8GB : Mid-range Engineering, Video
16GB : Advanced Engineering
32GB : CAD/CAM
64GB : Digital Mock Up
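Because memory cannot be overcommitted with vGPU, the number of VMs per host is bounded by physical RAM. The sketch below works through that arithmetic using the per-profile sizes listed above; the host RAM size and the amount reserved for the hypervisor are assumptions chosen for illustration.

```python
# Per-VM memory in GB, taken from the sizing list above
VM_MEMORY_GB = {
    "power_user": 4,
    "mid_range_engineering": 8,
    "advanced_engineering": 16,
    "cad_cam": 32,
    "digital_mock_up": 64,
}


def max_vms_per_host(host_ram_gb, profile, hypervisor_reserve_gb=16):
    """VMs of one profile that fit in physical RAM with no overcommit.

    hypervisor_reserve_gb is an assumed allowance for the hypervisor itself.
    """
    usable_gb = host_ram_gb - hypervisor_reserve_gb
    if usable_gb <= 0:
        return 0
    return usable_gb // VM_MEMORY_GB[profile]


# Hypothetical 256 GB host
advanced_vms = max_vms_per_host(256, "advanced_engineering")  # 240 // 16
cad_cam_vms = max_vms_per_host(256, "cad_cam")                # 240 // 32
```

If the result is lower than the number of users the GPU cards could otherwise serve, memory rather than GPU capacity is what limits density on that host.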
Most Common Mistakes During POCs
Insufficient storage IOPS
Workstation-class users expect SSD performance, since they already have SSDs in their local workstations
Most Common Mistakes During POCs
Inadequate network environment for VDI
Don’t use a legacy network adapter type in the VM: prefer VMXNET3
Confirm network’s ability to deliver enough bandwidth
“iperf” can be used to simulate single and parallel TCP/UDP network streams to confirm that the required bandwidth is available
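As a minimal sketch of how such a test might be scripted, the helper below assembles an iperf client command line for single or parallel TCP or UDP streams. It uses the common iperf client options `-c`, `-P`, `-t`, `-u`, and `-b`; the host name, default values, and the assumption that an iperf server (`iperf -s`) is already running on the target are illustrative only.

```python
def iperf_client_command(server, streams=1, seconds=30, udp=False,
                         udp_bandwidth="100M", executable="iperf"):
    """Build an iperf client command as an argument list.

    The list can be passed to subprocess.run() on a machine where iperf
    is installed and a server is listening on the target host.
    """
    command = [executable, "-c", server, "-P", str(streams), "-t", str(seconds)]
    if udp:
        # UDP tests need an explicit target bandwidth
        command += ["-u", "-b", udp_bandwidth]
    return command


# Hypothetical target host inside the PoC network
single_tcp = iperf_client_command("poc-host.example", streams=1)
parallel_udp = iperf_client_command("poc-host.example", streams=8, udp=True)
```

Running the parallel variant with one stream per expected concurrent user gives a rough indication of whether the link can sustain the combined remoting traffic.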
Most Common Mistakes During POCs
Not enough vCPUs assigned to a VM
Assign at least 4 vCPUs to a vGPU-enabled VM
Two vCPUs for application
One vCPU for OS and system-calls
One vCPU for remoting protocol compression
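The per-VM breakdown above translates directly into a host-level vCPU budget. The sketch below adds up the allocation and reports the resulting ratio of vCPUs to physical cores; the VM count and core count are hypothetical, and what ratio is acceptable depends on the workload.

```python
# Minimum vCPU allocation per vGPU-enabled VM, from the breakdown above
VCPU_ALLOCATION = {
    "application": 2,
    "os_and_system_calls": 1,
    "remoting_protocol_compression": 1,
}
VCPUS_PER_VM = sum(VCPU_ALLOCATION.values())  # 4


def total_vcpus(vm_count, vcpus_per_vm=VCPUS_PER_VM):
    """Total vCPUs required for vm_count VMs."""
    return vm_count * vcpus_per_vm


def vcpu_to_core_ratio(vm_count, physical_cores, vcpus_per_vm=VCPUS_PER_VM):
    """Ratio of assigned vCPUs to physical cores on the host."""
    return total_vcpus(vm_count, vcpus_per_vm) / physical_cores


# Hypothetical host: 16 VMs on 2 sockets x 12 cores
ratio = vcpu_to_core_ratio(16, physical_cores=24)
```

Tracking this ratio during the PoC makes it easier to tell whether a slowdown comes from too few vCPUs per VM or from too many VMs sharing the same physical cores.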
Most Common Mistakes During POCs
Not optimizing virtual machine base image
Eliminate OS-level performance inhibitors
Citrix : “TargetOSOptimizer” tool
VMware : “VMwareOSOptimizationTool”
Resources on www.nvidia.com/grid
• White papers
• Application guides
• Deployment guides
• Success stories
• GRID 2.0 Datasheet and FAQ
• Videos
• Blogs
1102: GRID Technical Session 3: vGPU Top10 PoC Survival Tips