This is a presentation I gave at the NVIDIA AI Conference in Korea. It's about building the largest GPU, the DGX-2, the most powerful supercomputer in a single node.
Modular by Design: Supermicro’s New Standards-Based Universal GPU Server - Rebekah Rodriguez
In this webinar, members of the Server Solution Team, along with a member of Supermicro’s Product Office, will discuss Supermicro’s Universal GPU Server: the server’s modular, standards-based design, the important roles of the OCP Accelerator Module (OAM) form factor and the Universal Baseboard (UBB) in the system, as well as AMD's next-generation HPC accelerator. In addition, we will get insights into trends in the HPC and AI/machine learning space, including the software platforms and best practices that are driving innovation in our industry and daily lives. In particular:
• Tools that enable use of high-performance hardware for HPC and deep learning applications
• Tools that enable use of multiple GPUs, including RDMA, to solve highly demanding HPC and deep learning models such as BERT
• Running applications in containers with AMD’s next-generation GPU system
Supermicro’s Universal GPU: Modular, Standards Based and Built for the Future - Rebekah Rodriguez
The Universal GPU system architecture combines the latest technologies that support multiple GPU form factors, CPU choices, storage, and networking options. Together, these components are optimized to deliver high performance in a balanced architecture in a highly scalable system. Systems can be optimized for each customer’s specific Artificial Intelligence (AI), Machine Learning (ML), or High Performance Computing (HPC) applications. Organizations worldwide are demanding new options for their future computing environments, ones with the thermal headroom for the next generation of CPUs and GPUs.
Join this webinar to learn how to leverage Supermicro's Universal GPU system to simplify customer deployments and deliver ultimate modularity and customization options, from AI to Omniverse environments.
Join us for an exciting and informative preview of the broadest range of next-generation systems optimized for tomorrow’s data center workloads, powered by 4th Gen Intel® Xeon® Scalable Processors (formerly codenamed Sapphire Rapids).
Experts from Supermicro and Intel will discuss how the upcoming Supermicro X13 systems will enable new performance levels utilizing state-of-the-art technology, including DDR5, PCIe 5.0, Compute Express Link™ 1.1, and Intel® Advanced Matrix Extensions (Intel AMX).
Supermicro AI Pod that’s Super Simple, Super Scalable, and Super Affordable - Rebekah Rodriguez
The worlds of HPC and AI are evolving at a tremendous rate. The demands of modern-day applications put immense pressure on local IT teams and resources. More often than not, this pressure comes from needing an AI strategy to speed up mission-critical applications - but that can come at a cost that hinders adoption. In this webinar, Supermicro, together with International Computer Concepts (ICC) and Define Tech, will demonstrate their AI Super Pod that delivers on AI strategy needs without breaking the bank.
The Power of HPC with Next Generation Supermicro Systems - Rebekah Rodriguez
Witness the astonishing improvement in performance and security with the new generation of Supermicro platforms. New Supermicro systems deliver unprecedented levels of compute power for the most challenging high-performance workloads. In this Supercomputing roundtable, learn how the new Supermicro products provide a differentiated advantage for early adopters of the most advanced accelerated computing infrastructure in the world.
Supermicro Servers with Micron DDR5 & SSDs: Accelerating Real World Workloads - Rebekah Rodriguez
With the recent announcements from Intel and Supermicro, we are seeing a number of new systems that support exciting new technologies and provide a scalable foundation for the data center of the future. These systems also deliver significant benefits for today's real-world problems and help optimize existing HPC and business applications.
Realizing these benefits demands innovations in performance, cost management, and integration. It is not one company's problem to solve but an ecosystem that can collaborate and configure the right solutions to meet today's needs.
Please join Supermicro and Micron to learn about their close collaboration, the roles of Supermicro X13 systems and DDR5 memory in meeting these requirements, and the results we are seeing today.
New Accelerated Compute Infrastructure Solutions from Supermicro - Rebekah Rodriguez
Join us for a special edition of Supermicro’s TECHTalk as we introduce Supermicro’s new accelerated compute infrastructure solutions. A number of Supermicro experts will share insights and updates on one of the industry’s broadest portfolios of NVIDIA-Certified GPU systems, which deliver new levels of performance for AI infrastructure with the new H100 Tensor Core GPUs.
In this deck from the UK HPC Conference, Gunter Roeth from NVIDIA presents: Hardware & Software Platforms for HPC, AI and ML.
"Data is driving the transformation of industries around the world and a new generation of AI applications are effectively becoming programs that write software, powered by data, vs by computer programmers. Today, NVIDIA’s tensor core GPU sits at the core of most AI, ML and HPC applications, and NVIDIA software surrounds every level of such a modern application, from CUDA and libraries like cuDNN and NCCL embedded in every deep learning framework and optimized and delivered via the NVIDIA GPU Cloud to reference architectures designed to streamline the deployment of large scale infrastructures."
Watch the video: https://wp.me/p3RLHQ-l2Y
Learn more: http://nvidia.com
and
http://hpcadvisorycouncil.com/events/2019/uk-conference/agenda.php
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS... - Databricks
GPU acceleration has been at the heart of scientific computing and artificial intelligence for many years now. GPUs provide the computational power needed for the most demanding applications, such as deep neural networks and nuclear or weather simulation. Since the launch of RAPIDS in mid-2018, this vast computational resource has become available for data science workloads too. The RAPIDS toolkit, which is now available on the Databricks Unified Analytics Platform, is a GPU-accelerated drop-in replacement for utilities such as pandas, NumPy, scikit-learn, and XGBoost. Through its use of Dask wrappers, the platform allows for true large-scale computation with minimal, if any, code changes.
The goal of this talk is to discuss RAPIDS, its functionality and architecture, as well as the way it integrates with Spark, in many cases providing several orders of magnitude of acceleration over its CPU-only counterparts.
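Because the RAPIDS cuDF DataFrame mirrors the pandas API, the "drop-in" claim above can be sketched with a hypothetical snippet in which only the import differs between the GPU and CPU paths (the fallback logic is an illustration, not the talk's code):

```python
# RAPIDS cuDF mirrors the pandas API, so the same workload can fall back
# to pandas when no GPU stack is installed -- illustrating "drop-in".
try:
    import cudf as xdf  # GPU-accelerated DataFrame (requires an NVIDIA GPU)
except ImportError:
    import pandas as xdf  # CPU fallback with the same API

df = xdf.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})
totals = df.groupby("key").sum()  # identical call on either backend
print(int(totals.loc["a", "val"]))  # 4
```

The rest of the pipeline (groupby, join, aggregation code) stays unchanged; only the import line selects the backend.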
Outlining a sweeping vision for the “age of AI,” NVIDIA CEO Jensen Huang Monday kicked off the GPU Technology Conference.
Huang made major announcements in data centers, edge AI, collaboration tools and healthcare in a talk simultaneously released in nine episodes, each under 10 minutes.
“AI requires a whole reinvention of computing – full-stack rethinking – from chips, to systems, algorithms, tools, the ecosystem,” Huang said, standing in front of the stove of his Silicon Valley home.
Behind a series of announcements touching on everything from healthcare to robotics to videoconferencing, Huang’s underlying story was simple: AI is changing everything, which has put NVIDIA at the intersection of changes that touch every facet of modern life.
More and more of those changes can be seen, first, in Huang’s kitchen, with its playful bouquet of colorful spatulas, which has served as the increasingly familiar backdrop for announcements throughout the COVID-19 pandemic.
“NVIDIA is a full stack computing company – we love working on extremely hard computing problems that have great impact on the world – this is right in our wheelhouse,” Huang said. “We are all-in, to advance and democratize this new form of computing – for the age of AI.”
This GTC is one of the biggest yet. It features more than 1,000 sessions—400 more than the last GTC—in 40 topic areas. And it’s the first to run across the world’s time zones, with sessions in English, Chinese, Korean, Japanese, and Hebrew.
AMD has been away from the HPC space for a while, but now the company is coming back in a big way with an open software approach to GPU computing. The Radeon Open Compute Platform (ROCm) was born from the Boltzmann Initiative announced last year at SC15. Now available on GitHub, the ROCm platform brings a rich foundation to advanced computing by better integrating the CPU and GPU to solve real-world problems.
"We are excited to present ROCm, the first open-source HPC/ultrascale-class platform for GPU computing that’s also programming-language independent. We are bringing the UNIX philosophy of choice, minimalism and modular software development to GPU computing. The new ROCm foundation lets you choose or even develop tools and a language run time for your application."
Watch the video presentation: http://wp.me/p3RLHQ-fJT
Learn more: https://radeonopencompute.github.io/
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
NVIDIA GPUs Power HPC & AI Workloads in Cloud with Univa - inside-BigData.com
In this deck from the Univa Breakfast Briefing at ISC 2018, Duncan Poole from NVIDIA describes how the company is accelerating HPC in the Cloud.
Learn more: https://www.nvidia.com/en-us/data-center/dgx-systems/
and
http://univa.com
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
Today’s groundbreaking scientific discoveries are taking place in HPC data centers. Using containers, researchers and scientists gain the flexibility to run HPC application containers on NVIDIA Volta-powered systems including Quadro-powered workstations, NVIDIA DGX Systems, and HPC clusters.
Best practices for optimizing Red Hat platforms for large scale datacenter de... - Jeremy Eder
This presentation is from NVIDIA GTC DC on Oct 23, 2018:
https://youtu.be/z5gEUL6dJRI
Corresponding Press Release: https://www.redhat.com/en/about/press-releases/red-hat-nvidia-align-open-source-solutions-fuel-emerging-workloads
Blog: https://www.redhat.com/en/blog/red-hat-and-nvidia-positioning-red-hat-enterprise-linux-and-openshift-primary-platforms-artificial-intelligence-and-other-gpu-accelerated-workloads
Demo Video:
https://www.youtube.com/watch?v=9iVYjA_WJgU
GPU Accelerated Virtual Desktop Infrastructure (VDI) on OpenStack - Brian Schott
This is a talk presented at OpenStack DC Meetup #9 on GPU pass-through of an NVIDIA GRID K2 card with XenServer, Microsoft Hyper-V, and open-source Xen hypervisors.
Benefits of Multi-rail Cluster Architectures for GPU-based Nodes - inside-BigData.com
Craig Tierney from NVIDIA presented this deck at the MVAPICH User Group meeting.
"As high performance computing moves toward GPU-accelerated architectures, single node application performance can be between 3x and 75x faster than the CPUs alone. Performance increases of this size will require increases in network bandwidth and message rate to prevent the network from becoming the bottleneck in scalability. In this talk, we will present results from NVLink enabled systems connected via quad-rail EDR Infiniband."
Watch the video: https://wp.me/p3RLHQ-hkr
Sign up for our insideHPC Newsletter: http://insidehpc.com/newsletter
1) NVIDIA-Iguazio Accelerated Solutions for Deep Learning and Machine Learning (30 mins):
About the speaker:
Dr. Gabriel Noaje, Senior Solutions Architect, NVIDIA
http://bit.ly/GabrielNoaje
2) GPUs in Data Science Pipelines (30 mins)
- GPU as a Service for enterprise AI
- A short demo on the usage of GPUs for model training and model inferencing within a data science workflow
About the speaker:
Anant Gandhi, Solutions Engineer, Iguazio Singapore. https://www.linkedin.com/in/anant-gandhi-b5447614/
Webinar: NVIDIA JETSON – Artificial Intelligence in the Palm of Your Hand - Embarcados
Webinar objective: Learn how the NVIDIA Jetson platform and its tools enable you to develop and deploy robots, drones, IVA applications, and other AI-powered autonomous machines that think for themselves.
Supported by: Arrow and NVIDIA.
Guest: Marcel Saraiva
Enterprise Account Manager at NVIDIA, an executive with 20 years of experience in the IT market whose career includes stints at SGI (Silicon Graphics), Intel, and ScanSource. An electrical engineer trained at FEI, with a postgraduate degree in Marketing from FAAP and an MBA in Business Management from FGV.
Webinar link: https://www.embarcados.com.br/webinars/nvidia-jetson-a-inteligencia-artificial-na-palma-de-sua-mao/
Nvidia Deep Learning Solutions - Alex Sabatier - Sri Ambati
Alex Sabatier from Nvidia talks about the future of deep learning from a chipmaker's perspective.
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
Harnessing the virtual realm for successful real world artificial intelligence - Alison B. Lowndes
Artificial intelligence is impacting all areas of society, from healthcare and transportation to smart cities and energy. This talk covers how NVIDIA invests in both internal pure research and accelerated computation to enable its diverse customer base across gaming & extended reality, graphics, AI, robotics, simulation, high-performance scientific computing, healthcare, and more. You will be introduced to the GPU computing platform and shown successfully deployed real-world applications, as well as a glimpse of the current state of the art across academia, enterprise, and startups.
Axel Koehler from Nvidia presented this deck at the 2016 HPC Advisory Council Switzerland Conference.
“Accelerated computing is transforming the data center that delivers unprecedented throughput, enabling new discoveries and services for end users. This talk will give an overview about the NVIDIA Tesla accelerated computing platform including the latest developments in hardware and software. In addition it will be shown how deep learning on GPUs is changing how we use computers to understand data.”
In related news, the GPU Technology Conference takes place April 4-7 in Silicon Valley.
Watch the video presentation: http://insidehpc.com/2016/03/tesla-accelerated-computing/
See more talks in the Swiss Conference Video Gallery:
http://insidehpc.com/2016-swiss-hpc-conference/
Sign up for our insideHPC Newsletter:
http://insidehpc.com/newsletter
E-Commerce Brasil Forum | NVIDIA technologies applied to e-commerce. Far beyond... - E-Commerce Brasil
NVIDIA technologies applied to e-commerce. Far beyond the hardware.
Jomar Silva
Developer Relations Manager for Latin America - NVIDIA
https://eventos.ecommercebrasil.com.br/forum/
Medical imaging refers to several different technologies used to view the human body in order to diagnose, monitor, or treat medical conditions. Today, GPUs are found in almost all imaging modalities, including CT, MRI, X-ray, and ultrasound, bringing compute capabilities to edge devices. With the boom of deep learning research in medical imaging, more efficient and improved approaches are being developed to enable AI-assisted workflows.
Women L.E.A.D. Toastmasters Appreciation Event - Renee Yao
This slideshare is used to facilitate the Women L.E.A.D. toastmasters public speaking appreciation event: https://womenleadtm.com/meetings/happy-hour-in-person-optional/
This slide deck is put together to support Women L.E.A.D. Toastmasters workshop, How to be An Effective Mentor. YouTube: https://www.youtube.com/watch?v=RHH6-cE2zKM. Meeting: https://womenleadtm.com/meetings/workshop-how-to-be-an-effective-mentor/
Why Toastmasters and How it Helps Your Daily Job - Renee Yao
This slide deck is created for Women L.E.A.D. Toastmasters workshop on May 7th 2021. Recording: https://www.youtube.com/watch?v=3vZqVKWmrCw
Meeting Notes:
https://womenleadtm.com/meetings/workshop-why-toastmasters/
AI in Healthcare | Future of Smart Hospitals - Renee Yao
In this talk, I specifically talk about how NVIDIA healthcare AI software and hardware were used to support healthcare AI startups' innovation. Three startups featured: Caption Health, Artisight, and Hyperfine. Audience: healthcare systems CXOs.
This deck helps public speakers give good and effective evaluations to others, provides a step-by-step guide on how to win an evaluation contest in a Toastmasters competition, and explains why evaluation matters in our daily lives.
Startups Step Up - how healthcare ai startups are taking action during covid-... - Renee Yao
All around the world, people are facing unprecedented challenges and uncertainties as a result of COVID-19. At NVIDIA Inception program, a virtual incubation startup program, which hosts 5000+ AI startups, we see an army of healthcare AI startups that have mobilized to address this global health crisis. This webinar will share real world examples on how each offering plays a critical role during this pandemic.
Live event: https://www.meetup.com/Women-in-Big-Data-Meetup/events/270191555/?action=rsvp&response=3.
YouTube Link: https://www.youtube.com/watch?v=QWkKINi8u4o&feature=youtu.be
Simplifying AI Infrastructure: Lessons in Scaling on DGX Systems - Renee Yao
Simplifying AI Infrastructure: Lessons in Scaling on DGX Systems, the world's most powerful AI systems. This is a presentation I gave at GTC Israel in 2018.
This deck summarizes NetApp Insights 2018 joint ONTAP AI activities with NVIDIA and NetApp. List of activities includes Women In Tech Panel, Fireside chat, Spotlight sessions, the Cube live interview, and Partner Success video.
Accelerate AI w/ Synthetic Data using GANs - Renee Yao
Strata Data Conference in Sep 2018 Presentation
Description:
Synthetic data will drive the next wave of deployment and application of deep learning in the real world across a variety of problems involving speech recognition, image classification, object recognition, and language. All industries and companies will benefit: synthetic data can create conditions through simulation instead of authentic situations (virtual worlds let you avoid the cost of damage, spare human injury, and sidestep other real-world risks) and offers an unparalleled ability to test products, and interactions with them, in any environment.
Join us for this introductory session to learn more about how Generative Adversarial Networks (GAN) are successfully used to improve data generation. We will cover specific real-world examples where customers have deployed GAN to solve challenges in healthcare, space, transportation, and retail industries.
Renee Yao explains how generative adversarial networks (GAN) are successfully used to improve data generation and explores specific real-world examples where customers have deployed GANs to solve challenges in healthcare, space, transportation, and retail industries.
HPE and NVIDIA are delivering a leading portfolio of optimized AI solutions that transform business and industry, enabling deeper insights and helping solve the world’s greatest challenges. Join this session to learn how the NVIDIA V100, the world’s most powerful GPU, powers HPE 6500 Systems, HPE's AI systems, to provide new business insights and outcomes.
Dell and NVIDIA for Your AI workloads in the Data Center - Renee Yao
Join us and learn more about how Dell PowerEdge C4140 Rack Server, powered by four of NVIDIA V100s, the world’s most powerful GPU, address training and inference for the most demanding HPC, data visualization and AI workloads. This enables organizations to take advantage of the convergence of HPC and data analytics and realize advancements in areas including fraud detection, image processing, financial investment analysis and personalized medicine.
Orchestrate Your AI Workload with Cisco HyperFlex, Powered by NVIDIA GPUs - Renee Yao
Deep learning, a collection of statistical machine learning techniques, is transforming every digital business. As data grows, businesses need new ways to capitalize on the volume of information to drive their competitive advantage. GPUs are becoming mainstream in the data center for accelerating containerized AI workloads, and Kubernetes is a popular framework for orchestrating containers at scale. However, managing GPUs in Kubernetes is still nascent, and setting up a Kubernetes cluster with GPUs can be challenging for customers. Join this session to learn how to use Kubernetes to orchestrate your AI workloads on Cisco HyperFlex, powered by the NVIDIA V100, the world’s most powerful GPU.
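As a rough sketch of what GPU scheduling in Kubernetes looks like in practice, a pod can request GPUs through the `nvidia.com/gpu` resource exposed by the NVIDIA device plugin; the pod name and container image below are placeholders, not artifacts from this session:

```yaml
# Minimal pod spec requesting one GPU via the NVIDIA device plugin.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-training-job            # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: trainer
    image: nvcr.io/nvidia/tensorflow:24.01-tf2-py3   # example NGC image
    resources:
      limits:
        nvidia.com/gpu: 1           # schedule onto a node with a free GPU
```

The scheduler only places the pod on a node where the device plugin has advertised a free GPU, which is the core of the orchestration story above.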
This is a supporting deck for my personal blog, "A Toast to My Public Speaking Journey". Link can be found here: https://wordpress.com/post/reneeyao.wordpress.com/27
GraphRAG is All You Need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™UiPathCommunity
In questo evento online gratuito, organizzato dalla Community Italiana di UiPath, potrai esplorare le nuove funzionalità di Autopilot, il tool che integra l'Intelligenza Artificiale nei processi di sviluppo e utilizzo delle Automazioni.
📕 Vedremo insieme alcuni esempi dell'utilizzo di Autopilot in diversi tool della Suite UiPath:
Autopilot per Studio Web
Autopilot per Studio
Autopilot per Apps
Clipboard AI
GenAI applicata alla Document Understanding
👨🏫👨💻 Speakers:
Stefano Negro, UiPath MVPx3, RPA Tech Lead @ BSP Consultant
Flavio Martinelli, UiPath MVP 2023, Technical Account Manager @UiPath
Andrei Tasca, RPA Solutions Team Lead @NTT Data
The Metaverse and AI: how can decision-makers harness the Metaverse for their...Jen Stirrup
The Metaverse is popularized in science fiction, and now it is becoming closer to being a part of our daily lives through the use of social media and shopping companies. How can businesses survive in a world where Artificial Intelligence is becoming the present as well as the future of technology, and how does the Metaverse fit into business strategy when futurist ideas are developing into reality at accelerated rates? How do we do this when our data isn't up to scratch? How can we move towards success with our data so we are set up for the Metaverse when it arrives?
How can you help your company evolve, adapt, and succeed using Artificial Intelligence and the Metaverse to stay ahead of the competition? What are the potential issues, complications, and benefits that these technologies could bring to us and our organizations? In this session, Jen Stirrup will explain how to start thinking about these technologies as an organisation.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
PHP Frameworks: I want to break free (IPC Berlin 2024)
Building the World's Largest GPU
1. BUILDING THE WORLD'S LARGEST GPU
Renee Yao, NVIDIA Senior Product Marketing Manager, AI Systems
Twitter: @ReneeYao1
2. THE DGX FAMILY OF AI SUPERCOMPUTERS
Cloud-Scale AI: NVIDIA GPU Cloud, the cloud platform with the highest deep learning efficiency
AI Workstation: DGX Station with Tesla V100 32GB, the personal AI supercomputer
AI Data Center: DGX-1 with Tesla V100 32GB, the essential instrument for AI research; and DGX-2 with Tesla V100 32GB, the world's most powerful AI system for the most complex AI challenges
3. 10X PERFORMANCE GAIN IN LESS THAN A YEAR
Time to train (days), FairSeq at 55 epochs to solution, PyTorch training performance:
• DGX-1 with V100 (Sep '17): 15 days
• DGX-2 (Q3 '18): 1.5 days, 10 times faster
Gains include software improvements across the stack, including NCCL, cuDNN, etc.
4. DGX-2 NOW SHIPPING
1. NVIDIA Tesla V100 32GB
2. Two GPU boards: 8 V100 32GB GPUs per board, 6 NVSwitches per board, 512 GB total HBM2 memory, interconnected by plane card
3. Twelve NVSwitches, 2.4 TB/sec bisection bandwidth
4. Eight EDR InfiniBand/100 GigE, 1600 Gb/sec total bi-directional bandwidth
5. PCIe switch complex
6. Two Intel Xeon Platinum CPUs
7. 1.5 TB system memory
8. 30 TB NVMe SSD internal storage
9. Dual 10/25 Gb/sec Ethernet
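The headline numbers on this slide are internally consistent and can be checked with simple arithmetic. The per-link NVLink rate used below (~50 GB/s bidirectional for NVLink 2.0 on V100) is an assumption drawn from published V100 specs, not something stated on the slide:

```python
# Back-of-the-envelope check of the DGX-2 headline figures.
GPUS = 16
HBM2_PER_GPU_GB = 32
NVLINKS_PER_GPU = 6
GB_S_PER_LINK = 50  # bidirectional, assumed V100 NVLink 2.0 rate

total_hbm2 = GPUS * HBM2_PER_GPU_GB             # 512 GB total HBM2
per_gpu_bw = NVLINKS_PER_GPU * GB_S_PER_LINK    # 300 GB/s per GPU
bisection = (GPUS // 2) * per_gpu_bw            # 8 GPUs cross the plane card

print(total_hbm2, per_gpu_bw, bisection)  # 512 300 2400
```

The 8 GPUs on one board each reaching across to the other board at full NVLink rate is exactly what produces the quoted 2.4 TB/sec bisection bandwidth.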
5. MULTI-CORE AND CUDA WITH ONE GPU
[Block diagram: the CPU sends work (data and CUDA kernels) over PCIe I/O and receives results (data); inside the GPU, GPCs, copy engines, NVLinks, and a high-speed hub connect through the XBAR to the HBM2 memory controllers.]
• Users explicitly express parallel work in CUDA
• The GPU driver distributes work to available GPC/SM cores
• GPC/SM cores use shared HBM2 to exchange data
6. TWO GPUS WITH PCIE
[Block diagram: two GPUs (GPU0, GPU1), each with GPCs, XBAR, HBM2 with memory controllers, high-speed hub, NVLinks, and copy engines, attached to the CPU only through PCIe I/O; the CPU sends work (data and CUDA kernels) and receives results (data).]
• Access to the HBM2 of the other GPU runs at PCIe bandwidth (16 GB/s)
• PCIe is the "Wild West" (lots of performance bandits)
• Interactions with the CPU compete with GPU-to-GPU traffic
7. TWO GPUS WITH NVLINK
[Block diagram: the same two-GPU layout as the PCIe case, but with the GPUs' NVLinks connected directly to each other.]
• Access to the HBM2 of the other GPU runs at multi-NVLink bandwidth (150 GB/s in V100 GPUs)
• All GPCs can access all HBM2 memories
• NVLinks are effectively a "bridge" between XBARs
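To make the gap between the two interconnects concrete, here is a back-of-the-envelope comparison of the time to move data between two GPUs at the bandwidths quoted on these slides. This is an idealized model that ignores latency and protocol overhead:

```python
def transfer_seconds(size_gb, bw_gbps):
    """Idealized transfer time: size / bandwidth, no latency or overhead."""
    return size_gb / bw_gbps

SIZE_GB = 16     # e.g. half of one V100's 32 GB HBM2
PCIE_BW = 16     # GB/s, from the PCIe slide
NVLINK_BW = 150  # GB/s, multi-NVLink between two V100s

print(transfer_seconds(SIZE_GB, PCIE_BW))    # 1.0 s over PCIe
print(transfer_seconds(SIZE_GB, NVLINK_BW))  # ~0.11 s over NVLink
```

Roughly an order of magnitude, matching the bandwidth ratio, and on PCIe that time is an optimistic floor because GPU-to-GPU traffic also competes with the CPU.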
8. THE "ONE GIGANTIC GPU" IDEAL
• Number of GPUs is as high as possible
• A single GPU driver process controls all work across all GPUs
• From the perspective of the GPCs, all HBM2s can be accessed without intervention by other processes (LD/ST instructions, copy engine RDMA, everything "just works")
• Access to all HBM2s is independent of PCIe
• Bandwidth across bridged XBARs is as high as possible (some NUMA is unavoidable)
[Diagram: four groups of four GPUs joined by a hypothetical NVLink XBAR, with two CPUs attached; how to build that XBAR is the open question.]
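The ideal amounts to a topology in which every GPU can reach every other GPU's HBM2 over NVLink alone, never touching PCIe. A toy graph model can check that property; the wiring below mirrors the DGX-2-style configuration (each GPU linked to six switch planes on its board, planes trunked between boards) and is purely illustrative, not a wiring diagram:

```python
from itertools import product

# Toy model of the "one gigantic GPU" goal with 16 GPUs on 2 boards.
gpus = [f"gpu{i}" for i in range(16)]
links = set()
for i, g in enumerate(gpus):
    board = i // 8
    for s in range(6):  # one NVLink from each GPU to each switch plane
        links.add(frozenset((g, f"sw{board}_{s}")))
for s in range(6):      # trunk links between the two boards' planes
    links.add(frozenset((f"sw0_{s}", f"sw1_{s}")))

def reachable(a, b):
    """Simple graph search over the undirected NVLink topology."""
    seen, frontier = {a}, [a]
    while frontier:
        n = frontier.pop()
        for link in links:
            if n in link:
                (other,) = link - {n}
                if other not in seen:
                    seen.add(other)
                    frontier.append(other)
    return b in seen

# Every GPU can reach every other GPU's memory over NVLink alone.
print(all(reachable(a, b) for a, b in product(gpus, gpus)))  # True
```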
10. DGX-2 NOW SHIPPING
11. EXPANDABLE SYSTEM
• Taking this to the limit: connect one NVLink from each GPU to each of 6 switches
• No routing between different switch planes required
• 8 of the 18 NVLinks available per switch are used to connect to GPUs
• 10 NVLinks available per switch for communication outside the local group (only 8 are required to support full bandwidth)
• This is the GPU baseboard configuration for DGX-2
[Diagram: eight V100 GPUs fanned into the NVSwitch plane.]
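The port arithmetic on this slide can be checked directly; 18 NVLink ports per NVSwitch is the stated capability, and everything else follows:

```python
# Port budget per NVSwitch in the DGX-2 baseboard.
PORTS_PER_SWITCH = 18
GPUS_PER_BOARD = 8

gpu_facing = GPUS_PER_BOARD            # one link from each GPU on the board
spare = PORTS_PER_SWITCH - gpu_facing  # ports left for off-board trunking
needed_full_bw = GPUS_PER_BOARD        # 8 trunk links carry full GPU bandwidth

print(gpu_facing, spare, spare >= needed_full_bw)  # 8 10 True
```

The two spare ports beyond the eight needed for full bandwidth are what makes the building block "expandable".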
12. DGX-2 NVLINK FABRIC
[Diagram: two eight-GPU baseboards, each with its NVSwitch plane, trunked together.]
• Two of these building blocks together form a fully connected 16-GPU cluster
• Non-blocking, non-interfering (unless the same destination is involved)
• Regular loads, stores, and atomics just work
• Presenter's note: the astute among you will note that there is a redundant level of switches here, but this configuration simplifies system-level design and manufacturing
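Cross-board capacity follows from the per-switch port budget: 8 trunk links on each of the 6 switch planes. The per-link rate below (~50 GB/s bidirectional, the usual NVLink 2.0 figure for V100) is an assumption, not stated on the slide, but it reproduces the bisection bandwidth quoted on the spec slide:

```python
# Cross-board (bisection) capacity of the DGX-2 NVLink fabric.
SWITCH_PLANES = 6
TRUNK_LINKS_PER_SWITCH = 8
GB_S_PER_LINK = 50  # bidirectional, assumed V100 NVLink 2.0 rate

cross_board = SWITCH_PLANES * TRUNK_LINKS_PER_SWITCH * GB_S_PER_LINK
print(cross_board)  # 2400 GB/s, i.e. the 2.4 TB/s bisection bandwidth
```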
13. Data Science HW Architecture
• Single CPU node (128 GB/s, 20 cores, 512 GB): typically very slow with 20 GB+ datasets
• CPU cluster: handles larger datasets but stays slow, limited by CPU/memory bandwidth, number of processing cores, and network I/O
• DGX-2 versus a CPU node: 128x memory I/O, 300x core-to-core I/O, 100x processing cores
14. DGX-2 PCIE NETWORK
[Diagram: two QPI-connected x86 Xeon sockets feed a PCIe switch tree; at the leaves, each 200G NIC shares a PCIe switch with a pair of V100 GPUs, and the sixteen GPUs connect through the two NVSwitch planes.]
• Xeon sockets are QPI-connected, but affinity binding keeps GPU-related traffic off QPI
• The PCIe tree has NICs connected to pairs of GPUs to facilitate GPUDirect RDMA over the IB network
• Configuration and control of the NVSwitches is via a driver process running on the CPUs
16. NVIDIA DGX-2: SYSTEM COOLING
• Forced-air cooling of baseboards, I/O expander, and CPUs provided by ten 92 mm fans
• Four supplemental 60 mm internal fans cool the NVMe drives and PSUs
• Air reaching the NVSwitches is pre-heated by the GPUs, so they use "full height" heatsinks
17. DGX-2: cuFFT
[Chart: cuFFT performance, ½ DGX-1V versus DGX-2.]
• Results are "iso-problem instance" (more GFLOPS means shorter running time)
• As the problem is split over more GPUs, it takes longer to transfer data than to calculate locally
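The scaling behaviour described here is a classic compute/communication trade-off. A rough model shows why per-GPU efficiency drops as a fixed-size FFT is split further; the constants below are made up for illustration, not measured:

```python
def step_time(n_gpus, compute_total=8.0, comm_total=4.0, bw=1.0):
    """Toy iso-problem-size model: local compute divides evenly across
    GPUs, but the all-to-all exchange volume per step approaches a fixed
    total as the slice each GPU keeps locally shrinks."""
    compute = compute_total / n_gpus
    comm = comm_total * (n_gpus - 1) / (n_gpus * bw)
    return compute + comm

for n in (1, 2, 4, 8, 16):
    print(n, round(step_time(n), 3))
```

Total time still falls with more GPUs, but the gains flatten as the transfer term dominates the local calculation, which is exactly the curve the chart shows.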
19. DGX-2: UP TO 2.7X ON TARGET APPS
2x DGX-1 (Volta) versus DGX-2 with NVSwitch:
• Physics (MILC benchmark, 4D grid): 13K vs. 26K GFLOPS, 2x faster
• Weather (IFS benchmark, FFT and all-to-all): 11 vs. 26 steps/sec, 2.4x faster
• Recommender (sparse embedding, reduce & broadcast): 11B vs. 22B lookups/sec, 2x faster
• Language model (Transformer with MoE, all-to-all): 9.3 hr vs. 3.4 hr, 2.7x faster
2 DGX-1V servers have dual-socket Xeon E5 2698v4 processors and 8 x V100 32GB GPUs each, connected via 4 EDR IB ports. The DGX-2 server has dual-socket Xeon Platinum 8168 processors and 16 V100 32GB GPUs.
20. FLEXIBILITY WITH VIRTUALIZATION
Enable your own private DL training cloud for your enterprise:
• KVM hypervisor for Ubuntu Linux
• Enables teams of developers to simultaneously access DGX-2
• Flexibly allocate GPU resources to each user and their experiments
• Full GPU and NVSwitch access within VMs, either all GPUs or as few as 1
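As a sketch of the "flexibly allocate GPU resources" idea, partitioning the 16 GPUs among concurrent users might look like the helper below. This is a hypothetical illustration, not part of any NVIDIA tooling:

```python
def allocate(requests, total_gpus=16):
    """Hypothetical helper: greedily assign contiguous GPU index ranges.
    requests: dict of user -> number of GPUs wanted (1 up to all 16)."""
    assignments, next_free = {}, 0
    for user, count in requests.items():
        if next_free + count > total_gpus:
            raise ValueError(f"not enough GPUs for {user}")
        assignments[user] = list(range(next_free, next_free + count))
        next_free += count
    return assignments

print(allocate({"alice": 8, "bob": 4, "carol": 1}))
# alice gets GPUs 0-7, bob 8-11, carol GPU 12
```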
21. CRISIS MANAGEMENT SOLUTION
Natural disasters are increasingly causing major destruction to life, property, and economies. DFKI is using the NVIDIA DGX-2 to evolve DeepEye, which uses satellite images enriched with social media content to identify natural disasters, into a crisis management solution. With the increased GPU memory and fully connected GPUs based on the NVSwitch architecture, DFKI can build bigger models and process more data to aid rescuers in their decision-making for faster, more efficient dispatching of resources.
22. "Fujifilm applies AI in a wide range of fields. In healthcare, multiple NVIDIA GPUs will deliver high-speed computation to develop AI supporting image diagnostics. The introduction of this supercomputer will massively increase our processing power. We expect that AI learning that once took days to complete can now be completed within hours."
Akira Yoda, Chief Digital Officer of FUJIFILM Corporation
Fields: pharmaceuticals; Bio CDMO; regenerative medicine; analyzing and recognizing medical images; simulations of display materials and fine chemicals
23. AI ADOPTERS IMPEDED BY INFRASTRUCTURE
AI boosts profit margins by up to 15%, yet 40% of adopters see infrastructure as impeding AI.
Source: 2018 CTA Market Research
24. THE CHALLENGE OF AI INFRASTRUCTURE
Short-term thinking leads to longer-term problems:
• Design guesswork: ensuring the architecture delivers predictable performance that scales
• Deployment complexity: procuring, installing, and troubleshooting compute, storage, networking, and software
• Multiple points of support: contending with multiple vendors across multiple layers in the stack
25. DESIGNING INFRASTRUCTURE THAT SCALES
Insights gained from deep learning data centers:
• Rack design: DL drives racks close to operational limits; similarities to HPC best practices
• Networking: IB- or Ethernet-based fabric; 100 Gbps interconnect; high bandwidth, ultra-low latency
• Storage: datasets range from 10k's to millions of objects; terabyte levels of storage and up; high IOPS, low latency
• Facilities: assume higher watts per rack; higher FLOPS/watt means less data center floorspace required
• Software: scale requires "cluster-aware" software
Example (autonomous vehicles):
• Autonomous vehicle = 1 TB/hr of data
• Training sets up to 500 PB
• ResNet-50: 113 days to train; objective: 7 days
• 6 simultaneous developers = 97-node cluster
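The cluster size in the example follows from simple throughput scaling, assuming near-linear speedup from one node, which the slide's example implicitly does:

```python
import math

single_node_days = 113   # ResNet-50 time to train on one node
target_days = 7          # objective
developers = 6           # simultaneous users

# Each developer needs ~113/7 ≈ 16.1 nodes to hit the target.
cluster_nodes = math.ceil(developers * single_node_days / target_days)
print(cluster_nodes)  # 97
```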
26. NVIDIA DGX POD™
A Reference Architecture for GPU Data Centers
• Initial reference architecture based on the NVIDIA® DGX-1™ server
• Designed for the deep learning training workflow
• Baseline for other reference architectures:
  • Easily upgraded to NVIDIA DGX-2™ and NVIDIA HGX-2™ servers
  • Industry-specific PODs
  • Storage and network partners
  • Server OEM solutions
27. DGX DATA CENTER REFERENCE DESIGN
Easy deployment of DGX servers for deep learning. Contents:
• AI workflow and sizing
• NVIDIA AI software
• DGX POD design
• DGX POD installation and management
28. NVIDIA AUTOMOTIVE WORKFLOW ON SATURNV
Research workflow:
Training
• Many-node: user submits 1 job with many single-node training sessions (hyperparameter sweep)
• Multi-node: user submits 1 job with a single multi-node training session
Inference
• Many-GPU: user submits many jobs, each with single-GPU inference
[Chart: storage performance versus interconnect performance for many-node training, multi-node training, and inference.]
30. NVIDIA DGX POD — DGX-1
Reference architecture in a single 35 kW high-density rack.
In real-life DL application development, one to two DGX-1 servers per developer are often required. One DGX POD supports five developers (AV workload), each working on two experiments per day: one DGX-1/developer/experiment/day*.
Fits within a standard-height 42 RU data center rack:
• Nine DGX-1 servers (9 x 3 RU = 27 RU)
• Twelve storage servers (12 x 1 RU = 12 RU)
• 10 GbE (min) storage and management switch (1 RU)
• Mellanox 100 Gbps intra-rack high-speed network switches (1 or 2 RU)
*0.5M images x 120 epochs @ 480 images/sec, ResNet-18 backbone detection network per experiment
31. NVIDIA DGX POD — DGX-2
Reference architecture in a single 35 kW high-density rack.
In real-life DL application development, one DGX-2 per developer minimizes model training time. One DGX POD supports at least three developers (AV workload), each working on two experiments per day: one DGX-2/developer/2 experiments/day*.
Fits within a standard-height 48 RU data center rack:
• Three DGX-2 servers (3 x 10 RU = 30 RU)
• Twelve storage servers (12 x 1 RU = 12 RU)
• 10 GbE (min) storage and management switch (1 RU)
• Mellanox 100 Gbps intra-rack high-speed network switches (1 or 2 RU)
*0.5M images x 120 epochs @ 480 images/sec, ResNet-18 backbone detection network per experiment
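The rack-unit budgets for both POD variants can be totalled from the line items above; the sketch assumes the 2 RU worst case for the Mellanox switches and 1 RU per storage server, as the slides state:

```python
def pod_rack_units(servers, ru_per_server,
                   storage_servers=12, mgmt_switch_ru=1, network_switch_ru=2):
    """Total RU for a DGX POD rack (worst-case 2 RU network switches)."""
    return (servers * ru_per_server + storage_servers
            + mgmt_switch_ru + network_switch_ru)

print(pod_rack_units(9, 3))   # DGX-1 POD: 42 RU, exactly fills a 42 RU rack
print(pod_rack_units(3, 10))  # DGX-2 POD: 45 RU, fits a 48 RU rack
```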
32. NEW DGX PODS
DELIVERY, DEPLOYMENT, DEEP LEARNING IN A DAY
• 95% reduction in deployment time
• 5x increase in data scientist productivity
• $0 integration cost
• Adopted by leading auto, healthcare & telco companies
33. NVIDIA DGX SYSTEMS
Faster AI Innovation and Insight
The world's first portfolio of purpose-built AI supercomputers:
• Powered by NVIDIA GPU Cloud
• Get started in AI, faster
• Effortless productivity
• Performance without compromise
For more information:
DGX Systems: nvidia.com/dgx
DGX POD: https://www.nvidia.com/en-us/data-center/resources/nvidia-dgx-pod-reference-architecture/
DGX Reference Architecture: https://www.nvidia.com/en-us/data-center/dgx-reference-architecture/