Accelerating High Performance Applications
Strategic Focus on Applications

 Senior-level relationship and market managers

 Dedicated technical resources

 More than 150 people devoted to libraries, tools, application porting and market development

 Worldwide focus
Reaching a Broad Range of Markets




  Scientific computing   Creative pro   Education / research
Strategic Partners
Market                                  Partners
CAD/CAM/CAID                            Autodesk; Dassault Systemes (CATIA, Solidworks); PTC; Siemens
CAE/EDA                                 Ansys; Dassault Systemes (Simulia); Nastran; LSTC; Synopsys
Computational chemistry                 Amber; NAMD; Gromacs; Lammps; GAMESS
Computational Finance                   MATLAB; Mathematica; NAG; Murex
Defence & Intelligence                  Ikena; Intergraph; ESRI; Manifold
Digital Content creation                Adobe; Autodesk M&E; Avid; MainConcept; Sony
Physical Sciences                       Quda (L-QCD); WRF; ACUSA; HOMME; HYCOM
Seismic processing and visualization    Schlumberger; Landmark; Paradigm
Leading MD Applications


Application   Features Supported                    GPU Perf   Release Status          Notes
AMBER         PMEMD: Explicit & Implicit Solvent    8X         V11 Released            Single and multi-GPUs. Expect 2x more performance in
                                                                                       V11 patch release (shortly)
GROMACS       Implicit (5x), Explicit (2x) Solvent  2x-5x      Single GPU released,    Next release: 2H2011
                                                               Version 4.5.4           Better Explicit, MPI
LAMMPS        Lennard-Jones, Gay-Berne              6x         Released                Single and multi-GPU.
NAMD          Non-bond force calculation            2x-7x      Released, v2.8          Single and multi-GPU.

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Additional MD/MM Applications Ramping

Application  Features Supported              GPU Perf                    Release Status           Notes
Abalone      TBD, “Simulations”              4-29X (on 1060 GPU)         Released                 Single GPU. Agile Molecule, Inc.
ACEMD        Written for use on GPUs         “µ-sec long trajectories    Released                 Production bio-molecular dynamics (MD) software
                                             on workstation”                                      specially optimized to run on single and multi-GPUs
DL_POLY      Two-body Forces, Link-cell      4x                          V 4.0 Source only,       Next release: 2H2011
             Pairs, Ewald SPME forces,                                   Results Published        Multi-GPU, multi-node supported
             Shake VV
HOOMD-Blue   Written for use on GPUs         2X (32 CPU cores vs.        Released, Version 0.9.2  Single and multi-GPU.
                                             2 10XX GPUs)

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Viz and “Docking” Applications

Related Applications  Features Supported                      GPU Perf     Release Status           Notes
Amira 5®              3D visualization of volumetric data     N/A          Released, Version 5.3.3  Visualization from Visage Imaging. Next release, 5.4, will
                      and surfaces                                                                  use GPU for general purpose processing in some functions
Core Hopping          GPU accelerated application             Up to 5000X  Released, Suite 2011     Single and multi-GPUs. Schrodinger, Inc.
FastROCS              Real-time shape similarity              800-3000X    Released                 Single and multi-GPUs. OpenEye Scientific Software
                      searching/comparison
VMD                   High quality rendering, large           100-125X or  Released, Version 1.9    Visualization from University of Illinois at
                      structures (100 million atoms), GPU     greater                               Urbana-Champaign
                      acceleration for computationally
                      demanding analysis and visualization
                      tasks, multiple GPU support for very
                      fast display of molecular orbitals
                      arising in quantum chemistry
                      calculations

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Quantum Chemistry
Application   Features Supported                      GPU Perf        Release Status            Notes
GAMESS-US     Libqc with Rys Quadrature Algorithm,    2.5X            Released                  Single GPU supported in 10/1/10 release.
              integral evaluation, closed shell Fock                                            Multi-GPU supported in July 2011 release.
              matrix construction
NWChem        Triples part of Reg-CCSD(T), CCSD &     3-8X projected  Date TBA, in development  Development GPGPU benchmarks: www.nwchem-sw.org
              EOMCCSD task schedulers
Q-CHEM        Various features including RI-MP2       8-14x projected Date TBA, in development  Significant porting already
TeraChem      “Full GPU-based solution”               44-650X vs.     Version 1.45 released     Single and Multi-GPU. Completely redesigned to
                                                      GAMESS CPU ver.                           exploit massive GPU parallelism

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Material Science



Application             Features Supported                     GPU Perf  Release Status        Notes
Abinit                  BigDFT - 50% of the program (short     6-30X     Released June 2009    http://inac.cea.fr/L_Sim/BigDFT/news.html
                        convolutions)
Quantum-Espresso/PWscf  PWscf package: linear algebra (matrix  TBD       Released May 5, 2011  Created by Irish Centre for High-End Computing
                        multiply), explicit computational
                        kernels, 3D FFTs

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Bioinformatics


CUDA-BLASTP                 HEX Protein Docking
CUDA-EC                     Jacket (MATLAB Plugin)
CUDA-MEME                   MUMmerGPU
CUDASW++ (Smith-Waterman)   MUMmerGPU++
DNADist                     SARUMAN
GPU Blast                   SeqNFind
GPU-HMMER                   UGENE


                            Additional details can be found at Tesla Bio Workbench:
                            http://www.nvidia.com/object/tesla_bio_workbench.html
Structural Mechanics
    Application      GPU Features               GPU Perf               Release Status                        Notes
ANSYS Mechanical     Linear eqn solvers           2x Total            Today, release 13 SP2          FE implicit, single-GPU

 Abaqus/Standard     Linear eqn solver            2x Total             Today, release 6.11           FE implicit, single-GPU

  IMPETUS Afea       Explicit solver, SPH   10x SPH, 2x Total           Today, release 1.0           FE explicit, multi-GPU


 LS-DYNA implicit    Linear eqn solver            3x Total              Planned for 2011             FE implicit, multi-GPU


   MD Nastran        Linear eqn solvers          2x Solver              Planned for 2011             FE implicit, multi-GPU


       Marc          Linear eqn solver           1.5x Total             Planned for 2011             FE implicit, single-GPU

 RADIOSS Implicit    Linear eqn solver           1.5x Total               Demonstration              FE implicit, single-GPU

PAM-CRASH implicit   Linear eqn solver           1.5x Total               Demonstration              FE implicit, single-GPU

   NX Nastran        Linear eqn solver           1.4x Total               Demonstration              FE implicit, single-GPU
                                   GPU Perf compared against Multi-core x86 CPU socket.
                                   GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
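
The gap between the "Solver" and "Total" figures in the table above, and the footnote's caution about kernel-to-kernel comparisons, can be illustrated with Amdahl's law. The following is a minimal sketch, not taken from the slides: the function name `total_speedup` and the solver-share fractions are hypothetical, chosen only to show how a fixed solver speedup translates into a smaller whole-application speedup.

```python
def total_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: overall speedup when only part of the runtime is accelerated.

    accelerated_fraction: share of the original (CPU) runtime spent in the
        GPU-accelerated part, e.g. the linear equation solver. Hypothetical input.
    kernel_speedup: speedup of that part alone (a "Solver" or kernel figure).
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be in [0, 1]")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction  # runtime share left on the CPU
    return 1.0 / (remaining + accelerated_fraction / kernel_speedup)


if __name__ == "__main__":
    # Illustrative solver shares only; real shares depend on the model and code.
    for fraction in (0.5, 0.8, 0.95):
        speedup = total_speedup(fraction, 3.0)
        print(f"solver share {fraction:.0%}: 3x solver gives {speedup:.2f}x total")
```

Under these assumptions, a 3x solver speedup yields only 1.5x overall when the solver is half of the runtime, which is why a kernel-level figure should not be read as a whole-application figure.
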
Fluid Dynamics
Application                       GPU Features        GPU Perf             Release Status      Notes
Altair AcuSolve                   Linear eqn solver   2x Total             Today, release 1.8  FE unstructured NS, multi-GPU
Autodesk Moldflow                 Linear eqn solver   2x Total             Today, release 2011 FE unstructured NS, single-GPU
FluiDyna LBultra                  LBM, particle CFD   20x Total            Today, release 1.0  Structured LBM, multi-GPU
FluiDyna Culises-OpenFOAM Solver  Linear eqn solvers  3x Solver            Today, release 1.0  Unstructured NS, single-GPU
Vratis SpeedIT-OpenFOAM Solver    Linear eqn solvers  3x Solver            Today, release 1.2  Unstructured NS, multi-GPU
Prometech Particleworks           MPS, particle CFD   4x-9x Total          Q3CY11 release 2.5  Particle based, multi-GPU
Sandia NL S3D                     Chemistry kernel    8x SP, 5x DP kernel  Demonstration       Structured grid DNS, multi-GPU
Turbostream                       Explicit solver     19x Total            Today, release 2.0  Structured grid NS, multi-GPU
SD++ (Jameson)                    Explicit solver     16x Total            Planned for 2011    FE unstructured NS, multi-GPU
FEFLO (Lohner)                    Explicit solver     2x Total             Planned for 2011    FE unstructured NS, multi-GPU

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
Electromagnetics

Application           Features Supported            GPU Perf        Release Status    Notes
Agilent EMPro         FDTD                          6X              2011.07 Released  Single & multi-GPU; EMPro 2011 PR
CST Microwave Studio  Transient (FIT) solver;       9X on 1 GPU to  2011 Released     Single & multi-GPU; www.cst.com/perf
                      Combined MPI & GPU computing  20X+ on 4 GPUs
Remcom XFdtd          FDTD                          30-300X         XF7 Released      Single and multi-GPU; XStream GPU acceleration
SPEAG SEMCAD X        FDTD; Acceleware              100X            14.4.3 Released   Single and multi-GPU; www.speag.com/perf

GPU Performance compared against quad-core x86 CPU socket;
Remcom XFdtd GPU performance compared against single core CPU
Climate/ Weather/ Ocean
Application   GPU Features                         GPU Perf                  Production Status      Notes
WRF           WSM5, WSM3, Ice Microphysics models  4x-6x Models              Today, release 3.2     single-GPU
ASUCA         Most routines                        12x Total                 In production at JMA   multi-GPU
NIM           Most routines                        7x Dynamics               Limited production     multi-GPU
HIRLAM        Dynamical core                       3x Solver                 Planned for 2011       multi-GPU
HOMME         Models                               3x Models                 Planned for 2011       single-GPU
CAM           Linear eqn solver                    2x Solver                 Planned for 2011       single-GPU
GEOS-5        Most routines                        10x Models, 3x Dynamics   Demonstration          multi-GPU
MITgcm        Linear eqn solver                    3x solver                 Demonstration          single-GPU
HYCOM         Linear eqn solver                    2x solver                 Demonstration          single-GPU

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison

More Related Content

What's hot

AMD Opteron 6200 and 4200 Series Presentation
AMD Opteron 6200 and 4200 Series PresentationAMD Opteron 6200 and 4200 Series Presentation
AMD Opteron 6200 and 4200 Series PresentationAMD
 
Poser pro reference manual
Poser pro reference manualPoser pro reference manual
Poser pro reference manualSykrayo
 
AMD Analyst Day 2009: Rick Bergman
AMD Analyst Day 2009: Rick BergmanAMD Analyst Day 2009: Rick Bergman
AMD Analyst Day 2009: Rick BergmanAMD
 
AMD Unified Video Decoder
AMD Unified Video DecoderAMD Unified Video Decoder
AMD Unified Video DecoderAMD
 
Hardware assisted Virtualization in Embedded
Hardware assisted Virtualization in EmbeddedHardware assisted Virtualization in Embedded
Hardware assisted Virtualization in EmbeddedThe Linux Foundation
 
Congatec_Global Vendor for Innovative Embedded Solutions_Ankara
Congatec_Global Vendor for Innovative Embedded Solutions_AnkaraCongatec_Global Vendor for Innovative Embedded Solutions_Ankara
Congatec_Global Vendor for Innovative Embedded Solutions_AnkaraIşınsu Akçetin
 
Congatec_Global Vendor for Innovative Embedded Solutions_Istanbul
Congatec_Global Vendor for Innovative Embedded Solutions_IstanbulCongatec_Global Vendor for Innovative Embedded Solutions_Istanbul
Congatec_Global Vendor for Innovative Embedded Solutions_IstanbulIşınsu Akçetin
 
Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC cluster
Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC clusterToward a practical “HPC Cloud”: Performance tuning of a virtualized HPC cluster
Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC clusterRyousei Takano
 
AMD Opteron 6000 Series Platform Press Presentation
AMD Opteron 6000 Series Platform Press PresentationAMD Opteron 6000 Series Platform Press Presentation
AMD Opteron 6000 Series Platform Press PresentationAMD
 
Simulation Directed Co-Design from Smartphones to Supercomputers
Simulation Directed Co-Design from Smartphones to SupercomputersSimulation Directed Co-Design from Smartphones to Supercomputers
Simulation Directed Co-Design from Smartphones to SupercomputersEric Van Hensbergen
 
Case Study: Porting Qt for Embedded Linux on Embedded Processors
Case Study: Porting Qt for Embedded Linux on Embedded ProcessorsCase Study: Porting Qt for Embedded Linux on Embedded Processors
Case Study: Porting Qt for Embedded Linux on Embedded Processorsaccount inactive
 
Hp All In 1
Hp All In 1Hp All In 1
Hp All In 1RBratton
 
AMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop ProductsAMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop ProductsAMD
 
HPCMPUG2011 cray tutorial
HPCMPUG2011 cray tutorialHPCMPUG2011 cray tutorial
HPCMPUG2011 cray tutorialJeff Larkin
 
An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011
An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011
An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011Shinya Takamaeda-Y
 
Gentek Introduce(en)
Gentek Introduce(en)Gentek Introduce(en)
Gentek Introduce(en)cloudmmog
 
Dme presentation-feb2013v2-1
Dme presentation-feb2013v2-1Dme presentation-feb2013v2-1
Dme presentation-feb2013v2-1Bengt Edlund
 

What's hot (20)

AMD Opteron 6200 and 4200 Series Presentation
AMD Opteron 6200 and 4200 Series PresentationAMD Opteron 6200 and 4200 Series Presentation
AMD Opteron 6200 and 4200 Series Presentation
 
Poser pro reference manual
Poser pro reference manualPoser pro reference manual
Poser pro reference manual
 
AMD Analyst Day 2009: Rick Bergman
AMD Analyst Day 2009: Rick BergmanAMD Analyst Day 2009: Rick Bergman
AMD Analyst Day 2009: Rick Bergman
 
AMD Unified Video Decoder
AMD Unified Video DecoderAMD Unified Video Decoder
AMD Unified Video Decoder
 
Hardware assisted Virtualization in Embedded
Hardware assisted Virtualization in EmbeddedHardware assisted Virtualization in Embedded
Hardware assisted Virtualization in Embedded
 
Congatec_Global Vendor for Innovative Embedded Solutions_Ankara
Congatec_Global Vendor for Innovative Embedded Solutions_AnkaraCongatec_Global Vendor for Innovative Embedded Solutions_Ankara
Congatec_Global Vendor for Innovative Embedded Solutions_Ankara
 
Congatec_Global Vendor for Innovative Embedded Solutions_Istanbul
Congatec_Global Vendor for Innovative Embedded Solutions_IstanbulCongatec_Global Vendor for Innovative Embedded Solutions_Istanbul
Congatec_Global Vendor for Innovative Embedded Solutions_Istanbul
 
Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC cluster
Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC clusterToward a practical “HPC Cloud”: Performance tuning of a virtualized HPC cluster
Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC cluster
 
Implement Checkpointing for Android
Implement Checkpointing for AndroidImplement Checkpointing for Android
Implement Checkpointing for Android
 
AMD Opteron 6000 Series Platform Press Presentation
AMD Opteron 6000 Series Platform Press PresentationAMD Opteron 6000 Series Platform Press Presentation
AMD Opteron 6000 Series Platform Press Presentation
 
Simulation Directed Co-Design from Smartphones to Supercomputers
Simulation Directed Co-Design from Smartphones to SupercomputersSimulation Directed Co-Design from Smartphones to Supercomputers
Simulation Directed Co-Design from Smartphones to Supercomputers
 
Case Study: Porting Qt for Embedded Linux on Embedded Processors
Case Study: Porting Qt for Embedded Linux on Embedded ProcessorsCase Study: Porting Qt for Embedded Linux on Embedded Processors
Case Study: Porting Qt for Embedded Linux on Embedded Processors
 
Implement Checkpointing for Android (ELCE2012)
Implement Checkpointing for Android (ELCE2012)Implement Checkpointing for Android (ELCE2012)
Implement Checkpointing for Android (ELCE2012)
 
Hp All In 1
Hp All In 1Hp All In 1
Hp All In 1
 
Power7 facts and features 17 aug
Power7 facts and features 17 augPower7 facts and features 17 aug
Power7 facts and features 17 aug
 
AMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop ProductsAMD Chiplet Architecture for High-Performance Server and Desktop Products
AMD Chiplet Architecture for High-Performance Server and Desktop Products
 
HPCMPUG2011 cray tutorial
HPCMPUG2011 cray tutorialHPCMPUG2011 cray tutorial
HPCMPUG2011 cray tutorial
 
An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011
An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011
An FPGA-based Scalable Simulation Accelerator for Tile Architectures @HEART2011
 
Gentek Introduce(en)
Gentek Introduce(en)Gentek Introduce(en)
Gentek Introduce(en)
 
Dme presentation-feb2013v2-1
Dme presentation-feb2013v2-1Dme presentation-feb2013v2-1
Dme presentation-feb2013v2-1
 

Similar to Nvidia Cuda Apps Jun27 11

PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrKohei KaiGai
 
N A G P A R I S280101
N A G P A R I S280101N A G P A R I S280101
N A G P A R I S280101John Holden
 
2D Games to HPC
2D Games to HPC2D Games to HPC
2D Games to HPCDVClub
 
GPU Virtualization on VMware's Hosted I/O Architecture
GPU Virtualization on VMware's Hosted I/O ArchitectureGPU Virtualization on VMware's Hosted I/O Architecture
GPU Virtualization on VMware's Hosted I/O Architectureguestb3fc97
 
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
 AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.” AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”HSA Foundation
 
Compute API –Past & Future
Compute API –Past & FutureCompute API –Past & Future
Compute API –Past & FutureOfer Rosenberg
 
[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsight
[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsight[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsight
[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsightlaparuma
 
Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation AMD
 
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationBladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationCliff Kinard
 
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driverKernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driverAnne Nicolas
 
iMinds The Conference: Jan Lemeire
iMinds The Conference: Jan LemeireiMinds The Conference: Jan Lemeire
iMinds The Conference: Jan Lemeireimec
 
Introduction to the Graphics Pipeline of the PS3
Introduction to the Graphics Pipeline of the PS3Introduction to the Graphics Pipeline of the PS3
Introduction to the Graphics Pipeline of the PS3Slide_N
 
Sears Point Racetrack
Sears Point RacetrackSears Point Racetrack
Sears Point RacetrackDino, llc
 

Similar to Nvidia Cuda Apps Jun27 11 (20)

PG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated AsyncrPG-Strom - GPU Accelerated Asyncr
PG-Strom - GPU Accelerated Asyncr
 
N A G P A R I S280101
N A G P A R I S280101N A G P A R I S280101
N A G P A R I S280101
 
2D Games to HPC
2D Games to HPC2D Games to HPC
2D Games to HPC
 
3 d to_hpc
3 d to_hpc3 d to_hpc
3 d to_hpc
 
GPU Virtualization on VMware's Hosted I/O Architecture
GPU Virtualization on VMware's Hosted I/O ArchitectureGPU Virtualization on VMware's Hosted I/O Architecture
GPU Virtualization on VMware's Hosted I/O Architecture
 
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
 AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.” AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
 
PG-Strom
PG-StromPG-Strom
PG-Strom
 
Compute API –Past & Future
Compute API –Past & FutureCompute API –Past & Future
Compute API –Past & Future
 
[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsight
[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsight[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsight
[03 2][gpu용 개발자 도구 - parallel nsight 및 axe] gateau parallel-nsight
 
Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation Heterogeneous Systems Architecture: The Next Area of Computing Innovation
Heterogeneous Systems Architecture: The Next Area of Computing Innovation
 
GPU Programming with Java
GPU Programming with JavaGPU Programming with Java
GPU Programming with Java
 
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client PresentationBladeCenter GPU Expansion Blade (BGE) - Client Presentation
BladeCenter GPU Expansion Blade (BGE) - Client Presentation
 
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driverKernel Recipes 2014 - The Linux graphics stack and Nouveau driver
Kernel Recipes 2014 - The Linux graphics stack and Nouveau driver
 
iMinds The Conference: Jan Lemeire
iMinds The Conference: Jan LemeireiMinds The Conference: Jan Lemeire
iMinds The Conference: Jan Lemeire
 
PostgreSQL with OpenCL
PostgreSQL with OpenCLPostgreSQL with OpenCL
PostgreSQL with OpenCL
 
Pgopencl
PgopenclPgopencl
Pgopencl
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
Example Application of GPU
Example Application of GPUExample Application of GPU
Example Application of GPU
 
Introduction to the Graphics Pipeline of the PS3
Introduction to the Graphics Pipeline of the PS3Introduction to the Graphics Pipeline of the PS3
Introduction to the Graphics Pipeline of the PS3
 
Sears Point Racetrack
Sears Point RacetrackSears Point Racetrack
Sears Point Racetrack
 

Nvidia Cuda Apps Jun27 11

  • 2. Strategic Focus on Applications
      Senior-level relationship and market managers
      Dedicated technical resources
      More than 150 people devoted to libraries, tools, application porting and market development
      Worldwide focus
  • 3. Reaching a Broad Range of Markets
      Scientific computing
      Creative pro
      Education / research
  • 4. Strategic Partners
      CAD/CAM/CAID: Autodesk; Dassault Systemes (CATIA, Solidworks); PTC; Siemens
      CAE/EDA: Ansys; Dassault Systemes (Simulia); Nastran; LSTC; Synopsys
      Computational chemistry: Amber; NAMD; Gromacs; Lammps; GAMESS
      Computational finance: MATLAB; Mathematica; NAG; Murex
      Defence & intelligence: Ikena; Intergraph; ESRI; Manifold
      Digital content creation: Adobe; Autodesk M&E; Avid; MainConcept; Sony
      Physical sciences: Quda (L-QCD); WRF; ACUSA; HOMME; HYCOM
      Seismic processing and visualization: Schlumberger; Landmark; Paradigm
  • 5. Leading MD Applications
      Application | Features Supported | GPU Perf | Release Status | Notes
      AMBER | PMEMD: Explicit & Implicit Solvent | 8X | V11 Released | Single and multi-GPUs. Expect 2x more performance in V11 patch release (shortly)
      GROMACS | Implicit (5x), Explicit (2x) Solvent | 2x-5x | Single GPU released, Version 4.5.4 | Next release: 2H2011. Better Explicit, MPI
      LAMMPS | Lennard-Jones, Gay-Berne | 6x | Released | Single and multi-GPU.
      NAMD | Non-bond force calculation | 2x-7x | Released, v2.8 | Single and multi-GPU.
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.
  • 6. Additional MD/MM Applications Ramping
      Application | Features Supported | GPU Perf | Release Status | Notes
      Abalone | TBD, “Simulations” | 4-29X (on 1060 GPU) | Released | Single GPU. Agile Molecule, Inc.
      ACEMD | Written for use on GPUs | “µ-sec long trajectories on workstation” | Released | Production bio-molecular dynamics (MD) software specially optimized to run on single and multi-GPUs
      DL_POLY | Two-body Forces, Link-cell Pairs, Ewald SPME forces, Shake VV | 4x | V 4.0 Source only, Results Published | Next release: 2H2011. Multi-GPU, multi-node supported
      HOOMD-Blue | Written for use on GPUs | 2X (32 CPU cores vs. 2 10XX GPUs) | Released, Version 0.9.2 | Single and multi-GPU.
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.
  • 7. Viz and “Docking” Applications
      Related Applications | Features Supported | GPU Perf | Release Status | Notes
      Amira 5® | 3D visualization of volumetric data and surfaces | N/A | Released, Version 5.3.3 | Visualization from Visage Imaging. Next release, 5.4, will use GPU for general purpose processing in some functions
      Core Hopping | GPU accelerated application | Up to 5000X | Released, Suite 2011 | Single and multi-GPUs. Schrodinger, Inc.
      FastROCS | Real-time shape similarity searching/comparison | 800-3000X | Released | Single and multi-GPUs. OpenEye Scientific Software
      VMD | High quality rendering, large structures (100 million atoms), GPU acceleration for computationally demanding analysis and visualization tasks, multiple GPU support for very fast display of molecular orbitals arising in quantum chemistry calculations | 100-125X or greater | Released, Version 1.9 | Visualization from University of Illinois at Urbana-Champaign
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.
  • 8. Quantum Chemistry
      Application | Features Supported | GPU Perf | Release Status | Notes
      GAMESS-US | Libqc with Rys Quadrature Algorithm, integral evaluation, closed shell Fock matrix construction | 2.5X | Released | Single GPU supported in 10/1/10 release. Multi-GPU supported in July 2011 release.
      NWChem | Triples part of Reg-CCSD(T), CCSD & EOMCCSD task schedulers | 3-8X projected | Date TBA, in development | Development GPGPU benchmarks: www.nwchem-sw.org
      Q-CHEM | Various features including RI-MP2 | 8-14x projected | Date TBA, in development | Significant porting already
      TeraChem | “Full GPU-based solution” | 44-650X vs. GAMESS CPU ver. | Version 1.45 released | Single and Multi-GPU. Completely redesigned to exploit massive GPU parallelism
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.
  • 9. Material Science
      Application | Features Supported | GPU Perf | Release Status | Notes
      Abinit | BigDFT - 50% of the program (short convolutions) | 6-30X | Released June 2009 | http://inac.cea.fr/L_Sim/BigDFT/news.html
      Quantum-Espresso/PWscf | PWscf package: linear algebra (matrix multiply), explicit computational kernels, 3D FFTs | TBD | Released May 5, 2011 | Created by Irish Centre for High-End Computing
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.
  • 10. Bioinformatics
      CUDA-BLASTP
      CUDA-EC
      CUDA-MEME
      CUDASW++ (Smith-Waterman)
      DNADist
      GPU Blast
      GPU-HMMER
      HEX Protein Docking
      Jacket (MATLAB Plugin)
      MUMmerGPU
      MUMmerGPU++
      SARUMAN
      SeqNFind
      UGENE
      Additional details can be found at Tesla Bio Workbench: http://www.nvidia.com/object/tesla_bio_workbench.html
  • 11. Structural Mechanics
      Application | GPU Features | GPU Perf | Release Status | Notes
      ANSYS Mechanical | Linear eqn solvers | 2x Total | Today, release 13 SP2 | FE implicit, single-GPU
      Abaqus/Standard | Linear eqn solver | 2x Total | Today, release 6.11 | FE implicit, single-GPU
      IMPETUS Afea | Explicit solver, SPH | 10x SPH, 2x Total | Today, release 1.0 | FE explicit, multi-GPU
      LS-DYNA implicit | Linear eqn solver | 3x Total | Planned for 2011 | FE implicit, multi-GPU
      MD Nastran | Linear eqn solvers | 2x Solver | Planned for 2011 | FE implicit, multi-GPU
      Marc | Linear eqn solver | 1.5x Total | Planned for 2011 | FE implicit, single-GPU
      RADIOSS Implicit | Linear eqn solver | 1.5x Total | Demonstration | FE implicit, single-GPU
      PAM-CRASH implicit | Linear eqn solver | 1.5x Total | Demonstration | FE implicit, single-GPU
      NX Nastran | Linear eqn solver | 1.4x Total | Demonstration | FE implicit, single-GPU
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.
  • 12. Fluid Dynamics
      Application | GPU Features | GPU Perf | Release Status | Notes
      Altair AcuSolve | Linear eqn solver | 2x Total | Today, release 1.8 | FE unstructured NS, multi-GPU
      Autodesk Moldflow | Linear eqn solver | 2x Total | Today, release 2011 | FE unstructured NS, single-GPU
      FluiDyna LBultra | LBM, particle CFD | 20x Total | Today, release 1.0 | Structured LBM, multi-GPU
      FluiDyna Culises-OpenFOAM Solver | Linear eqn solvers | 3x Solver | Today, release 1.0 | Unstructured NS, single-GPU
      Vratis SpeedIT-OpenFOAM Solver | Linear eqn solvers | 3x Solver | Today, release 1.2 | Unstructured NS, multi-GPU
      Prometech Particleworks | MPS, particle CFD | 4x-9x Total | Q3CY11 release 2.5 | Particle based, multi-GPU
      Sandia NL S3D | Chemistry kernel | 8x SP, 5x DP kernel | Demonstration | Structured grid DNS, multi-GPU
      Turbostream | Explicit solver | 19x Total | Today, release 2.0 | Structured grid NS, multi-GPU
      SD++ (Jameson) | Explicit solver | 16x Total | Planned for 2011 | FE unstructured NS, multi-GPU
      FEFLO (Lohner) | Explicit solver | 2x Total | Planned for 2011 | FE unstructured NS, multi-GPU
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.
  • 13. Electromagnetics
      Application | Features Supported | GPU Perf | Release Status | Notes
      Agilent EMPro | FDTD | 6X | 2011.07 Released | Single & multi-GPU; EMPro 2011 PR
      CST Microwave Studio | Transient (FIT) solver; Combined MPI & GPU computing | 9X on 1 GPU to 20X+ on 4 GPUs | 2011 Released | Single & multi-GPU; www.cst.com/perf
      Remcom XFdtd | FDTD | 30-300X | XF7 Released | Single and multi-GPU; XStream GPU acceleration
      SPEAG SEMCAD X | FDTD; Acceleware | 100X | 14.4.3 Released | Single and multi-GPU; www.speag.com/perf
      GPU performance compared against quad-core x86 CPU socket; Remcom XFdtd GPU performance compared against single-core CPU.
  • 14. Climate/Weather/Ocean
      Application | GPU Features | GPU Perf | Production Status | Notes
      WRF | WSM5, WSM3, Ice Microphysics models | 4x-6x Models | Today, release 3.2 | single-GPU
      ASUCA | Most routines | 12x Total | In production at JMA | multi-GPU
      NIM | Most routines | 7x Dynamics | Limited production | multi-GPU
      HIRLAM | Dynamical core | 3x Solver | Planned for 2011 | multi-GPU
      HOMME | Models | 3x Models | Planned for 2011 | single-GPU
      CAM | Linear eqn solver | 2x Solver | Planned for 2011 | single-GPU
      GEOS-5 | Most routines | 10x Models, 3x Dynamics | Demonstration | multi-GPU
      MITgcm | Linear eqn solver | 3x solver | Demonstration | single-GPU
      HYCOM | Linear eqn solver | 2x solver | Demonstration | single-GPU
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU-supported features and may be a kernel-to-kernel perf comparison.