Introduction Stable Fluids NVIDIA Compute Unified Device Architecture (CUDA) Results Conclusions
CUDA-based Linear Solvers for Stable Fluids
G. Amador and A. Gomes
Departamento de Informática
Universidade da Beira Interior
Covilhã, Portugal
m1420@ubi.pt, agomes@di.ubi.pt
April, 2010
1 Introduction
2 Stable Fluids
The Eulerian approach
Physics Model
3 NVIDIA Compute Unified Device Architecture (CUDA)
Workflow
Iterative solvers
Jacobi
Gauss-Seidel red-black
Conjugate gradient
4 Results
Jacobi performance
Gauss-Seidel performance
Conjugate gradient performance
5 Conclusions
Conclusions
Future Work
Overview
The study of fluid simulation (e.g., water) is important for two industries:
(real-time ≥ 30 fps) (off-line ≤ 30 fps)
Problems:
How to implement (specifically for 3D stable fluids) the CUDA-based versions of the Jacobi, Gauss-Seidel, and conjugate gradient iterative solvers?
What are the real-time performance limitations of these solver implementations?
The Eulerian approach
Space partitioning:
Variations of velocity and density are observed at the center of each cell.
Velocities and densities are updated through an implicit method (Stam's stable fluids, 1999), which is unconditionally stable for any time step.
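A minimal sketch of this space partitioning, assuming a cubic grid of n interior cells per axis, one extra layer of boundary cells on each side, and a flat array per field. The names `idx` and `make_field` are hypothetical and do not come from the slides.

```python
def idx(i, j, k, n):
    """Flat index of cell (i, j, k) in a grid of (n + 2)^3 cells.

    Assumption: n interior cells per axis plus one boundary layer per side.
    """
    stride = n + 2
    return i + stride * (j + stride * k)


def make_field(n, value=0.0):
    """Allocate one scalar field (e.g., density) sampled at cell centres."""
    return [value] * ((n + 2) ** 3)


n = 4
density = make_field(n)
# Deposit some density at the centre of interior cell (2, 2, 2).
density[idx(2, 2, 2, n)] = 1.0
```

A flat layout like this is one common choice because neighbouring cells along the first axis sit next to each other in memory; other layouts are possible.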
Physics Model
Navier-Stokes equations for incompressible fluids
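Mass conservation: $\nabla \cdot \vec{u} = 0$
Velocity evolution: $\frac{\partial \vec{u}}{\partial t} = -(\vec{u} \cdot \nabla)\vec{u} + \nu \nabla^2 \vec{u} + \vec{f}$
Density evolution: $\frac{\partial \rho}{\partial t} = -(\vec{u} \cdot \nabla)\rho + k \nabla^2 \rho + S$
$\vec{u}$: velocity field.
$\nu$: fluid viscosity.
$\rho$: density of the field.
$k$: density diffusion rate.
$\vec{f}$: external forces added to the velocity field.
$S$: external sources added to the density field.
$\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right)$: gradient.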
Navier-Stokes equations implementation
Update velocity:
Add external forces ($\vec{f}$).
Velocity diffusion ($\nu \nabla^2 \vec{u}$).
Move ($-(\vec{u} \cdot \nabla)\vec{u}$ and $\nabla \cdot \vec{u} = 0$).
Update density:
Add external sources ($S$).
Density advection ($-(\vec{u} \cdot \nabla)\rho$).
Density diffusion ($k \nabla^2 \rho$).
Diffusion
Exchanges of density or velocity between neighbours (2D).
Solve a sparse linear system (Ax = b), using an iterative method (e.g., Jacobi, Gauss-Seidel, conjugate gradient, etc.).
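As an illustration of how diffusion becomes a sparse linear system, here is a minimal CPU sketch of the Jacobi method in 2D. It assumes the implicit diffusion system takes the form (1 + 4a) x[i][j] - a * (sum of the 4 neighbours) = x0[i][j], where `a` is assumed to bundle the time step, the diffusion rate and the grid spacing, and where the outer ring of cells is a fixed boundary. The function names are hypothetical, and this is not the authors' CUDA kernel.

```python
def jacobi_diffuse(x0, a, iterations):
    """Approximately solve (1 + 4a) x[i][j] - a * (4 neighbours) = x0[i][j]."""
    rows, cols = len(x0), len(x0[0])
    x = [row[:] for row in x0]
    for _ in range(iterations):
        new = [row[:] for row in x]  # Jacobi reads only the previous iterate
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
                new[i][j] = (x0[i][j] + a * neighbours) / (1.0 + 4.0 * a)
        x = new
    return x


def max_residual(x, x0, a):
    """Largest absolute residual of the linear system over interior cells."""
    worst = 0.0
    for i in range(1, len(x) - 1):
        for j in range(1, len(x[0]) - 1):
            neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
            r = x0[i][j] - ((1.0 + 4.0 * a) * x[i][j] - a * neighbours)
            worst = max(worst, abs(r))
    return worst


grid = [[0.0] * 6 for _ in range(6)]
grid[2][2] = 1.0  # a single blob of density
diffused = jacobi_diffuse(grid, a=0.5, iterations=60)
```

Because every cell update reads only the previous iterate, all cells can be updated independently, which is what makes Jacobi a natural fit for one-thread-per-cell execution.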
Move
Ensure mass conservation and the fluid's incompressibility.
Hodge decomposition:
Mass-conserving (divergence-free) field = our field - gradient field
Determine the gradient with the same iterative method used for diffusion (e.g., Jacobi, Gauss-Seidel, conjugate gradient, etc.).
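The decomposition above can be written out as a short derivation in the notation of the Physics Model slide. The symbols $\vec{w}$ (the velocity field before the Move step) and $q$ (the scalar field whose gradient is removed) are names assumed here, not taken from the slides.

```latex
\begin{align*}
\vec{w} &= \vec{u} + \nabla q, \qquad \nabla \cdot \vec{u} = 0
  && \text{(Hodge decomposition)} \\
\nabla \cdot \vec{w} &= \nabla^2 q
  && \text{(take the divergence of both sides)} \\
\vec{u} &= \vec{w} - \nabla q
  && \text{(subtract the gradient)}
\end{align*}
```

The middle line is a Poisson equation for $q$. Discretised on the grid, it becomes a sparse linear system, which is why the same iterative solvers apply.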
Workflow
Iterative solvers
Jacobi
Gauss-Seidel red-black
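The slide's figure is not available in this transcript, so the following is only a minimal CPU sketch of red-black Gauss-Seidel, under the same assumed 2D diffusion system (1 + 4a) x[i][j] - a * (sum of the 4 neighbours) = x0[i][j] with a fixed boundary ring. The function names are hypothetical and this is not the authors' CUDA kernel.

```python
def red_black_gauss_seidel(x0, a, sweeps):
    """In-place relaxation that updates "red" cells first, then "black" cells."""
    rows, cols = len(x0), len(x0[0])
    x = [row[:] for row in x0]
    for _ in range(sweeps):
        for colour in (0, 1):  # 0 = red cells (i + j even), 1 = black cells
            for i in range(1, rows - 1):
                for j in range(1, cols - 1):
                    if (i + j) % 2 != colour:
                        continue
                    # All four neighbours have the opposite colour.
                    neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
                    x[i][j] = (x0[i][j] + a * neighbours) / (1.0 + 4.0 * a)
    return x


def rb_max_residual(x, x0, a):
    """Largest absolute residual of the linear system over interior cells."""
    worst = 0.0
    for i in range(1, len(x) - 1):
        for j in range(1, len(x[0]) - 1):
            neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
            r = x0[i][j] - ((1.0 + 4.0 * a) * x[i][j] - a * neighbours)
            worst = max(worst, abs(r))
    return worst


rb_grid = [[0.0] * 6 for _ in range(6)]
rb_grid[2][2] = 1.0
rb_result = red_black_gauss_seidel(rb_grid, a=0.5, sweeps=60)
```

Cells of one colour depend only on cells of the other colour, so each half-sweep can run in parallel without the read-after-write hazards of a plain Gauss-Seidel ordering.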
Conjugate gradient
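The slide's figure is not available in this transcript, so the following is only a minimal CPU sketch of the textbook conjugate gradient method, applied matrix-free to the same assumed 2D diffusion system on an n x n interior grid with zero boundary values. The names `apply_A`, `dot` and `conjugate_gradient` are hypothetical and this is not the authors' CUDA implementation.

```python
def apply_A(x, n, a):
    """y = A x, where A is (1 + 4a) on the diagonal and -a for each neighbour."""
    y = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            c = i * n + j
            s = 0.0
            if i > 0:
                s += x[c - n]
            if i < n - 1:
                s += x[c + n]
            if j > 0:
                s += x[c - 1]
            if j < n - 1:
                s += x[c + 1]
            y[c] = (1.0 + 4.0 * a) * x[c] - a * s
    return y


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def conjugate_gradient(b, n, a, tol=1e-10, max_iter=200):
    """Solve A x = b for the symmetric positive-definite A defined by apply_A."""
    x = [0.0] * len(b)
    r = b[:]  # residual b - A x for the initial guess x = 0
    p = r[:]  # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break
        Ap = apply_A(p, n, a)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


b = [0.0] * 16
b[5] = 1.0
solution = conjugate_gradient(b, n=4, a=0.5)
check = apply_A(solution, 4, 0.5)  # should reproduce b
```

Unlike the relaxation methods, each iteration needs two dot products, which are reductions over the whole grid rather than purely local updates.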
Jacobi performance
Gauss-Seidel performance
Conjugate gradient performance
Conclusions
The CUDA-based implementation of the Gauss-Seidel solver allows more iterations than the CPU-based implementation; however, it converges two times slower.
The CUDA-based implementations of the Jacobi and Gauss-Seidel iterative solvers achieved better performance (i.e., shorter processing time) than the CPU-based implementations.
For grid sizes larger than $64^3$, the CUDA-based implementation of the conjugate gradient performs worse than the CPU-based version, due to global memory latency.
Future Work
Search for ways, implementable using CUDA, to reduce global memory accesses (e.g., data structures, dynamic memory, etc.).
Implement multi-core CPU-based versions of the solvers and compare their performance with the CUDA-based versions.
Search for new solvers implementable using CUDA, with a better convergence rate than relaxation techniques (Jacobi and Gauss-Seidel) and with no significant extra computational effort, such as the conjugate gradient.
Questions???

ICISA 2010 Conference Presentation
