Introduction Stable Fluids NVIDIA Compute Unified Device Architecture (CUDA) Results Conclusions
A CUDA-based Implementation of Stable Fluids in 3D with Internal and Moving Boundaries
G. Amador and A. Gomes
Departamento de Informática
Universidade da Beira Interior
Covilhã, Portugal
m1420@ubi.pt, agomes@di.ubi.pt
March, 2010
1 Introduction
2 Stable Fluids
The Eulerian approach
Physics Model
3 NVIDIA Compute Unified Device Architecture (CUDA)
Workflow
Grid partition
4 Results
Results - Processing time
5 Conclusions
Conclusions
Future Work
Demo
Questions
Overview
The study of natural phenomena simulation is important for two industries, one working in real time and one off-line.
Real-time simulations must be fast (≥ 30 frames per second) while appearing realistic.
Problem:
How to implement a CUDA-based version of the 3D stable fluids method?
The Eulerian approach
Space partitioning:
Variations of velocity and density are observed at the center of each cell.
Velocities and densities are updated through an implicit method (Stam's stable fluids, 1999), which is unconditionally stable for any time step.
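As a rough illustration of the space partitioning, here is a minimal Python sketch. It assumes a cubic grid of n x n x n interior cells of edge length h, stored in one flat array with a single layer of boundary cells on each side. The names `cell_index` and `cell_centre`, the grid size, and the memory layout are assumptions made for illustration; they are not taken from the slides.

```python
def cell_index(i, j, k, n):
    """Flat array index of cell (i, j, k); each axis holds n + 2 cells (boundary layer assumed)."""
    stride = n + 2
    return i + stride * (j + stride * k)


def cell_centre(i, j, k, h):
    """Position of the centre of interior cell (i, j, k); interior cells are numbered from 1."""
    return ((i - 0.5) * h, (j - 0.5) * h, (k - 0.5) * h)


n = 4                                  # interior cells per axis (hypothetical size)
velocity_x = [0.0] * (n + 2) ** 3      # one sample per cell, taken at the cell centre
velocity_x[cell_index(1, 2, 3, n)] = 1.5
```

A flat array is used here because it maps directly onto a single contiguous buffer, which is the usual shape of data handed to a GPU kernel.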
ubi-logo
Introduction Stable Fluids NVIDIA Compute Unified Device Architecture (CUDA) Results Conclusions
Physics Model
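Navier-Stokes equations for incompressible fluids
Mass conservation: $\nabla \cdot \vec{u} = 0$
Velocity evolution: $\frac{\partial \vec{u}}{\partial t} = -(\vec{u} \cdot \nabla)\vec{u} + v \nabla^2 \vec{u} + \vec{f}$
Density evolution: $\frac{\partial \rho}{\partial t} = -(\vec{u} \cdot \nabla)\rho + k \nabla^2 \rho + S$
$\vec{u}$: velocity field.
$v$: fluid's viscosity.
$\rho$: density of the field.
$k$: density diffusion rate.
$\vec{f}$: external forces added to the velocity field.
$S$: external sources added to the density field.
$\nabla = \left( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right)$: gradient.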
Navier-Stokes equations implementation
Update velocity:
Add external forces ($\vec{f}$).
Velocity diffusion ($v \nabla^2 \vec{u}$).
Move ($-(\vec{u} \cdot \nabla)\vec{u}$ and $\nabla \cdot \vec{u} = 0$).
Update density:
Add external sources ($S$).
Density advection ($-(\vec{u} \cdot \nabla)\rho$).
Density diffusion ($k \nabla^2 \rho$).
Add external forces
$\vec{u} = \vec{u}_0 + \Delta t \times \vec{f}$ (velocity).
$\rho = \rho_0 + \Delta t \times S$ (density).
$\vec{f}$: external forces added to the velocity field.
$S$: external density source (e.g., ink mixed with water).
$\vec{u}$: new velocity.
$\vec{u}_0$: old velocity.
$\rho$: new density.
$\rho_0$: old density.
$\Delta t$: simulation time step.
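The two update rules above amount to one multiply-add per cell. A minimal Python sketch, assuming each field is stored as a flat list of floats (the function name `add_source` and the sample values are hypothetical):

```python
def add_source(field, source, dt):
    """Return field + dt * source, cell by cell (both are flat lists of floats)."""
    return [x + dt * s for x, s in zip(field, source)]


rho0 = [0.0, 1.0, 2.0]      # old density (made-up values)
S = [10.0, 0.0, -4.0]       # external density source (made-up values)
rho = add_source(rho0, S, dt=0.1)
```

The same function would serve for each velocity component, with the corresponding component of the external force as the source. Because every cell is updated independently, this step is a natural fit for one GPU thread per cell.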
Advection or transport
Quantities (e.g., the fluid itself, densities, moving objects, etc.) are transported as the fluid flows.
Each particle's trajectory is traced back in time (semi-Lagrangian advection).
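Tracing a trajectory back in time can be sketched in one dimension as follows. This is a simplified illustration, not the presented 3D implementation: it assumes linear interpolation between the two nearest cell centres and clamps departure points to the grid, and the name `advect_1d` is hypothetical.

```python
import math


def advect_1d(q, u, dt, dx=1.0):
    """Semi-Lagrangian advection of quantity q by velocity u on a 1D grid of cell centres."""
    n = len(q)
    out = [0.0] * n
    for i in range(n):
        # Follow the velocity backwards from the cell centre for one time step.
        x = i - dt * u[i] / dx
        # Clamp the departure point to the grid (a simple boundary assumption).
        x = min(max(x, 0.0), n - 1.0)
        i0 = int(math.floor(x))
        i1 = min(i0 + 1, n - 1)
        t = x - i0
        # Linearly interpolate the old field at the departure point.
        out[i] = (1.0 - t) * q[i0] + t * q[i1]
    return out
```

Since each new value is a weighted average of old values, the result can never exceed the largest old value, whatever the time step. The averaging also smooths the field slightly at every step.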
Diffusion
Exchanges of density or velocity between neighbouring cells (2D).
Solve a sparse linear system (Ax = b) using an iterative method (e.g., red-black Gauss-Seidel).
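A minimal 2D sketch of such a solver is given below. It assumes the backward-Euler form of diffusion, in which each cell satisfies (1 + 4a) x[i][j] - a * (sum of its four neighbours) = x0[i][j] with a = k * dt / h^2, and it keeps the outermost cells fixed as a simple boundary condition. The function name `diffuse_red_black` and these choices are assumptions for illustration, not details from the slides.

```python
def diffuse_red_black(x0, k, dt, iterations=20, h=1.0):
    """Implicit diffusion of a 2D field, solved with red-black Gauss-Seidel sweeps."""
    n, m = len(x0), len(x0[0])
    a = dt * k / (h * h)
    x = [row[:] for row in x0]          # initial guess: the undiffused field
    for _ in range(iterations):
        # Colour 0 ("red") cells have i + j even, colour 1 ("black") cells have i + j odd.
        for colour in (0, 1):
            for i in range(1, n - 1):
                for j in range(1, m - 1):
                    if (i + j) % 2 == colour:
                        neighbours = x[i - 1][j] + x[i + 1][j] + x[i][j - 1] + x[i][j + 1]
                        x[i][j] = (x0[i][j] + a * neighbours) / (1.0 + 4.0 * a)
    return x
```

Every neighbour of a red cell is black and vice versa, so all cells of one colour can be updated independently of each other. That independence is what makes the red-black ordering suitable for parallel hardware, whereas a plain Gauss-Seidel sweep must visit cells one after another.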
Move
Ensure mass conservation and the fluid's incompressibility.
Hodge decomposition:
Mass-conserving (divergence-free) field = our field - gradient field
Determine the gradient field using the same iterative method as diffusion (i.e., red-black Gauss-Seidel).
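The decomposition can be sketched in 2D as below. The sketch assumes central differences on cell-centred values, a scalar field p whose gradient is subtracted, p held at zero on the outermost cells, and a fixed number of red-black Gauss-Seidel sweeps; the name `project` and these boundary choices are illustrative assumptions, so this is not the presented 3D code with internal and moving boundaries.

```python
def project(u, v, iterations=40, h=1.0):
    """Subtract the gradient of p from (u, v), where p solves laplacian(p) = div(u, v)."""
    n, m = len(u), len(u[0])
    div = [[0.0] * m for _ in range(n)]
    p = [[0.0] * m for _ in range(n)]
    # Scaled, negated central-difference divergence at interior cells.
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            div[i][j] = -0.5 * h * (u[i + 1][j] - u[i - 1][j] + v[i][j + 1] - v[i][j - 1])
    # Red-black Gauss-Seidel sweeps for p (p stays zero on the outermost cells).
    for _ in range(iterations):
        for colour in (0, 1):
            for i in range(1, n - 1):
                for j in range(1, m - 1):
                    if (i + j) % 2 == colour:
                        p[i][j] = (div[i][j] + p[i - 1][j] + p[i + 1][j]
                                   + p[i][j - 1] + p[i][j + 1]) / 4.0
    # Subtract the gradient of p from copies of the velocity components.
    u_new = [row[:] for row in u]
    v_new = [row[:] for row in v]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            u_new[i][j] -= 0.5 * (p[i + 1][j] - p[i - 1][j]) / h
            v_new[i][j] -= 0.5 * (p[i][j + 1] - p[i][j - 1]) / h
    return u_new, v_new
```

A velocity field that already has zero divergence, such as a uniform flow, passes through unchanged because p stays zero everywhere. For other fields the result is only approximately divergence-free, since the solve is stopped after a fixed number of sweeps.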
Gauss-Seidel Red-Black
Workflow
Grid partition
Results - Processing time
Conclusions
The CUDA-based implementation of 3D stable fluids achieved better performance (i.e., shorter processing time) than the CPU-based implementation.
The CUDA-based implementation of 3D stable fluids allows grid sizes up to $128^3$ on an NVIDIA GeForce 8800 GT card.
The CUDA-based implementation of 3D stable fluids allows real-time simulations for grid sizes up to $32^3$ on an NVIDIA GeForce 8800 GT card.
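To give a sense of scale for these grid sizes, the following Python sketch estimates the storage needed by the simulation fields. The figures are not from the slides: the sketch assumes single-precision values, one layer of boundary cells per side, and eight arrays in total (three velocity components and one density, each with a previous-step copy). The name `field_bytes` is hypothetical.

```python
def field_bytes(n, bytes_per_value=4, boundary_layers=1):
    """Bytes needed for one scalar field on an n^3 grid plus its boundary cells."""
    cells_per_axis = n + 2 * boundary_layers
    return cells_per_axis ** 3 * bytes_per_value


ARRAYS = 8  # assumption: u, v, w, density, plus a previous-step copy of each

for n in (32, 64, 128):
    total = ARRAYS * field_bytes(n)
    print(f"{n}^3 grid: {total / 2**20:.1f} MiB")
```

Doubling the grid resolution multiplies both the memory footprint and the number of cells to update by roughly eight.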
Future Work
Investigate ways, implementable in CUDA, to reduce the memory requirements of stable fluids (e.g., data structures, dynamic memory, etc.).
Reduce the dissipation caused by numerical error in semi-Lagrangian advection by using MacCormack advection instead.
Implement and analyse the performance of a CPU-based multi-core version of 3D stable fluids.
Demo
Questions???

ICCSA 2010 Conference Presentation
