Determination of line tension in the 3D Ising model on GPUs

This is the talk I gave at the 2nd International Symposium
“Computer Simulations on GPU” (SimGPU 2013)

Published in: Technology, Sports

Transcript

  • 1. Determination of line tension in the 3D Ising model on GPUs. Benjamin Block, Tobias Preis, David Winter, Suam Kim, Peter Virnau, Kurt Binder. University of Mainz, Institute for Physics. SimGPU 2013
  • 2. Topics Touched: 1. Ising Model on GPU
  • 3. Topics Touched: 1. Ising Model on GPU 2. Line Tension Estimation
  • 4. Ising Model: Ordered, Random, Transition; + nearest neighbor interaction
  • 5. Monte Carlo: Perform successive spin flips! Probability: Metropolis criterion. Inherently serial... but
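The Metropolis criterion named on slide 5 can be sketched as follows. This is a minimal serial illustration, not the authors' code: it assumes coupling J = 1, zero external field and a small periodic 2D lattice, and the names `metropolis_accept` and `sweep` are hypothetical.

```python
import math
import random


def metropolis_accept(delta_e, beta, rng):
    """Accept a proposed flip with probability min(1, exp(-beta * delta_e))."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-beta * delta_e)


def sweep(spins, beta, rng):
    """One serial sweep over an L x L periodic lattice of +1/-1 spins (J = 1 assumed)."""
    size = len(spins)
    for i in range(size):
        for j in range(size):
            neighbors = (
                spins[(i - 1) % size][j] + spins[(i + 1) % size][j]
                + spins[i][(j - 1) % size] + spins[i][(j + 1) % size]
            )
            # Energy change if spin (i, j) is flipped: dE = 2 * s * sum(neighbors).
            delta_e = 2 * spins[i][j] * neighbors
            if metropolis_accept(delta_e, beta, rng):
                spins[i][j] = -spins[i][j]


rng = random.Random(0)
lattice = [[1] * 8 for _ in range(8)]
# Very low temperature (large beta): the ordered state should persist.
sweep(lattice, beta=10.0, rng=rng)
```

Each flip depends on the current state of its neighbors, which is why the slide calls the procedure inherently serial.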
  • 6. GPU Implementation: • GPUs: massively parallel processing (T. Preis, P. Virnau, W. Paul, J. J. Schneider: GPU Accelerated Monte Carlo Simulation of the 2D and 3D Ising Model, J. Comp. Phys., 228 (2009)) • Architecture specific optimization • Multi GPU implementation
  • 7. Parallelization of Lattice Updates. Idea: Update non-interacting domains in parallel. Checkerboard Update
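The checkerboard idea from slide 7 can be illustrated with a short sketch. It assumes a periodic 2D lattice with even side length and J = 1, and it emulates the parallel update with ordinary loops; the function names are hypothetical and this is not the GPU kernel from the talk.

```python
import math
import random


def checkerboard_sweep(spins, beta, rng):
    """Update all sites of one colour, then all sites of the other colour.

    The side length must be even so the two-colouring stays consistent
    across the periodic boundary.
    """
    size = len(spins)
    for color in (0, 1):
        # Sites of one colour only have neighbours of the other colour, so every
        # update inside this double loop is independent and could run in parallel.
        for i in range(size):
            for j in range(size):
                if (i + j) % 2 != color:
                    continue
                neighbors = (
                    spins[(i - 1) % size][j] + spins[(i + 1) % size][j]
                    + spins[i][(j - 1) % size] + spins[i][(j + 1) % size]
                )
                delta_e = 2 * spins[i][j] * neighbors
                if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                    spins[i][j] = -spins[i][j]


def same_color_neighbor_pairs(size):
    """Count nearest-neighbour pairs that share a colour (0 means the scheme is valid)."""
    count = 0
    for i in range(size):
        for j in range(size):
            for di, dj in ((1, 0), (0, 1)):
                ni, nj = (i + di) % size, (j + dj) % size
                if (i + j) % 2 == (ni + nj) % 2:
                    count += 1
    return count


rng = random.Random(1)
lattice = [[1] * 8 for _ in range(8)]
checkerboard_sweep(lattice, beta=10.0, rng=rng)
```

`same_color_neighbor_pairs` makes the "non-interacting domains" condition explicit: for an even side length no two neighbours share a colour, while an odd side length breaks the colouring at the periodic boundary.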
  • 8. Reduce slow memory access
  • 9. Reduce slow memory access: uint4 blocks in global memory. Idea: Store spins in 128 bit (uint4) chunks
  • 10. Reduce slow memory access: uint4 blocks in global memory. Idea: Store spins in 128 bit (uint4) chunks. Access 128 spins with one memory lookup
  • 11. Reduce slow memory access: uint4 blocks in global memory, one thread. Idea: Store spins in 128 bit (uint4) chunks. Access 128 spins with one memory lookup. Extract spins in local thread memory (registers) for computation
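A small sketch of the 128-bit storage idea, emulating a uint4 as four 32-bit words. The bit convention (spin up = 1, spin down = 0, spin k in bit k % 32 of word k // 32) and the function names are assumptions made for illustration; the slides do not specify the layout.

```python
def pack_spins(spins):
    """Pack 128 spins (+1 -> bit 1, -1 -> bit 0) into four 32-bit words, like a uint4."""
    assert len(spins) == 128
    words = [0, 0, 0, 0]
    for k, s in enumerate(spins):
        if s == 1:
            words[k // 32] |= 1 << (k % 32)
    return tuple(words)


def unpack_spins(words):
    """Extract the 128 spins again, as a thread would do into its registers."""
    return [1 if (words[k // 32] >> (k % 32)) & 1 else -1 for k in range(128)]


example = [1 if k % 3 == 0 else -1 for k in range(128)]
chunk = pack_spins(example)          # one "memory lookup" now carries 128 spins
restored = unpack_spins(chunk)
```

The point of the packing is bandwidth: one chunk read replaces 128 separate spin reads, and all further work happens on the unpacked values.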
  • 12. Update scheme: uint4
  • 13. Update scheme: uint4
  • 14. Update scheme: uint4. Extract chunk in thread
  • 15. Update scheme: uint4. Extract chunk in thread. Perform computations (draw random number, evaluate Metropolis criterion)
  • 16. Update scheme: uint4. Extract chunk in thread. Perform computations (draw random number, evaluate Metropolis criterion). Update pattern
  • 17. Update scheme: uint4. Extract chunk in thread. Perform computations (draw random number, evaluate Metropolis criterion). Old spins XOR update pattern = new spins
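The last step of the update scheme, old spins XOR update pattern = new spins, can be shown on a short word. The sketch uses 8 bits instead of 128 for readability; the flip decisions are given by hand here, whereas in the scheme above they would come from the Metropolis test. `build_update_pattern` is a hypothetical helper name.

```python
def build_update_pattern(flip_decisions):
    """Set bit k of the pattern when spin k should be flipped."""
    pattern = 0
    for k, flip in enumerate(flip_decisions):
        if flip:
            pattern |= 1 << k
    return pattern


old_spins = 0b10110010
# Decisions for spins 0..7: flip spin 0 and spin 3, keep the rest.
decisions = [True, False, False, True, False, False, False, False]
pattern = build_update_pattern(decisions)
new_spins = old_spins ^ pattern      # a single XOR applies all flips at once
```

Collecting the decisions in a pattern means the chunk is written back only once, and applying the same pattern a second time restores the old spins.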
  • 18. Multispin Coding? • Multiple spins are coded in memory unit (128 spins in 128 bit)
  • 19. Multispin Coding? • Multiple spins are coded in memory unit (128 spins in 128 bit) • Computation is not done on encoded spins in parallel but serial in each chunk
  • 20. Multispin Coding? • Multiple spins are coded in memory unit (128 spins in 128 bit) • Computation is not done on encoded spins in parallel but serial in each chunk • Multispin coding algorithms designed for CPUs were not efficient on GPU
  • 21. Multispin Coding? • Multiple spins are coded in memory unit (128 spins in 128 bit) • Computation is not done on encoded spins in parallel but serial in each chunk • Multispin coding algorithms designed for CPUs were not efficient on GPU. Why??
  • 22. Multispin Coding
  • 23. Array of spins (1 bit = 1 spin)
  • 24. Array of spins (1 bit = 1 spin). MC step: ?
  • 25. Array of spins (1 bit = 1 spin). MC step: ?
  • 26. Array of spins (1 bit = 1 spin). MC step: ? In advance:
  • 27. Array of spins (1 bit = 1 spin). MC step: ? Pooled random patterns. Neighbors (bitwise). Judgement function (for each energy level):
  • 28. Array of spins (1 bit = 1 spin). MC step: ? Pool of random patterns
  • 29. Array of spins (1 bit = 1 spin). MC step: ? Select one pattern randomly. Construct update pattern
  • 30. Array of spins (1 bit = 1 spin) XOR
  • 31. Array of spins (1 bit = 1 spin) XOR = Spins for next step
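The multispin coding steps on slides 22 to 31 (bitwise neighbors, a pool of precomputed random patterns, an update pattern, a final XOR) can be sketched in a simplified setting. The following is one possible variant on a 1D ring of 64 spins with J = 1, chosen because it needs only one random pattern type; it is an assumption-laden illustration, not the 2D scheme from the talk, and all names are hypothetical.

```python
import math
import random

N = 64                                        # spins per word, one bit per spin (assumed)
MASK = (1 << N) - 1
EVEN = sum(1 << k for k in range(0, N, 2))    # the two sublattices of the ring
ODD = MASK ^ EVEN


def rotl(w):
    """Bit k of the result holds spin k - 1 (periodic)."""
    return ((w << 1) | (w >> (N - 1))) & MASK


def rotr(w):
    """Bit k of the result holds spin k + 1 (periodic)."""
    return ((w >> 1) | (w << (N - 1))) & MASK


def make_pool(beta, pool_size, rng):
    """Precompute words whose bits are 1 with probability exp(-4 * beta)."""
    p = math.exp(-4.0 * beta)
    return [sum(1 << k for k in range(N) if rng.random() < p)
            for _ in range(pool_size)]


def mc_step(w, pool, sublattice, rng):
    """Update one sublattice of the ring with bitwise operations only."""
    anti_left = w ^ rotl(w)       # 1 where a spin is antiparallel to its left neighbour
    anti_right = w ^ rotr(w)      # 1 where a spin is antiparallel to its right neighbour
    r = rng.choice(pool)          # select one pooled pattern randomly
    # dE <= 0 whenever at least one neighbour is antiparallel: always flip.
    # dE = +4 when both neighbours are parallel: flip only where the random bit is set.
    update = (anti_left | anti_right | r) & sublattice
    return w ^ update


rng = random.Random(2)
pool = make_pool(beta=0.5, pool_size=16, rng=rng)
w = mc_step(MASK, pool, EVEN, rng)
w = mc_step(w, pool, ODD, rng)
```

Because every step draws from the same small pool, the random bits are reused, which is the loss of randomness the following slides list as a downside.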
  • 32. Downsides of Pooling: • Impairs quality of simulation (the smaller the pool the less random)
  • 33. Downsides of Pooling: • Impairs quality of simulation (the smaller the pool the less random) • Low flexibility (external fields...)
  • 34. Downsides of Pooling: • Impairs quality of simulation (the smaller the pool the less random) • Low flexibility (external fields...) • Relies on a lot of precomputation and random memory lookups (GPU killer)
  • 35. Performance: CPU simple, CPU multispin coding, GPU simple, GPU optimized; ~20x, ~200x. Results from 2011, 2D Ising. GPU: NVIDIA Tesla S1070. CPU: Intel i7 (2.67 GHz, 1 core)
  • 36. Performance: CPU simple, CPU multispin coding, GPU simple, GPU optimized; ~20x. Results from 2011, 2D Ising. GPU: NVIDIA Tesla S1070. CPU: Intel i7 (2.67 GHz, 1 core)
  • 37. Performance: CPU simple, CPU multispin coding, GPU simple, GPU optimized; ~20x; 8x, still one core! Results from 2011, 2D Ising. GPU: NVIDIA Tesla S1070. CPU: Intel i7 (2.67 GHz, 1 core)
  • 38. Performance: CPU simple, CPU multispin coding, GPU simple, GPU optimized. Results from 2011, 2D Ising. GPU: NVIDIA Tesla S1070. CPU: Intel i7 (2.67 GHz, 1 core)
  • 39. Performance: CPU simple, CPU multispin coding, GPU simple, GPU optimized; ~20x. Results from 2011, 2D Ising. GPU: NVIDIA Tesla S1070. CPU: Intel i7 (2.67 GHz, 1 core)
  • 40. Performance: CPU simple, CPU multispin coding, GPU simple, GPU optimized; ~20x, ~200x. Results from 2011, 2D Ising. GPU: NVIDIA Tesla S1070. CPU: Intel i7 (2.67 GHz, 1 core)
  • 41. Simulation on multiple GPUs: Spread spin lattice over many GPUs in different machines. Exchange border information between machines via MPI
  • 42. Simulation Domains per GPU; Border Arrays
  • 43. Multi-GPU Performance. Measure: Single spin flips per GPU. Communication overhead: Bottleneck for small system sizes
  • 44. • 64 GPUs: 256 GB video memory • Enough for a lattice of 800,000 x 800,000 spins • One lattice sweep: 3 seconds on pre-Fermi (S1070) hardware
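The figures on slide 44 can be checked with a few lines of arithmetic. The sketch assumes decimal gigabytes and one bit per spin (as in the 128-bit packing described earlier); the derived per-GPU rate is an estimate from these numbers, not a value stated in the talk.

```python
gpus = 64
total_memory_bytes = 256 * 10**9      # 256 GB as stated on the slide (decimal GB assumed)
side = 800_000
spins = side * side                   # 6.4e11 spins
lattice_bytes = spins // 8            # 1 bit per spin (assumed)
sweep_seconds = 3

per_gpu_memory_gb = total_memory_bytes / gpus / 10**9    # memory per GPU
lattice_gb = lattice_bytes / 10**9                       # size of the packed lattice
flips_per_second = spins / sweep_seconds                 # attempted flips, all GPUs
flips_per_second_per_gpu = flips_per_second / gpus

print(f"{per_gpu_memory_gb:.1f} GB per GPU, lattice needs {lattice_gb:.1f} GB")
print(f"{flips_per_second_per_gpu:.2e} attempted spin flips per second per GPU")
```

Under these assumptions the packed lattice occupies 80 GB of the 256 GB, and one sweep in 3 seconds corresponds to roughly 3.3 billion attempted spin flips per second per GPU.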
  • 45. ?
  • 46. ? OpenCL??
  • 47. Platform independence
  • 48. Kernels. Idea: Hide language differences in macros
  • 49. Macros expand to different expressions on each platform: • CUDA (Driver API) • OpenCL • Host C
  • 50. Initialization: • Initialize • Load “Device Programs” (Kernels) from source • Create Data Containers that take care of data
  • 51. Run kernel with parameters. Use data on host
  • 52. Cross platform performance. CPU: i7 Nehalem. Nvidia: Geforce GTX580. AMD: HD 6970. 3D Ising example
  • 53. Results
  • 54. Results: • Downside: Lowest common denominator (CUDA has a lot more features by now)
  • 55. Results: • Downside: Lowest common denominator (CUDA has a lot more features by now) • No explicit copying needed (the containers' job)
  • 56. Results: • Downside: Lowest common denominator (CUDA has a lot more features by now) • No explicit copying needed (the containers' job) • In our case: OpenCL was 10% slower on NVIDIA card (Geforce GTX580)
  • 57. Results: • Downside: Lowest common denominator (CUDA has a lot more features by now) • No explicit copying needed (the containers' job) • In our case: OpenCL was 10% slower on NVIDIA card (Geforce GTX580) • slower on comparable AMD card (Radeon HD 6970)
  • 58. Results: • Downside: Lowest common denominator (CUDA has a lot more features by now) • No explicit copying needed (the containers' job) • In our case: OpenCL was 10% slower on NVIDIA card (Geforce GTX580) • slower on comparable AMD card (Radeon HD 6970) • Take this with a grain of salt
  • 59. Nucleation
  • 60. Nucleation phenomena: • Nucleation important in materials research, atmosphere, etc.
  • 61. Nucleation: Phase 1, Phase 2
  • 62. Nucleation: Phase 1, Phase 2. Induced by nuclei!
  • 63. Most spins up; most spins down
  • 64. Heterogeneous Nucleation: Wall attached droplet
  • 65. =
  • 66. Simulation in the Ising Model. Winter D., Virnau P., Binder K., PRL Volume 103 Issue 22 (2009)
  • 67. Young
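The equation shown on this slide did not survive in the transcript. For orientation, Young's equation in its standard textbook form relates the contact angle θ of a wall-attached droplet to the three interfacial tensions. The subscripts below (w for the wall, 1 and 2 for the two coexisting phases) are notation chosen here and may differ from the slide.

```latex
\gamma_{w1} - \gamma_{w2} = \gamma_{12} \cos\theta
```

For θ = 90° the right-hand side vanishes, so the wall has the same interfacial tension with both phases.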
  • 68. Free Energy of Droplet: H=0, Θ=90°. Winter D., Virnau P., Binder K., PRL Volume 103 Issue 22 (2009)
  • 69. Young
  • 70. Line Contribution
  • 71. Line Contribution
  • 72. A different method...
  • 73. A different method... Surface field H > 0 which tilts interface
  • 74. A different method... Surface field H > 0 which tilts interface
  • 75. A different method... Antiperiodic Boundary Conditions force and stabilize an interface. Surface field H > 0 which tilts interface
  • 76. A different method... Antiperiodic Boundary Conditions force and stabilize an interface. Surface field H > 0 which tilts interface. Angle is limited by geometry...
  • 77. Flatten geometry (Lx, Ly, Lz): Flattened geometry in dimension X allows for stronger tilt
  • 78. Boundary Condition Implementation: Simulate one extra chunk in each dimension
  • 79. Boundary Condition Implementation. Periodic: Exchange borders
  • 80. Boundary Condition Implementation. APBC: Read, XOR 1, Write
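The two boundary treatments on slides 79 and 80 differ only in one XOR. A minimal sketch on a 1D row of 0/1 spin bits, with one ghost cell per side standing in for the "extra chunk"; the function name and the single-bit granularity are assumptions for illustration.

```python
def fill_ghost_cells(row, antiperiodic):
    """Return the row extended by one ghost cell on each side.

    Periodic: each ghost cell copies the spin from the opposite border.
    Antiperiodic: the copy is inverted (XOR 1) on the way, which forces an interface.
    """
    flip = 1 if antiperiodic else 0
    left_ghost = row[-1] ^ flip       # read opposite border, XOR, write
    right_ghost = row[0] ^ flip
    return [left_ghost] + row + [right_ghost]


periodic = fill_ghost_cells([1, 1, 0], antiperiodic=False)
antiperiodic = fill_ghost_cells([1, 1, 0], antiperiodic=True)
```

With spins packed into chunks, the same step would presumably XOR the copied chunk with an all-ones mask instead of a single 1.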
  • 81. Thermodynamic integration: • Vary box size in all dimensions • Measure Free Energies of surfaces by integration over magnetization
  • 82. • Expressions can be derived for the Free Energy differences in each dimension. Young’s Equation. (1) (2) (3)
  • 83. • Expressions can be derived for the Free Energy differences in each dimension. Young’s Equation. Combination of the first two expressions allows extraction of Line Tension. (1) (2) (3)
  • 84. • Which can be combined to an expression for the line tension: (1) (2) (3)
  • 85. Putting it together- -
  • 86. Kim et al. (2011), T=3.0
  • 87. Density Profile: Side view, Top view. 3D System: 56x120x120 spins
  • 89. Conclusion
  • 90. Conclusion: • Direct method to measure line tension for tilted surfaces
  • 91. Conclusion: • Direct method to measure line tension for tilted surfaces • Our first real world use of the Ising Model on GPUs
  • 92. Conclusion: • Direct method to measure line tension for tilted surfaces • Our first real world use of the Ising Model on GPUs • Optimization is important (CPU and GPU) for fair comparison
  • 93. Conclusion: • Direct method to measure line tension for tilted surfaces • Our first real world use of the Ising Model on GPUs • Optimization is important (CPU and GPU) for fair comparison • Platform independence is possible (useful?)
  • 94. Conclusion: • Direct method to measure line tension for tilted surfaces • Our first real world use of the Ising Model on GPUs • Optimization is important (CPU and GPU) for fair comparison • Platform independence is possible (useful?) • The Ising model is a good candidate for parallel processing on GPU clusters
  • 95. Publications: • Monte Carlo Test of the Classical Theory for Heterogeneous Nucleation Barriers. Winter D., Virnau P., Binder K., Phys. Rev. Lett. 103, 22 (2009) • Multi-GPU Accelerated Multi-Spin Monte Carlo Simulations of the 2D Ising model. Block B., Virnau P., Preis T., Computer Physics Communications, Volume 181, Issue 9 (2010) • Monte Carlo Methods for Estimating Interfacial Free Energies and Line Tensions. Binder K., Block B., Das S. K., Virnau P., Winter D., J. Stat. Phys. (2011) • Platform independent, efficient implementation of the Ising model on parallel acceleration devices. Block B. J., Eur. Phys. J. Spec. Top. (2012)
