Writing distributed N-body code using distributed FFT - 1

  1. Use of distributed FFT for writing a fully distributed N-body code for cosmological applications. Supervisors: Dr. S. Sanyal, IIIT Allahabad, and Dr. J. S. Bagla, HRI Allahabad. Kalpana Roy, R200513
  2. Motivation. The classical N-body problem simulates the evolution of a system of N bodies, where the force exerted on each body arises from its interaction with all the other bodies in the system. It is used in cosmology to study processes of structure formation, such as the dynamical evolution of star clusters under the influence of physical forces. Given the initial conditions of the bodies, i.e. their initial masses, positions and velocities, an N-body code computes their subsequent positions and motions, evaluating intermediate values over timesteps and updating. The particle-particle interactions require on the order of N² calculations, which is prohibitively expensive in practice.
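The order-N² cost mentioned above comes from summing every pairwise interaction directly. A minimal NumPy sketch of such a direct summation (illustrative only: the function name, softening parameter and units are assumptions, and the project replaces this approach with an FFT-based particle-mesh solver):

```python
import numpy as np

def direct_forces(pos, mass, G=1.0, eps=1e-3):
    """O(N^2) direct-summation gravitational acceleration.

    pos: (N, 3) positions; mass: (N,) masses; eps softens close encounters.
    """
    N = len(mass)
    acc = np.zeros_like(pos)
    for i in range(N):
        d = pos - pos[i]                    # vectors from body i to every body
        r2 = (d ** 2).sum(axis=1) + eps ** 2  # softened squared distances
        r2[i] = 1.0                         # placeholder to avoid 0/0 on the self term
        inv_r3 = r2 ** -1.5
        inv_r3[i] = 0.0                     # exclude the self-force
        acc[i] = G * (mass[:, None] * d * inv_r3[:, None]).sum(axis=0)
    return acc
```

Because the softened pairwise forces remain antisymmetric, the total momentum change Σᵢ mᵢaᵢ vanishes, which is a useful sanity check on any force module.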
  3. Hence the need for optimisation: Fast Fourier Transforms are used, which reduce the calculation time to the order of N log N. Even then, large volumes of data are generated, and an N-body calculation takes an excessively long time even on the fastest of computers [2]. As a solution, the computations are done on distributed systems. The task is divided among the available processors/systems, which perform calculations on their local data. As the calculations occur in parallel, the time required decreases. Hence, using a distributed FFT to write a fully distributed N-body code provides the advantage of faster calculation at a comparatively lower cost.
  4. Problem Definition. Each N-body code has two basic modules: one calculates the total force acting on each body, given the configuration of particles, and the other moves the particles in this force field. The project deals with calculating the force field from the initial conditions and moving the particles under that force. The data will be decomposed, stored in the local memory of each distributed machine, and processed. The processed local data of all the machines will then be combined, yielding the desired N-body code.
  5. Flowchart of an N-body code: initial conditions are set up for the model of interest; compute the forces for the given particle positions; move the particles by one step; if t = t_fin, write the output to file, otherwise recompute the forces and repeat.
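The loop in the flowchart above can be sketched as a simple time-stepping driver (a hedged sketch: the `compute_forces` callback, the first-order update and all names are illustrative assumptions, not the project's actual integration scheme):

```python
import numpy as np

def run_nbody(pos, vel, mass, dt, t_fin, compute_forces):
    """Sketch of the flowchart's main loop: compute forces for the current
    particle positions, move the particles by one step, and repeat until
    t = t_fin, at which point the state would be written to file."""
    t = 0.0
    while t < t_fin:
        acc = compute_forces(pos, mass)  # "compute forces" box
        vel = vel + dt * acc             # "move the particles by one step"
        pos = pos + dt * vel
        t += dt
    return pos, vel                      # "write output to file" stage
```

Any force module with the same signature (direct summation, or the FFT-based particle-mesh solver developed in this project) can be plugged in as `compute_forces`.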
  6. Technologies Used. FFTW (the Fastest Fourier Transform in the West) is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, of arbitrary input size, and of both real and complex data. The FFTW package was developed at MIT by Matteo Frigo and Steven G. Johnson, and its libraries can be used from codes written in C, C++ and Fortran. Here it is used for solving the Poisson equation for the gravitational potential and calculating the force via the Fourier transform. By default, both the forward and inverse Fourier transforms are done out-of-place; FFTW also provides in-place transforms, with the same input and output arrays.
  7. The FFTW routines store multi-dimensional arrays in row-major format. FFTW does not normalize the data implicitly, so performing a forward transform of some data and then an inverse transform of the result yields the original data multiplied by the size of the array. FFTW also supports MPI (Message Passing Interface) operations, allowing for distributed-memory parallelism, where each CPU has its own separate memory; this can scale up to clusters of many thousands of processors. This is desirable for the project, as the data is huge and will not fit in the memory of a single processor. In MPI, the data is divided among a set of "processes", each running in its own memory address space.
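FFTW's unnormalized convention can be illustrated with NumPy, whose forward `fft` is likewise unnormalized (NumPy's `ifft` divides by N, so an FFTW-style unnormalized inverse is recovered by multiplying back by N):

```python
import numpy as np

N = 8
x = np.arange(N, dtype=float)

X = np.fft.fft(x)       # unnormalized forward transform (same convention as FFTW)
y = N * np.fft.ifft(X)  # undo NumPy's 1/N so the inverse is unnormalized, as in FFTW

# An unnormalized forward + inverse round trip scales the data by the array size:
assert np.allclose(y.real, N * x)
# Dividing by the array size recovers the original data:
assert np.allclose(y.real / N, x)
```

The same factor appears per dimension for multi-dimensional transforms, which is why the project's transforms are followed by an explicit normalisation step.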
  8. PMFAST is a particle-mesh N-body code, written in Fortran 90 and aimed at large-scale-structure cosmological simulations [5]. It offers support for distributed-memory systems through MPI, as well as a parallel initial-condition generator.
  9. Plan of Work. The project comprises writing an N-body code that takes input conditions, solves the potential equation in k-space, calculates the force, and simulates over timesteps, calculating the intermediate positions and other attributes. As the major task here is solving the equation in k-space using the Fourier transform, the following steps are followed. The force and the gravitational potential are related as F = -∇Φ. Finding the potential Φ is easy because of the Poisson equation, ∇²Φ = 4πGρ, where G is Newton's constant and ρ is the density (the number of particles at the mesh points).
  10. It is straightforward to solve for Φ by using the fast Fourier transform to go to the frequency domain, where the Poisson equation has the simple form -k²Φ(k) = 4πGρ(k), i.e. Φ(k) = -4πGρ(k)/k². The gravitational field can now be found by multiplying by ik and computing the inverse Fourier transform.
• The first step of the project was to take 1-dimensional real data and calculate the error obtained by using FFTW for a forward and then a subsequent inverse transform, followed by normalisation.
– g(x) = exp(-(x - N/2)²/(2σ²)), x ranging from 1 to N
– ∂²g/∂x² = ((x - N/2)²/σ² - 1)·g(x)/σ² = f(x), say
– f(x) → F(k) [forward Fourier transform]
– F(k)/(-k²) → g(x) [inverse Fourier transform]
where k² = kx² + ky² + kz² for 3-dimensional data; in the current 1-d case, k² = kx²
– kx = 2π/N · i, for i ≤ N/2
– kx = 2π/N · (N - i), for i > N/2
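The k-space solve above can be checked with a short NumPy sketch (illustrative only; the project uses FFTW, but NumPy's forward FFT follows the same unnormalized convention and `np.fft.fftfreq` reproduces the k-grid defined above). It builds the Gaussian g, its analytic second derivative f, divides the forward transform by -k², and inverse transforms; the k = 0 mode is indeterminate (division by zero), so it is set to zero and the lost mean of g is added back afterwards:

```python
import numpy as np

N, sigma = 1024, 5.0
x = np.arange(N)
g = np.exp(-(x - N / 2) ** 2 / (2 * sigma ** 2))            # test function g(x)
f = ((x - N / 2) ** 2 / sigma ** 2 - 1) * g / sigma ** 2    # analytic f(x) = g''(x)

k = 2 * np.pi * np.fft.fftfreq(N)  # k = 2*pi*i/N for i <= N/2, falling branch above
F = np.fft.fft(f)                  # forward transform: f(x) -> F(k)

Phi = np.zeros_like(F)
nonzero = k != 0
Phi[nonzero] = F[nonzero] / (-k[nonzero] ** 2)  # F(k)/(-k^2); k = 0 mode set to 0

g_rec = np.fft.ifft(Phi).real      # inverse transform back to g(x)
g_rec += g.mean()                  # restore the constant lost with the k = 0 mode
```

For a well-resolved Gaussian (σ = 5 grid spacings, N = 1024) the reconstruction agrees with g to near machine precision; the larger figures reported on the next slides likely reflect the relative error metric used there, which divides by g(i)² and so is dominated by the tiny tail values.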
  11. • Calculated the dependence of the error on the values of σ and N.
Error = Σ_{i=1}^{N} (g_obtained(i) - g(i))² / g(i)²
– Error(N = 256) = 0.077926
– Error(N = 512) = 0.043631
– Error(N = 1024) = 0.0264835, keeping σ = 5 constant.
  12. – Error(σ = 5) = 0.0264835
– Error(σ = 10) = 0.043631
– Error(σ = 15) = 0.0607785, keeping N = 1024 constant.
Hence, it is deduced that the error increases with increasing σ but decreases as N increases.
• Performed multi-dimensional fast Fourier transforms of real and complex data. In this case the real part of the complex data was set equal to the real data and the imaginary part was set to zero, so that both the real and complex transforms were done on the same data.
  13. [Figure: 2-d complex transform (above) and real transform (below)]
  14. • After the successful completion of out-of-place transforms, in-place transforms were done, as they are useful in the project.
• The next step is to perform the in-place transforms using distributed-memory parallelism. The afore-mentioned work was done before mid-semester.
• The work to be done now is to run the same MPI programs with very large N values on a 32-node cluster, each node having 16 GB of RAM and a quad-core processor. The task will be to plot time against the number of processes for a particular N value and find the optimal number of processes for which the execution time is minimised.
  15. • The next step is to store the data required by each process in the local memory of the process itself and then repeat the above. This will reduce the storage requirements, and the data size can now be extremely large, as it will not depend on the storage of one processor alone.
• After the optimisation of the Fourier-transform functions, a particle-mesh-based N-body code, PMFAST, will be used, and the force computations will be done using the developed distributed-memory Fourier-transform codes.
• With the help of the force computations, the particles will be moved accordingly, and subsequent calculations will be done iteratively over timesteps to obtain the final attributes of the particles.
  16. References
1. J. S. Bagla 2001, Cosmological N-Body Simulations, Resource Summary, Khagol 48, 5
2. J. S. Bagla, Cosmological N-Body Simulations, Gravitational Clustering in an Expanding Universe - http://www.hri.res.in/~jasjeet/thesis.html
3. FFTW - Fastest Fourier Transform in the West - http://www.fftw.org
4. The Message Passing Interface (MPI) standard - http://www-unix.mcs.anl.gov/mpi/
5. PMFAST - http://www.cita.utoronto.ca/~merz/pmfast/
6. Wikipedia - The online encyclopedia
