Phase Unwrapping and Affine
Transformations using CUDA

    Perhaad Mistry, Sherman Braganza,
            David Kaeli, Mir...
Keck 3D Fusion Microscope
Motivation




                      Unwrapped Phase Image
Wrapped Phase Image
                                           ...
Motivation
Phase Unwrapping used for research in In-Vitro
Fertilization (IVF) in Optical Quadrature Microscopy
Cell counts...
Outline
Optical Quadrature Microscopy
Affine Transformations
Minimum LP Norm Phase Unwrapping
MEX Interface Usage
Performa...
OQM Imaging
                                                      Pattern determined by phase
                 Camera 0
  ...
Affine Transformation
Align images to fix sample’s
position in different images                                           ...
Noise Removal & Phase Information
Separate “noise” only image available (Dark Image)
Dark voltage image subtracted from th...
Example Images for Affine Transform




 Mixed Image      Signal only Image
                                      After Af...
Phase Unwrapping
Frame                       Phase
             Affine
Grabber                     Unwrapping
            ...
MATLAB’s Unwrap v/s Minimum LP Norm




                                               Unwrap Using Minimum Lp
 Unwrap Usi...
P
    Minimum             L     Norm Algorithm
Minimize difference                = (φi +1, j − φi , j )Ui , j + (φi +1, j...
Preconditioned Conjugate Gradient
Solve un-weighted least ε 2 = M −2 N−2 (φ − φ − ∆x )2
                               ∑ ∑...
Better DCT Implementation
Implementing N point DCT using DFT uses 2N point DFT
Another method seen in [1]
   Rearrange x(n...
Minimum                  L      Norm Algorithm
                              P

for k ← 0 to LP Norm Count
   Calculate Da...
Example Images – Phase Unwrapping




Wrapped Phase Data                   Unwrapped Phase Data
                Images of ...
Example Images – Mouse Embryo




  Wrapped Phase Data   Unwrapped Phase Data
                                            ...
MATLAB External Interface (MEX)
MEX produces linked         Frame Grabber
objects from C code             Image
          ...
Performance (Phase Unwrapping)
                       Baseline                  GPU      MEX and GPU   Mex & GPU
         ...
Performance Results
Future Work
Study implementation of Conjugate Gradient
Study scatter operations in CUDA
   Present literature shows improv...
Conclusions
Implemented the computationally intensive Minimum Lp
Norm Phase Unwrapping and Affine Transforms on the
GPU
No...
Thank You

  Questions?

Perhaad Mistry
(pmistry@ece.neu.edu)
Upcoming SlideShare
Loading in …5
×

Accelarating Optical Quadrature Microscopy Using GPUs

1,273 views

Published on

0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
1,273
On SlideShare
0
From Embeds
0
Number of Embeds
7
Actions
Shares
0
Downloads
8
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Accelarating Optical Quadrature Microscopy Using GPUs

  1. 1. Phase Unwrapping and Affine Transformations using CUDA Perhaad Mistry, Sherman Braganza, David Kaeli, Miriam Leeser ECE Department Northeastern University Boston, MA
  2. 2. Keck 3D Fusion Microscope
  3. 3. Motivation Unwrapped Phase Image Wrapped Phase Image 3
  4. 4. Motivation Phase Unwrapping used for research in In-Vitro Fertilization (IVF) in Optical Quadrature Microscopy Cell counts obtained after unwrapping determine viability of embryos Results required at close to real time to see changes in sample Improve cost-effectiveness of OQM and quality of research 4
  5. 5. Outline Optical Quadrature Microscopy Affine Transformations Minimum LP Norm Phase Unwrapping MEX Interface Usage Performance Affine Phase OQM Frame Transforms Unwrapping Grabber Future Work Conclusion OQM Imaging System Layout 5
  6. 6. OQM Imaging Pattern determined by phase Camera 0 PBS difference between laser beams Camera 1 Camera 3 Mirror Phase calculated using magnitudes captured from four cameras NPBS Camera 2 Mixed (M), signal only(S), REF SIG reference(R) and dark (D) images Phase difference measured by Sample angle(E) ranges from –π to π 1 n=3 n M n − S n − Rn E = ∑i * 4 n=0 Rn From Mirror Laser angle(E) yields phase that is required for unwrapping Optical Quadrature Microscopy Layout 6
  7. 7. Affine Transformation Align images to fix sample’s position in different images Phase Affine Microscope’s Unwrapping Transforms Frame 2x2 and 2x1 matrices Grabber obtained from microscope Image pixels divided among Algorithm for Affine Transform [1] blocks of threads (x,y) = f(BlockId, ThreadID) Each thread calculates one Load Data = Image[x][y] pixel Includes - Rotation (θ), Scaling (s), Shearing (sh) ⎛ x' ⎞ ⎛ s(x)*cos(θ) sh(y)*sin(θ) ⎞ ⎛ x ⎞ ⎛ a ⎞ Implemented using CUDA ⎜ ⎟=⎜ ⎟⎜ ⎟+⎜ ⎟ y' ⎠ ⎝ -sh(x)*sin(θ) s(y)*cos(θ) ⎠ ⎝ y ⎠ ⎝ b ⎠ ⎝ since result needed for Phase Store Image[x'][y'] = Data Unwrapping (Uncoalesced W rites) [1] A. K. Jain, Fundamentals of digital image processing, Prentice Hall 7
  8. 8. Noise Removal & Phase Information Separate “noise” only image available (Dark Image) Dark voltage image subtracted from the mixed, signal and reference images Subtraction of signal image (Sn) and reference image (Rn) due to fixed pattern levels of individual cameras Mn = Mn − Dn and so on for Sn and Rn 1 n=3 n Mn − Sn − Rn E = ∑i * n: camera number 4 n =0 Rn phase = angle(E ) Square root of Rn to balance intensities for detection 8
  9. 9. Example Images for Affine Transform Mixed Image Signal only Image After Affine Transform and Noise Removal 9 Dark Image Reference Image
  10. 10. Phase Unwrapping Frame Phase Affine Grabber Unwrapping Transforms Optical phase difference from interferometric image is shown Parameter of interest is Optical path difference which can be more than π Unwrapping along a line: Sum till a gradient of π then, add 2π In 2D if noise is present Path dependencies while summing Wrapped Image Causes accumulation errors (note range of values) 10
  11. 11. MATLAB’s Unwrap v/s Minimum LP Norm Unwrap Using Minimum Lp Unwrap Using MATLAB’s unwrap() function Norm Algorithm [1] 11 [1] Dennis C. Ghiglia, Mark D. Pritt: Two-Dimensional Phase Unwrapping: Theory, Algorithms, and Software
  12. 12. P Minimum L Norm Algorithm Minimize difference = (φi +1, j − φi , j )Ui , j + (φi +1, j − φi , j )Vi , j ci , j between gradients of −(φi , j − φi −1, j )Ui −1, j − (φi , j − φi , j −1 )Vi , j wrapped and unwrapped = ∆ ix, j U (i , j ) − ∆ ix−1, j U (i − 1, j ) + ci , j data [1] ∆ iy, jV (i , j ) − ∆ iy−1, jV (i − 1 j ) , Nonlinear PDE since U and where V are data dependant φx ,y is the unwrapped phase at (x,y) weights ∆ x ,y and ∆ y ,y denote the wrapped phase in the PDE solved iteratively x x x and y direction respectively using the Preconditioned Conjugate Gradient Method Can be expressed as Qφ = c [1] Dennis C. Ghiglia, Mark D. Pritt. Two- Dimensional Phase Unwrapping: Theory, 12 Algorithms, and Software, Wiley New York
  13. 13. Preconditioned Conjugate Gradient Solve un-weighted least ε 2 = M −2 N−2 (φ − φ − ∆x )2 ∑ ∑ i +1, j i , j i , j square phase unwrapping i =0 j =0 problem M − 2 N −2 + ∑ ∑ (φi +1, j − φi , j − ∆iy, j )2 Minimize ε2 i =0 j =0 where φi , j is the wrapped phase at (i,j) and Preconditioning using a DCT needed ∆ix, j is the unwrapped phase at (i,j) After discretizing the equation Conjugate gradient and taking the 2D Fourier Transform method called after DCT Ρ m,n preconditioning Φ= 2cos(π m ) + 2cos(π n ) − 4 m,n M N where Φ and Ρ are the Fourier Transformed versions of φ and ∆ respectively 13
  14. 14. Better DCT Implementation Implementing N point DCT using DFT uses 2N point DFT Another method seen in [1] Rearrange x(n) into v(n) such that ⎡ N − 1⎤ v ( n ) = x (2 n ) 0≤n≤⎢ ⎥ ⎣2⎦ ⎡ N + 1⎤ = x (2 N − 2 n − 1) ⎢ 2 ⎥ ≤ n ≤ N −1 ⎣ ⎦ W4kN * Real(v(n))) DCT(x(n)) = 2*DFT( For 2D DCT: 4x less work since N*N instead of 2N*2N PCG is ~ 90% of LP Norm Execution time Shuffle kernel before CUFFT call [1] Makhoul J. A fast cosine transform in one and two dimensions. 14
  15. 15. Minimum L Norm Algorithm P for k ← 0 to LP Norm Count Calculate Data Derived Weights Preconditioning for j ← 0 to PCG Iteration Count before CG DCT To DFT Steps (Shuffle Kernel) Execute CUFFT Call Point-Wise Complex Multiplication to Get DCT Result PCG Scaling Step Algorithm Execute CUFFT Call to do iDCT Conjugate Gradient (Point Wise Matrix Operations) end for if Convergence exit end for 15
  16. 16. Example Images – Phase Unwrapping Wrapped Phase Data Unwrapped Phase Data Images of Glass Bead and water 16
  17. 17. Example Images – Mouse Embryo Wrapped Phase Data Unwrapped Phase Data 17
  18. 18. MATLAB External Interface (MEX) MEX produces linked Frame Grabber objects from C code Image Acquisition Tool Box MATLAB present in our system due to the frame MATLAB Legacy grabbers MATLAB Legacy Interfaces Interface Gateway function in C Yes makes MATLAB data (mxArray) visible to our Affine Transform Phase Wrapper(C) Using Unwrapping Save linked C code MEX Wrapper (C) Affine ? Using MEX No Affine Transform Phase & Noise Removal Unwrapping GPU GPU code Code(CUDA) (CUDA) 18 OQM Processing Control Flow
  19. 19. Performance (Phase Unwrapping) Baseline GPU MEX and GPU Mex & GPU GPU Time Time Speedup Time(sec) Speedup Preconditioning 11.17 1.2 9.3X 1.2 9.3X Conjugate Grad. 2.89 0.55 5.25X 0.55 5.25X IO Activity 0.7 0.7 1X 0.02 NA Miscellaneous 0.79 0.51 1.53X 0.41 1.92X 7.20X Total for Unwrap 15.55 2.97 5.24X 2.16 Baseline : Serial implementation using FFTW for the Fourier Transform 19
  20. 20. Performance Results
  21. 21. Future Work Study implementation of Conjugate Gradient Study scatter operations in CUDA Present literature shows improvements only for larger data sizes (may be better on G200) Multi-Gpu version for imaging scenarios where images may be grabbed at faster rates Presently only one image at a time acquired. Phase Unwrapping also required for applications like Synthetic Aperture Radar (SAR) 21
  22. 22. Conclusions Implemented the computationally intensive Minimum Lp Norm Phase Unwrapping and Affine Transforms on the GPU Not a batch-oriented process Latency was critical, single image used at a time Reducing latency throughout the system improves quality of research and time to discovery in Optical Science Laboratory 22
  23. 23. Thank You Questions? Perhaad Mistry (pmistry@ece.neu.edu)

×