Implementing 3D SPHARM Surfaces
   Registration on Cell Processor

 Huian Li (huili@indiana.edu)                Mi Yan (miyan@us.ibm.com)
 Robert Henschel (rhensche@indiana.edu)      Li Shen (shenli@iupui.edu)



                             July 29, 2009
Contents
•   SPHARM registration
•   Matlab implementation
•   Cell implementation
•   Performance Analysis
•   Conclusion
SPHARM Surfaces
 • Radial and stellar surfaces
 • Simply connected, arbitrarily shaped
 • Vision, graphics, imaging, bioinformatics
SPHARM Expansion




   [Figure: area-preserving mapping from spherical parameters (θ, φ) to surface points (x, y, z)]
SHREC




   [Figure: (a) template, (b) object, (c) after ICP, (d) after
   registration of parameterization]
Calculation of coefficients
• After rotating the parameter net on the surface in
  Euler angles (α, β, γ), new coefficients will be:
   $$c_l^m(\alpha\beta\gamma) = \sum_{n=-l}^{l} D^l_{mn}(\alpha\beta\gamma)\, c_l^n$$

   where

   $$D^l_{mn}(\alpha\beta\gamma) = e^{(-i\alpha m - i\gamma n)} \Big( \sum_{t=\max(0,\,n-m)}^{\min(l+n,\,l-m)} (-1)^t\, d^l_{mnt}(\beta) \Big)$$

   and

   $$d^l_{mnt}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \Big(\cos\frac{\beta}{2}\Big)^{(2l+n-m-2t)} \Big(\sin\frac{\beta}{2}\Big)^{(2t+m-n)}$$
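As a quick sanity check on the rotation formulas (a worked case added for illustration, not taken from the slides), take l = 1 and m = n = 0. Assuming the factorial arguments must stay non-negative, t runs from max(0, 0) = 0 to min(1, 1) = 1, and the exponential prefactor is e^0 = 1:

```latex
\begin{aligned}
d^1_{000}(\beta) &= \frac{\sqrt{1!\,1!\,1!\,1!}}{1!\,1!\,0!\,0!}
  \Big(\cos\frac{\beta}{2}\Big)^{2}\Big(\sin\frac{\beta}{2}\Big)^{0}
  = \cos^2\frac{\beta}{2},\\
d^1_{001}(\beta) &= \frac{\sqrt{1!\,1!\,1!\,1!}}{0!\,0!\,1!\,1!}
  \Big(\cos\frac{\beta}{2}\Big)^{0}\Big(\sin\frac{\beta}{2}\Big)^{2}
  = \sin^2\frac{\beta}{2},\\
D^1_{00}(\alpha\beta\gamma) &= d^1_{000}(\beta) - d^1_{001}(\beta)
  = \cos^2\frac{\beta}{2} - \sin^2\frac{\beta}{2} = \cos\beta .
\end{aligned}
```

The result depends only on β, as expected for the m = n = 0 entry.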
RMSD
• RMSD (Root Mean Square Distance): distance
  between two SPHARM models

   $$\mathrm{RMSD} = \sqrt{\frac{1}{4\pi} \sum_{l=0}^{L_{max}} \sum_{m=-l}^{l} \left\| c_{1,l}^{m} - c_{2,l}^{m} \right\|^{2}}$$

   where $c_{1,l}^{m}$ and $c_{2,l}^{m}$ are the coefficients of the two
   SPHARM models
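A minimal Python sketch of the RMSD definition, assuming each coefficient is stored as a triple of complex numbers (one per x, y, z coordinate) in a dictionary keyed by (l, m). The storage layout and the function name `rmsd` are illustrative choices, not taken from the slides.

```python
import math

def rmsd(c1, c2):
    """Root mean square distance between two SPHARM models.

    c1, c2: dicts mapping (l, m) -> (cx, cy, cz), each component complex.
    This layout is a hypothetical choice made for the sketch.
    """
    total = 0.0
    for key, a in c1.items():
        b = c2[key]
        # squared norm of the coefficient difference, summed over x, y, z
        total += sum(abs(p - q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / (4.0 * math.pi))

# Two models that differ by one unit in a single coefficient component.
model_a = {(0, 0): (1 + 0j, 0j, 0j)}
model_b = {(0, 0): (0j, 0j, 0j)}
print(rmsd(model_a, model_b))
```

Because the distance is computed directly from the coefficients, no surface needs to be reconstructed to compare two models.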
Matlab implementation
• A straightforward implementation in Matlab:

     for l = 0, Lmax
       for m = -l, l
          for n = -l, l
             for t = max(0, n-m), min(l+n, l-m)
              ... performing calculations ...

• One rotation for Lmax = 50 took 823 seconds on a 2 GHz
  quad-core Intel Xeon E5335
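The four nested loops can be sketched in Python as follows. This is an illustrative transliteration, not the authors' Matlab code: the function name and the dictionary layout are assumptions, the upper limit of t is chosen so that every factorial argument stays non-negative, and the Euler angles are paired with m and n as e^{-i(αm + γn)}, which is also an assumption of the sketch. Direct factorials overflow double precision for large l, so the sketch is only meant for small Lmax.

```python
import cmath
import math

def rotate_coefficients(c, lmax, alpha, beta, gamma):
    """Direct four-loop evaluation of the rotated coefficients.

    c: dict mapping (l, n) -> complex coefficient (one coordinate component).
    Returns a dict mapping (l, m) -> rotated coefficient.
    """
    f = math.factorial
    cb, sb = math.cos(beta / 2.0), math.sin(beta / 2.0)
    out = {}
    for l in range(lmax + 1):
        for m in range(-l, l + 1):
            acc = 0j
            for n in range(-l, l + 1):
                # the square-root numerator does not depend on t
                num = math.sqrt(f(l + n) * f(l - n) * f(l + m) * f(l - m))
                d = 0.0
                for t in range(max(0, n - m), min(l + n, l - m) + 1):
                    den = f(l + n - t) * f(l - m - t) * f(t + m - n) * f(t)
                    d += ((-1) ** t * num / den
                          * cb ** (2 * l + n - m - 2 * t)
                          * sb ** (2 * t + m - n))
                # assumed pairing: alpha with m, gamma with n
                acc += cmath.exp(-1j * (alpha * m + gamma * n)) * d * c[(l, n)]
            out[(l, m)] = acc
    return out
```

With all three angles set to zero the function returns the input coefficients unchanged, which is a convenient first check.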
Cell B.E.
Cell implementation
• Domain decomposition:
     for l = 0, Lmax
       for m = -l, l
          for n = -l, l
             for t = max(0, n-m), min(l+n, l-m)
              ... calculations ...

• Decomposition along l leads to work load
  imbalance among SPUs

 • Decomposition along m creates unnecessary data
   communication
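The imbalance from splitting along l can be made concrete by counting innermost iterations. The Python sketch below assumes contiguous blocks of l values are handed to each SPU and uses t limits that keep the factorial arguments non-negative; the helper names `work_per_l` and `block_loads` are hypothetical.

```python
def work_per_l(l):
    """Number of innermost (m, n, t) iterations for one value of l."""
    count = 0
    for m in range(-l, l + 1):
        for n in range(-l, l + 1):
            lo, hi = max(0, n - m), min(l + n, l - m)
            count += hi - lo + 1
    return count

def block_loads(lmax, num_spus):
    """Total work per SPU when contiguous blocks of l go to each SPU."""
    ls = list(range(lmax + 1))
    size = -(-len(ls) // num_spus)  # ceiling division
    return [sum(work_per_l(l) for l in ls[i * size:(i + 1) * size])
            for i in range(num_spus)]

# Work per SPU for Lmax = 20 split over 4 SPUs.
print(block_loads(20, 4))
```

The iteration count grows roughly with the cube of l, so the SPU holding the smallest l values finishes long before the others.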
Cell implementation
• Loop fusion:
    for l = 0, Lmax
      for m = -l, l
         for n = -l, l
            for t = max(0, n-m), min(l+n, l-m)
             ... calculations ...
• Unique index for combined loop:
    f(l, m) = l^2 + m + l
• Workload for each SPE:
    (Lmax + 1)^2 / (total # of SPEs)
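A short Python sketch of the fused index and of one way to hand out the (Lmax + 1)^2 fused iterations. The index formula is the one on the slide; the inverse mapping and the contiguous-chunk split are assumptions added for illustration, and the function names are hypothetical.

```python
import math

def fused_index(l, m):
    """Unique index of the (l, m) pair: f(l, m) = l^2 + m + l."""
    return l * l + m + l

def unfuse(f):
    """Inverse mapping: recover (l, m) from the fused index."""
    l = math.isqrt(f)  # l^2 <= f <= l^2 + 2l for every valid m
    return l, f - l * l - l

def spe_ranges(lmax, num_spes):
    """Split the (lmax + 1)^2 fused iterations into near-equal chunks."""
    total = (lmax + 1) ** 2
    base, extra = divmod(total, num_spes)
    ranges, start = [], 0
    for spe in range(num_spes):
        stop = start + base + (1 if spe < extra else 0)
        ranges.append((start, stop))  # half-open range [start, stop)
        start = stop
    return ranges

# Each SPE walks its own range and recovers (l, m) from the fused index.
for start, stop in spe_ranges(50, 8):
    first, last = unfuse(start), unfuse(stop - 1)
```

Since f(l, m) enumerates the pairs without gaps from 0 to (Lmax + 1)^2 - 1, a single loop over f replaces the two outer loops.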
Cell implementation
• Lookup table T for factorial
• Transform exponentials & multiplications into
  multiplications & additions, respectively.

$$d^l_{mnt}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \Big(\cos\frac{\beta}{2}\Big)^{(2l+n-m-2t)} \Big(\sin\frac{\beta}{2}\Big)^{(2t+m-n)}$$

$$= \exp\Big( \tfrac{1}{2}\big(T(l+n) + T(l-n) + T(l+m) + T(l-m)\big) - T(l+n-t) - T(l-m-t) - T(t+m-n) - T(t) + (2l+n-m-2t)\,\log\Big(\cos\frac{\beta}{2}\Big) + (2t+m-n)\,\log\Big(\sin\frac{\beta}{2}\Big) \Big)$$
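The table-based evaluation can be sketched in Python as below, taking T(k) = log(k!), which is what the exp(...) form implies. The function names are hypothetical, and the sketch assumes 0 < β < π so that both logarithms are defined. A direct evaluation with factorials is included only to check the two forms against each other at small l.

```python
import math

def log_factorial_table(kmax):
    """T[k] = log(k!) for k = 0..kmax, built with additions only."""
    table = [0.0] * (kmax + 1)
    for k in range(1, kmax + 1):
        table[k] = table[k - 1] + math.log(k)
    return table

def d_term_log(T, l, m, n, t, beta):
    """d^l_{mnt}(beta) as exp of a sum of table entries and two logs."""
    log_cos = math.log(math.cos(beta / 2.0))
    log_sin = math.log(math.sin(beta / 2.0))
    return math.exp(
        0.5 * (T[l + n] + T[l - n] + T[l + m] + T[l - m])
        - T[l + n - t] - T[l - m - t] - T[t + m - n] - T[t]
        + (2 * l + n - m - 2 * t) * log_cos
        + (2 * t + m - n) * log_sin
    )

def d_term_direct(l, m, n, t, beta):
    """Same term from factorials and powers, for comparison at small l."""
    f = math.factorial
    num = math.sqrt(f(l + n) * f(l - n) * f(l + m) * f(l - m))
    den = f(l + n - t) * f(l - m - t) * f(t + m - n) * f(t)
    return (num / den
            * math.cos(beta / 2.0) ** (2 * l + n - m - 2 * t)
            * math.sin(beta / 2.0) ** (2 * t + m - n))

T = log_factorial_table(100)  # covers arguments up to 2 * Lmax for Lmax = 50
print(d_term_log(T, 5, 1, 2, 2, 0.9), d_term_direct(5, 1, 2, 2, 0.9))
```

Working with sums of logarithms keeps intermediate values small, whereas a product such as (l+n)!(l+m)! exceeds the double-precision range for l near 50.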
Cell implementation
• Other optimizations specific to Cell:
    • Vectorization & data alignment
    • DMA data transfer between main memory &
      local store
    • SPU decrementer
Cell implementation
• Single precision vs. double precision: all data in single precision
Cell implementation
• Single precision vs. double precision: partial data in double precision
Cell implementation
• Single precision vs. double precision: all critical data in double precision
Performance analysis
   [Figure: Performance of one rotation on Cell BE. x-axis: Number of SPEs (1, 2, 4, 8, 16); y-axis: Time (seconds), 0 to 1.8]
Performance analysis
   [Figure: Performance of finding the shortest distance at Level 3 on Cell BE. x-axis: Number of SPEs (4, 8, 12, 16); y-axis: Time (seconds), 0 to 7000; series: GNU gcc, IBM xlc]
Conclusion
• Performance increases dramatically on Cell due to
  its unique architecture and algorithm optimization.
• Care must be taken with data placement because the
  local store is limited.
• Care must also be taken with data transfer
  between local store and main memory.
The End




          Questions?

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 

Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor

  • 1. Implementing 3D SPHARM Surfaces Registration on Cell Processor. Huian Li (huili@indiana.edu), Mi Yan (miyan@us.ibm.com), Robert Henschel (rhensche@indiana.edu), Li Shen (shenli@iupui.edu). July 29, 2009
  • 2. Contents • SPHARM registration • Matlab implementation • Cell implementation • Performance Analysis • Conclusion
  • 3. SPHARM Surfaces • Radial and stellar surfaces • Simply connected, arbitrarily shaped • Vision, graphics, imaging, bioinformatics
  • 4. SPHARM Expansion • Area-preserving mapping between spherical coordinates (θ, φ) and surface points (x, y, z)
  • 5. SHREC • [Figure: (a) template, (b) object, (c) after ICP, (d) after registration of parameterization]
  • 6. Calculation of coefficients • After rotating the parameter net on the surface in Euler angles (α, β, γ), new coefficients will be:
    $$c_l^m(\alpha\beta\gamma) = \sum_{n=-l}^{l} D^l_{mn}(\alpha\beta\gamma)\, c_l^n$$
    where
    $$D^l_{mn}(\alpha\beta\gamma) = e^{-i(\alpha m + \gamma n)} \sum_{t=\max(0,\,n-m)}^{\min(l+n,\,l-m)} (-1)^t\, d^l_{mnt}(\beta)$$
    and
    $$d^l_{mnt}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \left(\cos\frac{\beta}{2}\right)^{2l+n-m-2t} \left(\sin\frac{\beta}{2}\right)^{2t+m-n}$$
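    A minimal Python sketch of the rotation of one degree l may help make the three formulas concrete. It assumes the standard Wigner-d summation limits, t from max(0, n-m) to min(l+n, l-m), which keep every factorial argument non-negative, and it assumes the phase factor pairs α with m and γ with n. The names `wigner_d`, `wigner_D`, and `rotate_degree` are illustrative and do not come from the slides.

```python
import cmath
import math

def wigner_d(l, m, n, beta):
    """Sum over t of (-1)^t * d^l_{mnt}(beta), using direct factorials.

    Only practical for small l: the factorial products grow very quickly.
    """
    pref = math.sqrt(math.factorial(l + n) * math.factorial(l - n)
                     * math.factorial(l + m) * math.factorial(l - m))
    c, s = math.cos(beta / 2), math.sin(beta / 2)
    total = 0.0
    # Limits chosen so that every factorial argument below is non-negative.
    for t in range(max(0, n - m), min(l + n, l - m) + 1):
        denom = (math.factorial(l + n - t) * math.factorial(l - m - t)
                 * math.factorial(t + m - n) * math.factorial(t))
        total += ((-1) ** t * pref / denom
                  * c ** (2 * l + n - m - 2 * t) * s ** (2 * t + m - n))
    return total

def wigner_D(l, m, n, alpha, beta, gamma):
    """Phase factor times the t-sum (angle/index pairing is an assumption)."""
    return cmath.exp(-1j * (alpha * m + gamma * n)) * wigner_d(l, m, n, beta)

def rotate_degree(coeffs, l, alpha, beta, gamma):
    """coeffs[n + l] holds c_l^n for n = -l..l; returns the rotated c_l^m."""
    return [sum(wigner_D(l, m, n, alpha, beta, gamma) * coeffs[n + l]
                for n in range(-l, l + 1))
            for m in range(-l, l + 1)]

# Example: rotate three made-up degree-1 coefficients.
c1 = [1 + 2j, 0.5, -1j]
rotated = rotate_degree(c1, 1, 0.3, 0.7, 1.1)
print(rotated)
```

    Because the rotation matrix is unitary, the rotated coefficients keep the same total squared magnitude as the input, which is a convenient sanity check.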
  • 7. RMSD • RMSD (Root Mean Square Distance): distance between two SPHARM models
    $$\mathrm{RMSD} = \sqrt{\frac{1}{4\pi} \sum_{l=0}^{L_{\max}} \sum_{m=-l}^{l} \left\| c_{1,l}^m - c_{2,l}^m \right\|^2}$$
    where $c_{1,l}^m$ and $c_{2,l}^m$ are coefficients of two SPHARM models
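    The RMSD formula can be sketched in a few lines of Python. The storage layout is an assumption made for illustration: each model is a dict mapping (l, m) to a tuple of three complex numbers (the x, y, z components of the coefficient), and the function name `spharm_rmsd` is hypothetical.

```python
import math

def spharm_rmsd(model1, model2):
    """RMSD between two SPHARM models stored as {(l, m): (cx, cy, cz)}."""
    total = 0.0
    for key, v1 in model1.items():
        v2 = model2[key]
        # Squared Euclidean norm of the coefficient difference.
        total += sum(abs(a - b) ** 2 for a, b in zip(v1, v2))
    return math.sqrt(total / (4 * math.pi))

# Two made-up models that differ in a single component.
a = {(0, 0): (1 + 0j, 0j, 0j), (1, 0): (0j, 2j, 0j)}
b = {(0, 0): (0j, 0j, 0j), (1, 0): (0j, 2j, 0j)}
print(spharm_rmsd(a, b))
```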
  • 8. Matlab implementation • A straightforward implementation in Matlab:
        for l = 0, Lmax
          for m = -l, l
            for n = -l, l
              for t = max(0, n-m), min(l+m, l-n)
                ... performing calculations ...
    • One rotation for Lmax = 50 took 823 seconds on a 2 GHz quad-core Intel Xeon E5335
  • 10. Cell implementation • Domain decomposition:
        for l = 0, Lmax
          for m = -l, l
            for n = -l, l
              for t = max(0, n-m), min(l+m, l-n)
                ... calculations ...
    • Decomposition along l leads to workload imbalance among SPUs
    • Decomposition along m creates unnecessary data communication
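    The imbalance from splitting along l can be illustrated by counting the (m, n) pairs each degree contributes, (2l+1)^2. The sketch below uses that count as a rough proxy for work (it ignores the inner t loop) and assumes, purely for illustration, that degrees are handed out in contiguous blocks; `block_loads` is a hypothetical helper, not the authors' scheme.

```python
def work_per_degree(l):
    """Number of (m, n) pairs visited for one degree l."""
    return (2 * l + 1) ** 2

def block_loads(lmax, num_spes):
    """Split degrees 0..lmax into contiguous blocks; return work per block."""
    degrees = list(range(lmax + 1))
    size = -(-len(degrees) // num_spes)  # ceiling division
    return [sum(work_per_degree(l) for l in degrees[i:i + size])
            for i in range(0, len(degrees), size)]

loads = block_loads(50, 8)
print(loads)                    # work assigned to each block
print(max(loads) / min(loads))  # ratio between heaviest and lightest block
```

    Blocks holding the high degrees carry far more pairs than the block holding l = 0..6, so some SPEs would sit idle while others are still working.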
  • 11. Cell implementation • Loop fusion:
        for l = 0, Lmax
          for m = -l, l
            for n = -l, l
              for t = max(0, n-m), min(l+m, l-n)
                ... calculations ...
    • Unique index for combined loop: f(l, m) = l^2 + m + l
    • Workload for each SPE: (Lmax + 1)^2 / (total # of SPEs)
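    A small Python sketch of the fused index: f(l, m) = l^2 + m + l numbers the (l, m) pairs consecutively from 0 to (Lmax + 1)^2 - 1, so the fused range can be cut into near-equal chunks. The inverse mapping `unfuse` and the chunking helper `spe_range` are illustrative assumptions; the slides only give f and the per-SPE workload.

```python
import math

def fuse(l, m):
    """Unique index of the pair (l, m), with -l <= m <= l."""
    return l * l + m + l

def unfuse(k):
    """Recover (l, m) from a fused index k."""
    l = math.isqrt(k)          # degree l owns indices l^2 .. (l+1)^2 - 1
    return l, k - l * l - l

def spe_range(spe_id, num_spes, lmax):
    """Fused indices handled by one SPE (contiguous, near-equal chunks)."""
    total = (lmax + 1) ** 2
    chunk = -(-total // num_spes)  # ceiling division
    start = min(spe_id * chunk, total)
    return range(start, min(start + chunk, total))

# Each SPE walks its own slice of the fused (l, m) loop.
for k in spe_range(0, 16, 50):
    l, m = unfuse(k)
    # ... loops over n and t for this (l, m) would go here ...
```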
  • 12. Cell implementation • Lookup table T for factorial, with T(k) = log(k!) • Transform exponentials & multiplications into multiplications & additions, respectively:
    $$d^l_{mnt}(\beta) = \frac{\sqrt{(l+n)!\,(l-n)!\,(l+m)!\,(l-m)!}}{(l+n-t)!\,(l-m-t)!\,(t+m-n)!\,t!} \left(\cos\frac{\beta}{2}\right)^{2l+n-m-2t} \left(\sin\frac{\beta}{2}\right)^{2t+m-n}$$
    $$= \exp\Big( \tfrac{1}{2}\big(T(l+n) + T(l-n) + T(l+m) + T(l-m)\big) - T(l+n-t) - T(l-m-t) - T(t+m-n) - T(t) + (2l+n-m-2t)\log\cos\frac{\beta}{2} + (2t+m-n)\log\sin\frac{\beta}{2} \Big)$$
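    The lookup-table idea can be sketched in Python as follows. The table stores log-factorials, so one term d^l_{mnt}(β) becomes a sum of table entries and two scaled logarithms, followed by a single exponential. The names `T` and `d_term_log` and the choice Lmax = 50 are assumptions for illustration, and the sketch requires 0 < β < π so that both logarithms are defined.

```python
import math

LMAX = 50

# T[k] = log(k!) for k = 0 .. 2*LMAX, built with running additions.
T = [0.0] * (2 * LMAX + 1)
for k in range(1, len(T)):
    T[k] = T[k - 1] + math.log(k)

def d_term_log(l, m, n, t, beta):
    """One term d^l_{mnt}(beta), evaluated in log space.

    The caller must keep t within max(0, n-m) .. min(l+n, l-m); a negative
    index would silently wrap around in a Python list.
    """
    log_c = math.log(math.cos(beta / 2))
    log_s = math.log(math.sin(beta / 2))
    return math.exp(0.5 * (T[l + n] + T[l - n] + T[l + m] + T[l - m])
                    - T[l + n - t] - T[l - m - t] - T[t + m - n] - T[t]
                    + (2 * l + n - m - 2 * t) * log_c
                    + (2 * t + m - n) * log_s)

print(d_term_log(50, 10, -5, 3, 0.9))
```

    Working with sums of logarithms also avoids forming the very large factorial products that appear at high degrees.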
  • 13. Cell implementation • Other aspects specific to Cell: • Vectorization & data alignment • DMA data transfer between main memory & local store • SPU decrementer
  • 14. Cell implementation • Single precision vs. double precision: all data in single precision
  • 15. Cell implementation • Single precision vs. double precision: partial data in double precision
  • 16. Cell implementation • Single precision vs. double precision: all critical data in double precision
  • 17. Performance analysis • [Figure: Performance of one rotation on Cell BE. Time (seconds) vs. number of SPEs (1, 2, 4, 8, 16)]
  • 18. Performance analysis • [Figure: Performance of finding the shortest distance at Level 3 on Cell BE. Time (seconds) vs. number of SPEs (4, 8, 12, 16), for GNU gcc and IBM xlc]
  • 19. Conclusion • Performance increases dramatically on Cell due to its unique architecture and algorithm optimization. • Care must be taken with data placement because the local store is limited. • Care must also be taken with data transfer between local store and main memory.
  • 20. The End Questions?