SlideShare a Scribd company logo
1 of 18
Download to read offline
Generic parallelization strategies for data assimilation


                    Nils van Velzen, Martin Verlaan

                                                      06-10-11
Outline


  ●
      OpenDA
  ●
      Coupling of model to DA method
  ●
      Parallelism of model and EnKF
  ●
      Parallelism for black box models
  ●
      Conclusions




             Generic parallelization strategies for data assimilation   06-10-11   2
OpenDA


•   What is OpenDA?
    – A generic toolbox for data-assimilation
    – Set of interfaces that define interactions between components
    – Library of data-assimilation algorithms
    – Open source


•   Why OpenDA?
    – More efficient than development for each application
    – Shared knowledge between applications
    – Development of algorithms with e.g. universities
    – Easier to test


                   Generic parallelization strategies for data assimilation   06-10-11   3
OpenDA


•   Object oriented design
     – Classes, software building blocks
    – Interface (set of functions suitable for all models, observations,
      etc)




                  Generic parallelization strategies for data assimilation   06-10-11   4
OpenDA


 ●
     Formal form of a model
       dx
          =M ( x(t ) , u(t) , p , w(t ))
       dt
 ●
     State of model instance x (t ) , u (t) , p , w(t )
 ●
     Instance state cannot be directly changed
     only through the methods like:
        GetState, AxpyState, Compute...
 ●
     Algorithm has no knowledge on model
     internals

                Generic parallelization strategies for data assimilation   06-10-11   5
OpenDA




         Generic parallelization strategies for data assimilation   06-10-11   6
EnKF semi parallel


      ●
          Only parallelize model steps:                                                     Semi parallel
           • Not scalable, often sufficient                                                Communication
                                                                                              volume

ξif (t k )= M (ξa (t k −1 ))+wi (t k )
                i
                                                                                            C m +n N
  f      1 f
x (t k )=     ξ (t )
         N i k
E (t k )= [ ξ1 (t k )− x (t k ) , ξ 2 (t k )− x (t k ) , ... , ξ N (t k )− x (t k ) ]            0
 f           f          f           f          f                 f          f


                                       ⋮
ξia (t k )=ξif (t k )+ K (t k )[ y (t k )− H (t k ) ξif (t k )+v i (t k )]                       nN


                           Generic parallelization strategies for data assimilation   06-10-11         7
EnKF semi parallel


  ●
      Lotos-euros air quality model




              Generic parallelization strategies for data assimilation   06-10-11   8
EnKF semi parallel


  ●
      Generic semi parallel due to OO concepts




             Generic parallelization strategies for data assimilationA   06-10-11   9
EnKF parallel


●
    Stochastic model time steps
ξif (t k )=M (ξa (t k−1 ))+wi (t k )
               i



●
    Mean of Ensemble
     f          1 f
    x (t k )=    ξ (t )
                N i k

●
    Update (for all ensemble members)
ξia (t k )=ξif (t k )+ K (t k )[ y (t k )− H (t k ) ξif (t k )+v i (t k )]


                           Generic parallelization strategies for data assimilation   06-10-11   10
EnKF parallel


Column wise distribution                                                                          Row wise distribution


Separate                Combined                                                                Separate          Combined


 nN                      0                                                                             nN          Cm
                                                                                                C m+
  p                                                                                                     p


 n log 2 ( p )           n log 2 ( p )                                                            0               0


 nN                     n N log 2 ( p−1)                                                         nN               0
    +n N log 2 ( p−1)
  p                                                                                               p




                                     Generic parallelization strategies for data assimilation          06-10-11         11
EnKF parallel


Column wise distribution                                                                          Row wise distribution


Separate                Combined                                                                Separate          Combined


 nN                      0                                                                             nN          Cm
                                                                                                C m+
  p                                                                                                     p


 n log 2 ( p )           n log 2 ( p )                                                            0               0


 nN                     n N log 2 ( p−1)                                                         nN               0
    +n N log 2 ( p−1)
  p                                                                                               p




                                     Generic parallelization strategies for data assimilation          06-10-11         12
EnKF parallel


●
    WAQUA shallow water model
●
 Comparison of parallelization
strategies for RRSQRT (Roest
et al.)




                  Generic parallelization strategies for data assimilation   06-10-11   13
EnKF parallel


•   Full parallel using OO concepts:
     – All filters run same code
     – State vectors are distributed
        (parallel vector)
     – Operations on parallel vectors
       by a parallel vector
       implementation
     – NO need to change the model and
       filter code
     – All complexities hidden in generic
       support layers/implementations


                  Generic parallelization strategies for data assimilation   06-10-11   14
Parallel computing with black box models


• Black box models
  – No change to model code
  – Data exchange using files
  – Note: disk can be slow/more data written than
    needed




            Generic parallelization strategies for data assimilation   06-10-11   15
Parallel computing with black box models


• Swan model for wind generated waves
  – Operational model + DA for the north sea
  – Black box model
  – 1 hour, 8 cpu's 10 min
     50% IO
  – EnKF implementation (?)
  – Parallelization of filter


            Generic parallelization strategies for data assimilation   06-10-11   16
Parallel computing with black box models


• Black box model is normal model
  – Semi parallel+full parallel
  – Note disk speed,
  – Use local disks:
     • Faster
     • No sequential
       bottleneck


            Generic parallelization strategies for data assimilation   06-10-11   17
Conclusions


• Generic parallelization stragegies due to object
  oriented programming concepts.
• Single filter implementation for sequential as
  parallel computing
• EnKF like algorithms need combination of
  parallel strategies
• Black box models and IO can be parallelized as
  well

              eneric parallelization strategies for data assimilation   06-10-11   18

More Related Content

What's hot

Asymptotic Notation and Complexity
Asymptotic Notation and ComplexityAsymptotic Notation and Complexity
Asymptotic Notation and ComplexityRajandeep Gill
 
Analysis of algorithn class 3
Analysis of algorithn class 3Analysis of algorithn class 3
Analysis of algorithn class 3Kumar
 
Time complexity (linear search vs binary search)
Time complexity (linear search vs binary search)Time complexity (linear search vs binary search)
Time complexity (linear search vs binary search)Kumar
 
Asymptotic notations
Asymptotic notationsAsymptotic notations
Asymptotic notationsEhtisham Ali
 
Renyis entropy
Renyis entropyRenyis entropy
Renyis entropywtyru1989
 
Asymptotic notations
Asymptotic notationsAsymptotic notations
Asymptotic notationsMamta Pandey
 
Asymptotic Notations
Asymptotic NotationsAsymptotic Notations
Asymptotic NotationsRishabh Soni
 
High-dimensional polytopes defined by oracles: algorithms, computations and a...
High-dimensional polytopes defined by oracles: algorithms, computations and a...High-dimensional polytopes defined by oracles: algorithms, computations and a...
High-dimensional polytopes defined by oracles: algorithms, computations and a...Vissarion Fisikopoulos
 
Basics & asymptotic notations
Basics & asymptotic notationsBasics & asymptotic notations
Basics & asymptotic notationsRajendran
 
Asymptotic analysis
Asymptotic analysisAsymptotic analysis
Asymptotic analysisSoujanya V
 
Aaex5 group2(中英夾雜)
Aaex5 group2(中英夾雜)Aaex5 group2(中英夾雜)
Aaex5 group2(中英夾雜)Shiang-Yun Yang
 
Aaex4 group2(中英夾雜)
Aaex4 group2(中英夾雜)Aaex4 group2(中英夾雜)
Aaex4 group2(中英夾雜)Shiang-Yun Yang
 

What's hot (20)

Asymptotic notation
Asymptotic notationAsymptotic notation
Asymptotic notation
 
Asymptotic Notation and Complexity
Asymptotic Notation and ComplexityAsymptotic Notation and Complexity
Asymptotic Notation and Complexity
 
Analysis of algorithn class 3
Analysis of algorithn class 3Analysis of algorithn class 3
Analysis of algorithn class 3
 
Time complexity
Time complexityTime complexity
Time complexity
 
Time complexity (linear search vs binary search)
Time complexity (linear search vs binary search)Time complexity (linear search vs binary search)
Time complexity (linear search vs binary search)
 
Asymptotic notation
Asymptotic notationAsymptotic notation
Asymptotic notation
 
Algorithum Analysis
Algorithum AnalysisAlgorithum Analysis
Algorithum Analysis
 
Asymptotic notations
Asymptotic notationsAsymptotic notations
Asymptotic notations
 
Renyis entropy
Renyis entropyRenyis entropy
Renyis entropy
 
Asymptotic notations
Asymptotic notationsAsymptotic notations
Asymptotic notations
 
Asymptotic Notations
Asymptotic NotationsAsymptotic Notations
Asymptotic Notations
 
Algo complexity
Algo complexityAlgo complexity
Algo complexity
 
High-dimensional polytopes defined by oracles: algorithms, computations and a...
High-dimensional polytopes defined by oracles: algorithms, computations and a...High-dimensional polytopes defined by oracles: algorithms, computations and a...
High-dimensional polytopes defined by oracles: algorithms, computations and a...
 
Table 7a,7b kft 131
Table 7a,7b  kft 131Table 7a,7b  kft 131
Table 7a,7b kft 131
 
Basics & asymptotic notations
Basics & asymptotic notationsBasics & asymptotic notations
Basics & asymptotic notations
 
Asymptotic analysis
Asymptotic analysisAsymptotic analysis
Asymptotic analysis
 
Computational Complexity
Computational ComplexityComputational Complexity
Computational Complexity
 
Aaex5 group2(中英夾雜)
Aaex5 group2(中英夾雜)Aaex5 group2(中英夾雜)
Aaex5 group2(中英夾雜)
 
Aaex4 group2(中英夾雜)
Aaex4 group2(中英夾雜)Aaex4 group2(中英夾雜)
Aaex4 group2(中英夾雜)
 
Asymptotic Notation
Asymptotic NotationAsymptotic Notation
Asymptotic Notation
 

Similar to Generic parallelization strategies for data assimilation

Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...
Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...
Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...Tuong Do
 
Dsp U Lec09 Iir Filter Design
Dsp U   Lec09 Iir Filter DesignDsp U   Lec09 Iir Filter Design
Dsp U Lec09 Iir Filter Designtaha25
 
Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...zukun
 
Correlative level coding
Correlative level codingCorrelative level coding
Correlative level codingsrkrishna341
 
Dsp U Lec10 DFT And FFT
Dsp U   Lec10  DFT And  FFTDsp U   Lec10  DFT And  FFT
Dsp U Lec10 DFT And FFTtaha25
 
Chapter 9 computation of the dft
Chapter 9 computation of the dftChapter 9 computation of the dft
Chapter 9 computation of the dftmikeproud
 
Modeling Metacommunties
Modeling MetacommuntiesModeling Metacommunties
Modeling MetacommuntiesDistribEcology
 
Case Study (All)
Case Study (All)Case Study (All)
Case Study (All)gudeyi
 
Class 25: Reversing Reverse
Class 25: Reversing ReverseClass 25: Reversing Reverse
Class 25: Reversing ReverseDavid Evans
 
lecture01_lecture01_lecture0001_ceva.pdf
lecture01_lecture01_lecture0001_ceva.pdflecture01_lecture01_lecture0001_ceva.pdf
lecture01_lecture01_lecture0001_ceva.pdfAnaNeacsu5
 
Dynamic Compression Transmission for Energy Harvesting Networks
Dynamic Compression Transmission for Energy Harvesting NetworksDynamic Compression Transmission for Energy Harvesting Networks
Dynamic Compression Transmission for Energy Harvesting NetworksCristiano Tapparello
 
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdfKernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdfgrssieee
 
Large-scale computation without sacrificing expressiveness
Large-scale computation without sacrificing expressivenessLarge-scale computation without sacrificing expressiveness
Large-scale computation without sacrificing expressivenessSangjin Han
 
How to design a linear control system
How to design a linear control systemHow to design a linear control system
How to design a linear control systemAlireza Mirzaei
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorPTIHPA
 
Parallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-JoinsParallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-JoinsJonny Daenen
 
Statistical Analysis of Neural Coding
Statistical Analysis of Neural CodingStatistical Analysis of Neural Coding
Statistical Analysis of Neural CodingYifei Shea, Ph.D.
 
1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptxpallavidhade2
 

Similar to Generic parallelization strategies for data assimilation (20)

Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...
Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...
Solar Cells Lecture 3: Modeling and Simulation of Photovoltaic Devices and Sy...
 
Dsp U Lec09 Iir Filter Design
Dsp U   Lec09 Iir Filter DesignDsp U   Lec09 Iir Filter Design
Dsp U Lec09 Iir Filter Design
 
Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...Principal component analysis and matrix factorizations for learning (part 1) ...
Principal component analysis and matrix factorizations for learning (part 1) ...
 
Correlative level coding
Correlative level codingCorrelative level coding
Correlative level coding
 
Dsp U Lec10 DFT And FFT
Dsp U   Lec10  DFT And  FFTDsp U   Lec10  DFT And  FFT
Dsp U Lec10 DFT And FFT
 
Chapter 9 computation of the dft
Chapter 9 computation of the dftChapter 9 computation of the dft
Chapter 9 computation of the dft
 
Modeling Metacommunties
Modeling MetacommuntiesModeling Metacommunties
Modeling Metacommunties
 
Chapter5 system analysis
Chapter5 system analysisChapter5 system analysis
Chapter5 system analysis
 
Case Study (All)
Case Study (All)Case Study (All)
Case Study (All)
 
Class 25: Reversing Reverse
Class 25: Reversing ReverseClass 25: Reversing Reverse
Class 25: Reversing Reverse
 
Dsp3
Dsp3Dsp3
Dsp3
 
lecture01_lecture01_lecture0001_ceva.pdf
lecture01_lecture01_lecture0001_ceva.pdflecture01_lecture01_lecture0001_ceva.pdf
lecture01_lecture01_lecture0001_ceva.pdf
 
Dynamic Compression Transmission for Energy Harvesting Networks
Dynamic Compression Transmission for Energy Harvesting NetworksDynamic Compression Transmission for Energy Harvesting Networks
Dynamic Compression Transmission for Energy Harvesting Networks
 
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdfKernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
Kernel Entropy Component Analysis in Remote Sensing Data Clustering.pdf
 
Large-scale computation without sacrificing expressiveness
Large-scale computation without sacrificing expressivenessLarge-scale computation without sacrificing expressiveness
Large-scale computation without sacrificing expressiveness
 
How to design a linear control system
How to design a linear control systemHow to design a linear control system
How to design a linear control system
 
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. ProcessorImplementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
Implementing 3D SPHARM Surfaces Registration on Cell B.E. Processor
 
Parallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-JoinsParallel Evaluation of Multi-Semi-Joins
Parallel Evaluation of Multi-Semi-Joins
 
Statistical Analysis of Neural Coding
Statistical Analysis of Neural CodingStatistical Analysis of Neural Coding
Statistical Analysis of Neural Coding
 
1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx1_Asymptotic_Notation_pptx.pptx
1_Asymptotic_Notation_pptx.pptx
 

Generic parallelization strategies for data assimilation

  • 1. Generic parallelization strategies for data assimilation Nils van Velzen, Martin Verlaan 06-10-11
  • 2. Outline ● OpenDA ● Coupling of model to DA method ● Parallelism of model and EnKF ● Parallelism for black box models ● Conclusions Generic parallelization strategies for data assimilation 06-10-11 2
  • 3. OpenDA • What is OpenDA? – A generic toolbox for data-assimilation – Set of interfaces that define interactions between components – Library of data-assimilation algorithms – Open source • Why OpenDA? – More efficient than development for each application – Shared knowledge between applications – Development of algorithms with e.g. universities – Easier to test Generic parallelization strategies for data assimilation 06-10-11 3
  • 4. OpenDA • Object oriented design – Classes, software building blocks – Interface (set of functions suitable for all models, observations, etc) Generic parallelization strategies for data assimilation 06-10-11 4
  • 5. OpenDA ● Formal form of a model dx =M ( x(t ) , u(t) , p , w(t )) dt ● State of model instance x (t ) , u (t) , p , w(t ) ● Instance state cannot be directly changed only through the methods like: GetState, AxpyState, Compute... ● Algorithm has no knowledge on model internals Generic parallelization strategies for data assimilation 06-10-11 5
  • 6. OpenDA Generic parallelization strategies for data assimilation 06-10-11 6
  • 7. EnKF semi parallel ● Only parallelize model steps: Semi parallel • Not scalable, often sufficient Communication volume ξif (t k )= M (ξa (t k −1 ))+wi (t k ) i C m +n N f 1 f x (t k )= ξ (t ) N i k E (t k )= [ ξ1 (t k )− x (t k ) , ξ 2 (t k )− x (t k ) , ... , ξ N (t k )− x (t k ) ] 0 f f f f f f f ⋮ ξia (t k )=ξif (t k )+ K (t k )[ y (t k )− H (t k ) ξif (t k )+v i (t k )] nN Generic parallelization strategies for data assimilation 06-10-11 7
  • 8. EnKF semi parallel ● Lotos-euros air quality model Generic parallelization strategies for data assimilation 06-10-11 8
  • 9. EnKF semi parallel ● Generic semi parallel due to OO concepts Generic parallelization strategies for data assimilationA 06-10-11 9
  • 10. EnKF parallel ● Stochastic model time steps ξif (t k )=M (ξa (t k−1 ))+wi (t k ) i ● Mean of Ensemble f 1 f x (t k )= ξ (t ) N i k ● Update (for all ensemble members) ξia (t k )=ξif (t k )+ K (t k )[ y (t k )− H (t k ) ξif (t k )+v i (t k )] Generic parallelization strategies for data assimilation 06-10-11 10
  • 11. EnKF parallel Column wise distribution Row wise distribution Separate Combined Separate Combined nN 0 nN Cm C m+ p p n log 2 ( p ) n log 2 ( p ) 0 0 nN n N log 2 ( p−1) nN 0 +n N log 2 ( p−1) p p Generic parallelization strategies for data assimilation 06-10-11 11
  • 12. EnKF parallel Column wise distribution Row wise distribution Separate Combined Separate Combined nN 0 nN Cm C m+ p p n log 2 ( p ) n log 2 ( p ) 0 0 nN n N log 2 ( p−1) nN 0 +n N log 2 ( p−1) p p Generic parallelization strategies for data assimilation 06-10-11 12
  • 13. EnKF parallel ● WAQUA shallow water model ● Comparison of parallelization strategies for RRSQRT (Roest et al.) Generic parallelization strategies for data assimilation 06-10-11 13
  • 14. EnKF parallel • Full parallel using OO concepts: – All filters run same code – State vectors are distributed (parallel vector) – Operations on parallel vectors by a parallel vector implementation – NO need to change the model and filter code – All complexities hidden in generic support layers/implementations Generic parallelization strategies for data assimilation 06-10-11 14
  • 15. Parallel computing with black box models • Black box models – No change to model code – Data exchange using files – Note: disk can be slow/more data written than needed Generic parallelization strategies for data assimilation 06-10-11 15
  • 16. Parallel computing with black box models • Swan model for wind generated waves – Operational model + DA for the north sea – Black box model – 1 hour, 8 cpu's 10 min 50% IO – EnKF implementation (?) – Parallelization of filter Generic parallelization strategies for data assimilation 06-10-11 16
  • 17. Parallel computing with black box models • Black box model is normal model – Semi parallel+full parallel – Note disk speed, – Use local disks: • Faster • No sequential bottleneck Generic parallelization strategies for data assimilation 06-10-11 17
  • 18. Conclusions • Generic parallelization stragegies due to object oriented programming concepts. • Single filter implementation for sequential as parallel computing • EnKF like algorithms need combination of parallel strategies • Black box models and IO can be parallelized as well eneric parallelization strategies for data assimilation 06-10-11 18