
DSD-INT 2018 Algorithmic Differentiation - Markus


Presentation by Arjen Markus (Deltares) at the Data Science Symposium 2018, during Delft Software Days - Edition 2018. Thursday 15 November 2018, Delft.


  1. Enabling technologies: Algorithmic Differentiation. Arjen Markus, November 15, 2018.
  2. Purpose of the experiment. "Algorithmic differentiation" is a venerable technique, but quite intrusive if done manually. However, the toolkit by NAG makes it easy to apply to C++ and Fortran programs: it is built into the compiler(s), so very few changes to the source code are needed. This presentation is about our experiences with the tool.
  3. Not only smooth problems ... BLOOM. To get acquainted with the method we chose a module from our water quality model that is not too extensive and not too trivial – BLOOM. BLOOM uses an optimisation algorithm based on linear programming: how much algae of various species can exist given light, nutrients, ...
  4. Algorithmic differentiation (1). In many applications of our numerical models we need to answer questions such as these. Sensitivity analysis: which parameters influence the outcome the most? Calibration: how can we get as close as possible to the measurements? Data assimilation: how can we use the limited observations we have to estimate a good starting point? ...
  5. Algorithmic differentiation (2). To answer such questions we have many (mathematical) methods at our disposal. Some, however, are fairly naïve. To determine the sensitivity of the outcome to the parameters we can use the following method: try different values of the relevant parameters and determine the difference with the "nominal" result. (Figure: the outcome for an increasing parameter value.) A minimal sketch of this brute-force approach is given below.
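A minimal sketch of the brute-force sensitivity method described on the slide above, written as plain finite differences. The model routine, the parameter values and the perturbation size are hypothetical stand-ins, not part of the presentation:

    ! Finite-difference sensitivity: perturb each parameter in turn and
    ! compare the outcome with the nominal run. run_model is a placeholder
    ! for the actual numerical model.
    program fd_sensitivity
        implicit none
        integer, parameter :: nparam = 5
        real, parameter    :: delta  = 1.0e-3
        real    :: param(nparam), sensitivity(nparam), nominal, perturbed
        integer :: i

        param   = [1.0, 0.5, 2.0, 0.1, 7.8]        ! nominal parameter values
        nominal = run_model( param )

        do i = 1,nparam
            param(i)       = param(i) + delta
            perturbed      = run_model( param )
            sensitivity(i) = (perturbed - nominal) / delta
            param(i)       = param(i) - delta       ! restore the nominal value
        enddo

        print '(a,5f10.4)', 'Sensitivities: ', sensitivity
    contains
        real function run_model( p )
            real, intent(in) :: p(:)
            ! Placeholder model so the sketch runs standalone
            run_model = p(1) * exp(-p(2)) + p(3)**2 + p(4) * p(5)
        end function run_model
    end program fd_sensitivity

Note that this requires one extra model run per parameter, which is what makes the approach expensive when there are many parameters.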
  6. Algorithmic differentiation (3). This works fine – if you have a small number of parameters. It also assumes the response is more or less linear. Two alternatives: the tangent linear method (∂U/∂x, ∂U/∂y, ∂U/∂z, ...) and the adjoint method (∂x/∂U, ∂y/∂U, ∂z/∂U, ...). The number of parameters can be large indeed – think of the number of grid cells in the case of data assimilation.
  7. How it works – the tangent linear method. Some technical details for the tangent linear method: a special compiler takes the source code and transforms it. Each arithmetic operation now calculates the value and the derivative: w = x·y becomes (w, dw/dz) = (x·y, x·dy/dz + y·dx/dz). Each mathematical function does the same. The result is the precise derivative. A hand-rolled illustration is given below.
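The same bookkeeping can be illustrated with a hand-written "dual number" type that carries a value and a derivative through every operation. This is only meant to show the principle; it is not the representation the NAG tool actually uses:

    ! Forward-mode (tangent linear) AD by hand: each quantity carries its
    ! derivative along, and every operation applies the chain rule.
    module dual_mod
        implicit none
        type :: dual
            real :: v     ! value
            real :: d     ! derivative with respect to the chosen input
        end type dual
    contains
        function dual_mul( x, y ) result( w )
            type(dual), intent(in) :: x, y
            type(dual)             :: w
            w%v = x%v * y%v
            w%d = x%v * y%d + y%v * x%d    ! product rule, as on the slide
        end function dual_mul
    end module dual_mod

    program tangent_demo
        use dual_mod
        implicit none
        type(dual) :: x, y, w

        x = dual( 3.0, 1.0 )    ! seed: differentiate with respect to x
        y = dual( 4.0, 0.0 )
        w = dual_mul( x, y )
        print *, 'w =', w%v, '  dw/dx =', w%d    ! gives 12.0 and 4.0
    end program tangent_demo

The AD compiler effectively performs this transformation on the whole program, including the intrinsic mathematical functions.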
  8. How it works – the adjoint method. For the adjoint method, things are slightly more complicated: a special compiler takes the source code and transforms it so that all operations are stored – the so-called tape. The calculation is done both forward (normal) and backward to calculate the derivative. Advantage: much faster, and you get a direct answer as to how to change the parameters. A hand-worked miniature of such a reverse sweep is given below.
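To make the forward/backward idea concrete, here is a tiny adjoint calculation written out by hand for z = sin(x·y). A real tool records the operations on the tape automatically; this only shows the principle:

    ! Reverse-mode (adjoint) AD by hand for z = sin(x*y): run the forward
    ! sweep, then propagate the adjoint of the output back to the inputs.
    program adjoint_demo
        implicit none
        real :: x, y, w, z            ! values
        real :: x_a, y_a, w_a, z_a    ! adjoints

        ! Forward sweep (intermediate values are kept, as on a tape)
        x = 3.0
        y = 4.0
        w = x * y
        z = sin( w )

        ! Reverse sweep: seed the output adjoint, walk the operations backwards
        z_a = 1.0
        w_a = cos( w ) * z_a     ! adjoint of z = sin(w)
        x_a = y * w_a            ! adjoint of w = x*y with respect to x
        y_a = x * w_a            ! ... and with respect to y

        print *, 'dz/dx =', x_a, '  dz/dy =', y_a
    end program adjoint_demo

One backward sweep yields the derivative of the single output with respect to all inputs at once, which is why this mode pays off when there are many parameters.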
  9. The tools from NAG. NAG, the Numerical Algorithms Group, offers: HPC consulting and services, software services in general, an extensive library with numerical algorithms, a Fortran compiler that strictly checks for conformance, and the AD compiler for C++ and Fortran.
  10. Cooperation with NAG. Given their experience with numerical algorithms and high-performance computing, and their extensive library and other tools, NAG is an interesting party. This experiment was a first opportunity to cooperate closely with them. For the purpose of this presentation, I will focus on two simpler applications. The BLOOM experiment showed that smoothness is not a prerequisite.
  11. Linear programming – BLOOM. Constraints: x + 0.4y ≤ 10 and x + 1.8y ≤ 5. Optimise: x + y. The result: the optimum depends on several parameters; determine the Jacobian matrix to identify them. This information is specific to the solution that is found. (Figure: the constraints and the optimum in the x-y plane.) A brute-force check of this small optimisation problem is given below.
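As a quick sanity check of the small problem above, the candidate corner points of the feasible region can simply be enumerated. The assumptions here (maximisation, x ≥ 0 and y ≥ 0) are mine; the coefficients are taken from the slide as transcribed:

    ! Brute-force check of the small LP: maximise x + y subject to
    ! x + 0.4y <= 10 and x + 1.8y <= 5, with x, y >= 0 assumed.
    program lp_demo
        implicit none
        real    :: cand(2,4), best(2), obj, best_obj
        integer :: i

        ! Corner candidates: the origin and the axis intercepts of the two
        ! constraints (their mutual intersection lies at negative y here).
        cand(:,1) = [  0.0, 0.0     ]
        cand(:,2) = [  5.0, 0.0     ]
        cand(:,3) = [  0.0, 5.0/1.8 ]
        cand(:,4) = [ 10.0, 0.0     ]

        best_obj = -huge(1.0)
        do i = 1,size(cand,2)
            if ( cand(1,i) + 0.4*cand(2,i) <= 10.0 .and. &
                 cand(1,i) + 1.8*cand(2,i) <=  5.0 ) then
                obj = cand(1,i) + cand(2,i)
                if ( obj > best_obj ) then
                    best_obj = obj
                    best     = cand(:,i)
                endif
            endif
        enddo
        print *, 'Optimum at (x,y) =', best, '  objective =', best_obj
    end program lp_demo

In BLOOM the same kind of problem determines how much algae of the various species can exist; the point of the experiment was that the AD tool can deliver the derivatives of such an optimum with respect to the parameters.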
  12. Simple example: Streeter-Phelps. The classical model of BOD and DO in a river: dBOD/dt = −k·BOD and dDO/dt = −k·BOD + r·(DOsat − DO)/H. Five parameters: the initial conditions for BOD and DO, the saturation concentration of DO, the decay rate of BOD and the reaeration rate of DO. A straightforward time integration of this model is given below.
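For reference, a plain explicit-Euler integration of these two equations. The initial conditions and rates below are the "true" values listed in parentheses on the result slide further on; the depth H and the time step are my own choices for the sketch:

    ! Explicit-Euler integration of the Streeter-Phelps model.
    program streeter_phelps
        implicit none
        real, parameter :: k      = 0.4     ! decay rate of BOD (1/d)
        real, parameter :: r      = 2.5     ! reaeration rate (1/d)
        real, parameter :: do_sat = 7.8     ! saturation concentration of DO (mg/l)
        real, parameter :: h      = 1.0     ! water depth (m) - assumption
        real, parameter :: dt     = 0.01    ! time step (d)   - assumption
        real    :: bod, bod_new, do_conc
        integer :: istep

        bod     = 10.0     ! initial BOD (mg/l)
        do_conc =  8.0     ! initial DO  (mg/l)

        do istep = 1,2000                                  ! 20 days
            bod_new = bod     + dt * ( -k * bod )
            do_conc = do_conc + dt * ( -k * bod + r * (do_sat - do_conc) / h )
            bod     = bod_new
            if ( mod(istep,100) == 0 ) print '(f6.2,2f10.3)', istep*dt, bod, do_conc
        enddo
    end program streeter_phelps

With the AD tool the derivatives of the computed concentrations with respect to these five parameters come out of the same run.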
  13. Simple example: Streeter-Phelps – the data. Artificial data with noise. (Figure: the noisy Oxygen and BOD time series used for the calibration.)
  14. Simple example: Streeter-Phelps – the error. Using a simple line search algorithm and the results of the AD tool: (Figure: the error per iteration, plotted on a logarithmic scale.) A sketch of such a line-search loop is given below.
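A sketch of what such a loop can look like: step along the negative gradient of the error and halve the step until the error decreases. The routine names, the starting values and the quadratic placeholder error are mine; in the experiment the error and its gradient come from the Streeter-Phelps model instrumented with the AD tool:

    ! Gradient descent with a simple backtracking line search.
    program line_search_demo
        implicit none
        integer, parameter :: nparam = 5
        real, parameter    :: truth(nparam) = [10.0, 8.0, 7.8, 0.4, 2.5]
        real    :: p(nparam), grad(nparam), p_trial(nparam)
        real    :: err, err_trial, step
        integer :: iter

        p = [ 5.0, 5.0, 5.0, 1.0, 1.0 ]            ! first guess of the parameters
        call error_and_gradient( p, err, grad )

        do iter = 1,40
            step = 1.0
            do                                      ! halve the step until the error drops
                p_trial   = p - step * grad
                err_trial = error_of( p_trial )
                if ( err_trial < err .or. step < 1.0e-6 ) exit
                step = 0.5 * step
            enddo
            p = p_trial
            call error_and_gradient( p, err, grad ) ! fresh gradient for the next step
            print *, 'iteration', iter, '  error', err
        enddo
    contains
        real function error_of( p )
            real, intent(in) :: p(:)
            ! Placeholder quadratic error so the sketch runs standalone
            error_of = sum( (p - truth)**2 )
        end function error_of

        subroutine error_and_gradient( p, err, grad )
            real, intent(in)  :: p(:)
            real, intent(out) :: err, grad(:)
            ! Placeholder gradient; in the experiment it is delivered by the AD tool
            err  = error_of( p )
            grad = 2.0 * (p - truth)
        end subroutine error_and_gradient
    end program line_search_demo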
  15. Simple example: Streeter-Phelps – the final result. Using a simple line search algorithm and the results of the AD tool: BOD initial 9.7 mg/l (10), DO initial 9.2 mg/l (8.0), saturation 7.9 mg/l (7.8), decay rate 0.5 d−1 (0.4), reaeration 2.0 d−1 (2.5). In parentheses: the values used to generate the artificial data. (Figure: data and model curves for Oxygen and BOD.)
  16. Backtracking an oil spill. The idea: we have observed an oil spill somewhere, and it was released some two days before. Can we trace it back to its origins? A form of inverse modelling!
  17. Backtracking an oil spill – set-up. Very simple grid – rectangular, constant flow. But: we seek the initial condition that gives us the following patch after two days. (Figure: the initial patch and the "observed" patch; in green a very rough estimate ...)
  18. Backtracking an oil spill – result. Use the adjoint gradient of the final result with respect to the initial condition: determine a new initial condition that will yield a result that is closer to the observation. Result: (Figure: the reconstructed initial condition as concentration classes from < 0.001 up to > 0.6.)
  19. A bit of source code ...

    do iteration = 1,100
        ! Update the initial condition
        conc_init = max(0.0, conc_init - conc_init_adjoint * 0.01)

        call dco_a1s_tape_create
        call dco_a1s_tape_register_variable( conc_init )

        ... calculate concentration over time
        ... calculate deviation from "observed" concentration

        ! Examine the adjoint results
        call dco_a1s_tape_register_output_variable( deviation )
        call dco_a1s_tape_switch_to_passive
        call dco_a1s_set( deviation, 1.0, -1 )
        call dco_a1s_tape_interpret_adjoint
        call dco_a1s_get( conc_init, conc_init_adjoint, -1 )

        call dco_a1s_tape_remove
    enddo
  20. Conclusions and recommendations. The technique of algorithmic differentiation is very promising (and actually well-established). The tool provided by NAG is easy to use, even if there are some complications you need to deal with. One immediate benefit is with OpenDA.
