Parameter estimation of distributed hydrological model using polynomial chaos expansion
1. Study Report
M2 - Putika Ashfar Khoiri
Water Engineering Laboratory
Department of Civil Engineering
April 26th, 2018
2. Contents Plan for Master’s Thesis
Master's Thesis Title: A Study of Parameter Estimation of a Distributed Hydrological
Model (DHM) to Increase Simulation Accuracy using Polynomial Chaos
Expansion (PCE)
Chapter 1 Introduction to research backgrounds,
objectives and methodology
Chapter 2 Data and Study Area
1. Conditions of Ibo River
2. Description of precipitation data used (AmeDAS and
X-RAIN)
Chapter 3 Parameter Optimization of Distributed
Hydrological Model
1. Outline of Distributed Hydrological Model
2. Parameter estimation of hydrological model
3. Selection of Distributed Hydrological Model
Parameters
4. PCE setup for DHM model
5. Calculation conditions and calculation period
Chapter 4 Parameter Optimization Results
1. Sensitivity analysis results
2. Parameter optimization results from
Polynomial Chaos Expansion (PCE)
3. Reproduced Calculation from PCE
4. Validation Results
Chapter 5 Results Analysis and Discussion
1. Period-dependence of optimal
parameter values
2. Spatial differences in optimal parameter
values
Chapter 6 Conclusion and
Recommendation
3. Background, objectives and methodology
Background: Because the distributed hydrological model is spatially heterogeneous,
determining its inputs and parameter settings is difficult
Input and parameters
𝑥 = [𝑥1, 𝑥2, …, 𝑥𝑛]
Distributed Hydrological Model
• Set up
• Calculation conditions
𝑦 = 𝑓(𝑥)
Simulation Discharge
𝑦 = [𝑦1, 𝑦2, …, 𝑦𝑛]
DATA ASSIMILATION
- Improve the model reproducibility by reducing the uncertainty in model inputs and model
parameters through data assimilation
- This is required to improve the forecasting ability of the model
Validation with
observed discharge
Results
improve
• Parameter optimization
4. Background, objectives and methodology
Research objective: Construct a parameter estimation system for the DHM using PCE and
evaluate its applicability in the Ibo River watershed.
Background
Three broad categories of data assimilation
1. Variational techniques (3D-Var, 4D-Var)
2. Monte Carlo based techniques (EnKF, particle filter, Markov chain Monte Carlo, etc.)
3. Emulator techniques (Polynomial Chaos Expansion (PCE))
Advantage of PCE
Less costly and more efficient than Monte Carlo methods, which depend on
random sampling
Disadvantage of PCE
Only two parameters can be optimized per calculation condition, so the
optimal values may vary with location and with calculation period.
The PCE approach is computationally cheaper than Monte Carlo, but it is rarely used
for parameter optimization in hydrological simulation
Example of Monte Carlo simulation using
contour plot of Nash-Sutcliffe efficiency response
(Khu and Werner, 2003)
5. Background, objectives and methodology
Methodology
List the DHM parameters and their ranges
Conduct the sensitivity analysis for
those parameters
Select the parameters
Put the parameters into PCE
Check the reproducibility of calculation
Result analysis
• observation discharge
points,
• peak of discharges
Based on each model efficiency
(RMSE, NSE, R2)
By spatial difference, calculation period difference
6. Selection of DHM Parameters
Candidate parameters to be estimated
Lower boundary value = 0.5 x original parameter value
Upper boundary value = 1.5 x original parameter value

Parameter name                                                                     Symbol  Unit                  Range             Original value
Ratio of rainfall inside the forest                                                α1      -                     0.46 - 1.215      0.81
Stemflow ratio                                                                     α2      -                     0.055 - 0.137     0.11
Max. storage depth of Tank I (canopy)                                              S1max   mm                    0.72 - 2.16       1.44
Max. storage depth of Tank II (trunk)                                              S2max   mm                    0.265 - 0.795     0.53
Storage constant of Tank III (forest land surface layer)                           K3      hr                    3.00 - 9.00       6.00
Storage constant of Tank IV (forest land lower layer)                              K4      mm^(23/25)·h^(2/25)   50.00 - 150.00    100.00
Storage power exponent of Tank III                                                 P3      -                     0.50 - 1.50       0.6
Storage power exponent of Tank IV                                                  P4      -                     0.04 - 0.12       0.08
Effective soil depth at 16% runoff contribution (D16 particle size distribution)   D16     mm                    5.00 - 15.00      10.00
Effective soil depth at 50% runoff contribution (D50 particle size distribution)   D50     mm                    25.00 - 75.00     50.00
Hydraulic conductivity of layer A                                                  k       cm/sec                0.15 - 0.45       0.30
Effective porosity of layer A                                                      γ       -                     0.10 - 0.30       0.20
Thickness of layer A                                                               D       mm                    100.00 - 300.00   200.00
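The boundary rule on this slide (0.5 x and 1.5 x the original value) and the one-parameter-at-a-time runs it implies can be sketched as below. This is only an illustration: `run_model` and `toy_score` are hypothetical stand-ins for a DHM run that returns a score such as RMSE, and the reference point inside `toy_score` is arbitrary, not a result of this study.

```python
from typing import Callable, Dict, Tuple


def boundary_values(original: float, lower_factor: float = 0.5,
                    upper_factor: float = 1.5) -> Tuple[float, float]:
    """Return (lower, upper) boundary values scaled from the original value."""
    return lower_factor * original, upper_factor * original


def one_at_a_time_runs(originals: Dict[str, float],
                       run_model: Callable[[Dict[str, float]], float]
                       ) -> Dict[str, Dict[str, float]]:
    """Move one parameter at a time to its lower/upper boundary and score the run.

    `run_model` is a hypothetical callable: it takes a full parameter set and
    returns one score (for example RMSE against observed discharge).
    """
    results = {}
    for name, value in originals.items():
        lower, upper = boundary_values(value)
        scores = {}
        for label, perturbed in (("lower", lower), ("upper", upper)):
            params = dict(originals)  # all other parameters keep their original values
            params[name] = perturbed
            scores[label] = run_model(params)
        results[name] = scores
    return results


def toy_score(params: Dict[str, float]) -> float:
    """Toy stand-in for a DHM run (NOT the real model); reference point is arbitrary."""
    return abs(params["K3"] - 7.0) + abs(params["S1max"] - 1.0)


originals = {"S1max": 1.44, "K3": 6.00}
sensitivity = one_at_a_time_runs(originals, toy_score)
for name, scores in sensitivity.items():
    print(name, scores)
```

Ranking the parameters by the score of their lower and upper runs then gives the kind of bar-chart comparison used in the sensitivity analysis slides.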
7. Selection of DHM Parameters (1)
Calculation conditions for initial calculation and sensitivity analysis

Model domain            Ibo River watershed in Hyogo
Grid resolution         2 km x 2 km (307 grids)
Spin-up calculation     Jan 1, 2015 to April 30, 2015
Calculation period (1)  May 1, 2015 to July 15, 2015
Rainfall data           AmeDAS, X-RAIN
Temperature data        JMA data (monthly data, 3 points)
Observation discharge   Kamigawara (上川原) station (hourly observation)
Time step (∆t)          0.0005

Statistical metrics used for model efficiency evaluation

Nash-Sutcliffe coefficient (NS):
NS = 1 - \frac{\sum_{i=1}^{n} (Q_{o,i} - Q_{s,i})^2}{\sum_{i=1}^{n} (Q_{o,i} - \bar{Q}_o)^2}

Root mean square error (RMSE):
RMSE = \sqrt{\frac{\sum_{i=1}^{n} (Q_{s,i} - Q_{o,i})^2}{n}}

Coefficient of determination (R2):
R^2 = \left[ \frac{\sum_{i=1}^{n} (Q_{o,i} - \bar{Q}_o)(Q_{s,i} - \bar{Q}_s)}{\sqrt{\sum_{i=1}^{n} (Q_{o,i} - \bar{Q}_o)^2} \sqrt{\sum_{i=1}^{n} (Q_{s,i} - \bar{Q}_s)^2}} \right]^2

Q_s = simulated discharge
Q_o = observed discharge
\bar{Q}_s, \bar{Q}_o = mean simulated and mean observed discharge

I used these performance criteria because they are commonly used to
evaluate runoff behaviour in hydrological models
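The three criteria named on this slide can be computed as in the sketch below. It assumes the standard definitions: NS as one minus the ratio of squared errors to the variance of the observations, and R2 as the squared Pearson correlation between observed and simulated discharge. The function names and sample values are illustrative only.

```python
import numpy as np


def _as_arrays(q_obs, q_sim):
    """Convert inputs to float arrays and check that they have the same shape."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_sim = np.asarray(q_sim, dtype=float)
    if q_obs.shape != q_sim.shape:
        raise ValueError("observed and simulated series must have the same shape")
    return q_obs, q_sim


def nash_sutcliffe(q_obs, q_sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    q_obs, q_sim = _as_arrays(q_obs, q_sim)
    return float(1.0 - np.sum((q_obs - q_sim) ** 2)
                 / np.sum((q_obs - q_obs.mean()) ** 2))


def rmse(q_obs, q_sim):
    """Root mean square error, in the same unit as the discharge."""
    q_obs, q_sim = _as_arrays(q_obs, q_sim)
    return float(np.sqrt(np.mean((q_sim - q_obs) ** 2)))


def r_squared(q_obs, q_sim):
    """Coefficient of determination as the squared Pearson correlation."""
    q_obs, q_sim = _as_arrays(q_obs, q_sim)
    d_obs = q_obs - q_obs.mean()
    d_sim = q_sim - q_sim.mean()
    r = np.sum(d_obs * d_sim) / (np.sqrt(np.sum(d_obs ** 2))
                                 * np.sqrt(np.sum(d_sim ** 2)))
    return float(r ** 2)


# Made-up series: the simulation has the right shape but a constant offset.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [2.0, 3.0, 4.0, 5.0]
print(nash_sutcliffe(observed, simulated), rmse(observed, simulated),
      r_squared(observed, simulated))
```

In this made-up case R2 stays at its maximum while NS and RMSE penalise the offset, which shows how the criteria can rank the same simulation differently.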
[Map: observation stations Kamigawara (上川原), Kamae (構), Tatsuno (龍野), Higashi Kurisu (東栗栖), Yamazaki (山崎), Shiono (塩野), Magari (曲里)]
8. Annual average data of AmeDAS and X-RAIN
[Map panels: X-RAIN, AmeDAS]
AmeDAS rainfall observation points: Wakasa (若桜), Sayo (佐用), Kamigori (上郡), Ichinomiya (一宮), Himeji (姫路)
Average distance weighting
X = annual average precipitation (mm/y)
w_1 = number of grids in area 1
x_1 = precipitation data in grid 1 (mm/y)
[Chart: X (mm/y) against data index]
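The weighting formula itself did not survive on this slide, so the sketch below is an assumption: it treats X as the weighted mean of the precipitation values x_i, with the grid counts w_i as weights, i.e. X = sum(w_i * x_i) / sum(w_i). The function name and the numbers are made up for illustration.

```python
import numpy as np


def weighted_annual_precipitation(grid_counts, precipitation):
    """Weighted mean X = sum(w_i * x_i) / sum(w_i).

    grid_counts   : w_i, number of grids in each area (assumed weights)
    precipitation : x_i, annual precipitation for each area in mm/y
    """
    w = np.asarray(grid_counts, dtype=float)
    x = np.asarray(precipitation, dtype=float)
    if w.shape != x.shape:
        raise ValueError("grid_counts and precipitation must have the same shape")
    if np.sum(w) <= 0:
        raise ValueError("the weights must sum to a positive number")
    return float(np.sum(w * x) / np.sum(w))


# Made-up example with two areas (not data from this study).
print(weighted_annual_precipitation([1, 3], [100.0, 200.0]))
```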
9. Selection of DHM Parameters (1)
Comparison of simulation results using X-RAIN and AmeDAS data at Kamigawara station without any
parameter changes
May 1, 2015 to July 15, 2015

        X-RAIN   AmeDAS
NSE     0.675    0.772
RMSE    13.373   15.961
R2      0.829    0.897

In terms of RMSE, the discharge simulated from X-RAIN data has a lower error than that from AmeDAS data,
although NSE and R2 are higher for AmeDAS. Because RMSE is the metric that will be used in PCE, I will use
X-RAIN data for the following sensitivity analysis.
10. Selection of DHM Parameters (1)
Sensitivity analysis results
In the sensitivity analysis using X-RAIN data at Kamigawara station, S1 gives the lowest RMSE and the highest
NSE and R2 among the upper-boundary runs, and K3 does so among the lower-boundary runs.
The result of this sensitivity analysis may change with the assimilation period and the observation
point.
May 1, 2015 to July 15, 2015
Kamigawara station
[Figure: bar charts of Nash-Sutcliffe efficiency, RMSE and coefficient of determination (R2) for each parameter (α1, α2, S1, S2, K3, K4, P3, P4, D16, D50, k, γ, D), with upper-boundary and lower-boundary series]
11. Selection of DHM Parameters (1)
Comparison of simulations using X-RAIN and AmeDAS data at Kamigawara station without any
parameter changes

June 20, 2015 to August 15, 2015 (largest discharge period)
        X-RAIN   AmeDAS
NSE     0.927    0.840
RMSE    22.661   23.137
R2      0.974    0.948

August 20, 2015 to Sept 20, 2015 (medium discharge period)
        X-RAIN   AmeDAS
NSE     0.495    0.521
RMSE    17.512   16.261
R2      0.720    0.782

Each efficiency criterion reflects the simulation result differently under high-flow and low-flow
conditions
12. Selection of DHM Parameters (1)
Sensitivity analysis results
June 20, 2015 to August 15, 2015
Kamigawara station
[Figure: bar charts of Nash-Sutcliffe efficiency, RMSE and coefficient of determination (R2) for each parameter (α1, α2, S1, S2, K3, K4, P3, P4, D16, D50, k, γ, D), with upper-boundary and lower-boundary series]
13. Selection of DHM Parameters (1)
Sensitivity analysis results
August 20, 2015 to Sept 20, 2015
Kamigawara station
[Figure: bar charts of Nash-Sutcliffe efficiency, RMSE and coefficient of determination (R2) for each parameter (α1, α2, S1, S2, K3, K4, P3, P4, D16, D50, k, γ, D), with upper-boundary and lower-boundary series]
14. PCE setup for parameter optimization (1)
Calculation period (1)  May 1, 2015 to July 15, 2015
Precipitation data      X-RAIN data
Observation discharge   Kamigawara (上川原) station (hourly observation)

Parameter name                 Range          Original value
Storage constant TANK 3 (K3)   1.5 to 12      6.00
Max. storage TANK 1 (S1)       0.36 to 2.88   1.44

[Figure: two RMSE plots]

Widening the parameter ranges strongly affects the polynomial interpolation
and also changes the layout of the quadrature points in parameter
space
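How the parameter ranges set the quadrature points, and how a polynomial surrogate of RMSE is then minimised, can be illustrated with the minimal two-parameter sketch below. Several things here are assumptions rather than the setup used in this study: the parameters are taken as uniformly distributed over their ranges, so Legendre polynomials and Gauss-Legendre quadrature are used; the polynomial order is chosen arbitrarily; and `toy_rmse` is a hypothetical stand-in for "run the DHM and compute RMSE" with an arbitrary minimum.

```python
import numpy as np
from numpy.polynomial import legendre


def to_physical(xi, lower, upper):
    """Map xi in [-1, 1] to the physical parameter range [lower, upper]."""
    return 0.5 * (upper + lower) + 0.5 * (upper - lower) * xi


def fit_pce(score, bounds, order):
    """Project `score` onto a tensor Legendre basis with Gauss-Legendre quadrature."""
    (lo1, hi1), (lo2, hi2) = bounds
    nodes, weights = legendre.leggauss(order + 1)
    # One model run per quadrature point; wider bounds spread these points out.
    values = np.array([[score(to_physical(a, lo1, hi1), to_physical(b, lo2, hi2))
                        for b in nodes] for a in nodes])
    basis = legendre.legvander(nodes, order)           # basis[a, i] = P_i(nodes[a])
    norms = 2.0 / (2.0 * np.arange(order + 1) + 1.0)   # integral of P_i^2 on [-1, 1]
    projected = np.einsum("a,b,ab,ai,bj->ij", weights, weights, values, basis, basis)
    return projected / np.outer(norms, norms)


def minimise_surrogate(coeffs, bounds, n_grid=201):
    """Evaluate the surrogate on a regular grid and return its minimum."""
    (lo1, hi1), (lo2, hi2) = bounds
    xi = np.linspace(-1.0, 1.0, n_grid)
    g1, g2 = np.meshgrid(xi, xi, indexing="ij")
    surface = legendre.legval2d(g1, g2, coeffs)
    i, j = np.unravel_index(np.argmin(surface), surface.shape)
    return (to_physical(xi[i], lo1, hi1), to_physical(xi[j], lo2, hi2),
            float(surface[i, j]))


def toy_rmse(k3, s1):
    """Toy response surface (NOT the DHM); its minimum at (7.0, 1.0) is arbitrary."""
    return 10.0 + 0.5 * (k3 - 7.0) ** 2 + 2.0 * (s1 - 1.0) ** 2


bounds = [(1.5, 12.0), (0.36, 2.88)]   # K3 and S1 ranges from the table above
coeffs = fit_pce(toy_rmse, bounds, order=2)
k3_opt, s1_opt, rmse_min = minimise_surrogate(coeffs, bounds)
print(k3_opt, s1_opt, rmse_min)
```

With order 2 the sketch needs only 3 x 3 = 9 model runs, which is the cost advantage over random sampling. A real RMSE surface is not an exact polynomial, so the surrogate minimum should be checked with an actual model run.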
15. PCE optimization results (1)
Calculation period (1)  May 1, 2015 to July 15, 2015
Precipitation data      X-RAIN data
Observation discharge   Kamigawara (上川原) station (hourly observation)

Optimum parameters
Parameter name                 Value
Storage constant TANK 3 (K3)   4.97
Max. storage TANK 1 (S1)       1.75

[Figure: RMSE]
[Figure: Calculated discharge using the optimum parameters (uniform distribution)]
16. Selection of DHM Parameters (2)
Sensitivity analysis results
May 1, 2015 to July 15, 2015
Tatsuno station
[Figure: bar charts of Nash-Sutcliffe efficiency, RMSE and coefficient of determination (R2) for each parameter (α1, α2, S1, S2, K3, K4, P3, P4, D16, D50, k, γ, D), with upper-boundary and lower-boundary series]
The upper-boundary run for parameter K3 fits the observed data well,
with a high coefficient of determination.
Other sensitivity analysis results, based on NSE and R2,
may affect the simulation result differently.
17. PCE optimization results (2)
Calculation period (1)  May 1, 2015 to July 15, 2015
Precipitation data      X-RAIN data
Observation discharge   Tatsuno (龍野) station (hourly observation)

Parameter name                         Range       Original value   Optimum value
Storage constant TANK 4 (K4)           25 to 400   100              170
16% particle size distribution (D16)   2.5 to 40   10               21.00

[Figure: RMSE]
[Figure: Calculated discharge using the optimum parameters]
18. Considerations
1. The most frequently used efficiency coefficients for hydrological models are the Nash-Sutcliffe
efficiency and the coefficient of determination (R2), which may perform differently under
peak-flow and low-flow conditions.
2. The efficiency coefficients may behave differently depending on the precipitation data used
for the simulation.
3. The sensitivity analysis result changes with every simulation setting (location of the
observation data and assimilation period), so it does not lead to parameter values that can be
used generally for forecasting.
Same point, different periods
  Period 1 (NSE, R2, RMSE) -> PCE 1
  Period 2 (NSE, R2, RMSE) -> PCE 2
  Period 3 (NSE, R2, RMSE) -> PCE 3
Same period, different points
  Point 1 (NSE, R2, RMSE) -> PCE 4
  Point 2 (NSE, R2, RMSE) -> PCE 5
  Point 3 (NSE, R2, RMSE) -> PCE 6
19. Future Tasks
- Find the difference between X-RAIN and AmeDAS data
- Write chapters 1-2 of the master's thesis
- Check the PCE results for other observation points and other assimilation
periods
- For the assimilation period of May 1, 2015 to July 15, 2015, perform PCE for the
NSE and the coefficient of determination from the sensitivity analysis at all
observation points
- Determine the calculation settings for simulation validation