Using R for Global Optimization of a Fully-distributed Hydrological Model at Continental Scale (AGU 2013)

Hydrologic model simulations are now widely used to better understand hydrologic processes and to predict extreme events such as floods and droughts. In particular, the spatially distributed LISFLOOD model is currently used for flood forecasting at pan-European scale, within the European Flood Awareness System (EFAS). Several model parameters cannot be measured directly and need to be estimated through calibration. In this work we describe how the free software “R” has been used as a single environment to pre-process hydro-meteorological data, to carry out global optimization, and to post-process calibration results in Europe.

Historical daily discharge records were pre-processed for 4062 stream gauges, with the amount and temporal distribution of data differing from one gauge to another. The hydroTSM, raster and sp R packages were used to select 700 stations with adequate spatio-temporal coverage. The selected stations span a wide range of hydro-climatic characteristics. Nine parameters were selected for calibration based on previous expert knowledge. Customized R scripts were used to extract the observed time series for each catchment and to prepare the input files required to fully set up its calibration. The hydroPSO package was then used to carry out a single-objective global optimization on each selected catchment, using the Standard Particle Swarm 2011 (SPSO-2011) algorithm. Among the many goodness-of-fit measures available in the hydroGOF package, the Nash-Sutcliffe efficiency was used to drive the optimization. User-defined functions were developed for reading model outputs and passing them to the calibration engine. The long computational time required to finish the calibration at continental scale was partially alleviated by using four multi-core machines (with both GNU/Linux and Windows OS) and the “parallel” option available in hydroPSO. Calibration results (not described here) were automatically produced in both text and graphical formats, including a comparison of observed and simulated hydrographs, histograms, boxplots and dotty plots of the parameter values sampled during the optimization. Graphical results allowed a quick assessment of model performance and the identification of individual problems during calibration.
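The Nash-Sutcliffe efficiency that drives the optimization can be illustrated with a short sketch. It is written in Python for illustration only and is not the hydroGOF implementation: the function name `nse`, the use of `None` to mark missing observations, and the sample values are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence


def _is_valid(x: Optional[float]) -> bool:
    """True when x is a usable number (not None and not NaN)."""
    return x is not None and not math.isnan(x)


def nse(sim: Sequence[Optional[float]], obs: Sequence[Optional[float]]) -> float:
    """Nash-Sutcliffe efficiency: 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    Time steps where either series is missing are skipped (an assumption of
    this sketch), since discharge records usually contain gaps.
    """
    if len(sim) != len(obs):
        raise ValueError("sim and obs must have the same length")
    pairs = [(s, o) for s, o in zip(sim, obs) if _is_valid(s) and _is_valid(o)]
    if not pairs:
        raise ValueError("no overlapping valid values")
    mean_obs = sum(o for _, o in pairs) / len(pairs)
    residual = sum((o - s) ** 2 for s, o in pairs)
    variance = sum((o - mean_obs) ** 2 for _, o in pairs)
    if variance == 0:
        raise ValueError("observations are constant; NSE is undefined")
    return 1.0 - residual / variance


# Made-up discharges (m3/s), with one missing observation.
observed = [1.0, 2.0, None, 3.0, 4.0, 5.0]
simulated = [1.5, 2.0, 2.2, 2.5, 4.0, 5.5]
print(nse(simulated, observed))
```

A value of 1 means a perfect match, 0 means the simulation is no better than the mean of the observations, and negative values mean it is worse than that mean.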

This work illustrates how R proved to be a valuable environment to facilitate modeling, visualization, and data analysis at continental scale in an efficient and reproducible way, without switching to other applications to perform individual analyses. The application discussed here relates to the calibration of a hydrologic model written in Python+PCRaster. However, considering the exponentially increasing number of contributed packages, the multi-platform architecture, and the scripting capabilities available, we believe R is a promising environment for hydrology and a similar approach can be applied to a wider class of models requiring parameter optimization.
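To make the idea of model-independent calibration concrete, the sketch below runs a basic global-best particle swarm against a toy two-parameter model, scoring each candidate parameter set by its Nash-Sutcliffe efficiency. It is a simplified stand-in written in Python, not the SPSO-2011 algorithm and not hydroPSO. The toy model, its parameters `k` and `c`, the rainfall series, the bounds, and the swarm settings are all hypothetical choices made for this example.

```python
import random
from typing import Callable, List, Sequence, Tuple

Bounds = Sequence[Tuple[float, float]]


def toy_model(params: Sequence[float], rain: Sequence[float]) -> List[float]:
    """Hypothetical linear reservoir: rain scaled by c fills a store that drains at rate k."""
    k, c = params
    storage, flow = 0.0, []
    for p in rain:
        storage += c * p
        q = k * storage
        storage -= q
        flow.append(q)
    return flow


def nse(sim: Sequence[float], obs: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency for two complete series of equal length."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for s, o in zip(sim, obs))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def pso_maximize(score: Callable[[List[float]], float], bounds: Bounds,
                 n_particles: int = 20, n_iter: int = 100,
                 seed: int = 1) -> Tuple[List[float], float]:
    """Basic global-best PSO; positions are clamped to the parameter bounds."""
    rng = random.Random(seed)
    w, c1, c2 = 0.72, 1.19, 1.19  # commonly used inertia and acceleration values
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_score = [score(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            s = score(pos[i])
            if s > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], s
                if s > gbest_score:
                    gbest, gbest_score = pos[i][:], s
    return gbest, gbest_score


# Synthetic "observations" generated from known parameters, so the optimum is NSE = 1.
RAIN = [0, 5, 12, 3, 0, 0, 8, 20, 1, 0, 0, 0, 4, 9, 0, 2, 15, 0, 0, 6]
TRUE_PARAMS = (0.3, 0.8)
OBS = toy_model(TRUE_PARAMS, RAIN)
BOUNDS = [(0.01, 1.0), (0.1, 2.0)]  # hypothetical ranges for k and c

best, best_nse = pso_maximize(lambda p: nse(toy_model(p, RAIN), OBS), BOUNDS)
print("best parameters:", best, "NSE:", best_nse)
```

The optimizer only sees a function that maps a parameter set to a score, so the toy model could be replaced by a wrapper that writes parameter files, runs an external model, and reads the simulated discharges back.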

Using R for Global Optimization of a Fully-distributed Hydrological Model at Continental Scale

Mauricio Zambrano-Bigiarini*, Zuzanna Zajac and Peter Salamon
European Commission • Joint Research Centre • Institute for Environment and Sustainability
*Currently at: EULA-Chile Centre, University of Concepción (Chile) • Email: mauricio.zambrano@udec.cl
AGU 2013-1804792 • Identifier: H51R-06 • Dec 13th, 2013 • www.jrc.europa.eu

1) Motivation

The spatially-distributed LISFLOOD hydrological model is used for flood forecasting at pan-European scale, within the European Flood Awareness System (EFAS). Several model parameters need to be estimated through calibration for ca. 700 subcatchments. Calibrating every individual catchment across the whole of Europe is a very time-consuming and error-prone task.

2) Aim

To describe and illustrate how the free software R has been used as a single environment to pre-process hydro-meteorological data, to carry out global optimization, and to post-process calibration results at European scale.

3) Why using R for massive hydrological analysis?

● Base functions allow efficient data manipulation and storage (spatial data and time series).
● Support for almost every vector and raster spatial format (rgdal, raster and sp packages).
● R is both a scientific software and a programming language (types, objects, functions, extensions).
● Scripting capabilities allow explicit documentation and reproducible research.
● Fully-customizable and high-quality graphical functions for exploratory data analysis and visualization.
● Highly extensible (4000+ packages with state-of-the-art contributions in several fields of knowledge).
● Easy integration with other languages (C/C++, Fortran, Python, etc.), e.g., for intensive computations.
● Easy parallelization (multi-core machines or network clusters).
● Multi-platform (GNU/Linux, Mac OS X, Windows).
● Free and open-source.

4) Pre-processing

● Historical daily data for 4062 stream gages (from national providers).
● hydroTSM, sp and raster packages were used to select ~700 stations with enough temporal data and good spatial distribution across Europe.
● Nine parameters were selected for calibration based on previous expert knowledge.
● The pan-European spatial extent was split up into 7 main calibration areas, in order to speed up the model computation time.
● Customized R scripts were used to extract observed time series for each catchment and to prepare the input files required for individual calibrations (i.e., ParamRanges.txt, ParamFiles.txt, obs.tss, and hydroPSO-subbXXX.R files along with a masking area map defining the drainage area of individual catchments).

Fig 01. Shaded boxes represent the seven major calibration areas used for splitting up the pan-European spatial domain. Colored dots represent discharge stations coming from two different data sources, which were analyzed to select ca. 700 stations for calibration.

5) Model Calibration

Fig 02. Flow chart of the calibration of a single catchment. Files ParamRanges.txt and ParamFiles.txt define which parameters are to be calibrated and where they have to be modified, respectively. Settings.xml defines the location of model input files and the value of model parameters. Light-blue shaded boxes indicate some user intervention, while light-yellow shaded boxes represent static input files (not modified during optimization).

● obs.tss: file with observed discharges.
● dis.tss: file with simulated discharges.
● read_tss(): user-defined R function for reading .tss files.
● NSE(): R function for computing the Nash-Sutcliffe efficiency (hydroGOF package).
● SPSO-2011: Standard Particle Swarm Optimization 2011 (hydroPSO package).

6) Calibration results + post-processing

Fig 03. Evolution of the global optimum (Nash-Sutcliffe efficiency) and the normalized swarm radius (δnorm) along the number of iterations.

Fig 04. Nash-Sutcliffe efficiency (NSE) response surface projected onto the parameter space (pseudo 3D-dotty plots) for selected parameters, to highlight equifinality issues.

Fig 05. Dotty plots showing the model performance (NSE) versus parameter values, for three selected parameters. Vertical red line indicates the “optimum” parameter value.

Fig 06. Figure automatically generated for assessing the quality of the calibration results of each single catchment. The upper panel shows a comparison of the observed and simulated hydrographs during the verification period; the lower left panel shows a comparison of the corresponding flow duration curves, while the lower right panel shows numerical statistics for comparing observations with their simulated counterparts.

7) Concluding Remarks

● The use of the 'parallel' option available in hydroPSO allowed a substantial reduction of the total calibration time (ca. 50% with 6 cores).
● R proved to be an efficient environment to facilitate modeling, visualization and data analysis at continental scale.
● The use of a single environment for pre-processing, calibration and post-processing of results made it easier to change any step of the workflow later on.
● Results in hundreds of catchments with different hydro-climatological regimes showed that hydroPSO is an effective and efficient R package for finding near-optimal parameter sets at a low computational cost.
● Although this case study relates only to the calibration of a hydrological model written in Python+PCRaster, we believe that a similar approach can be applied to a wide class of environmental models requiring some form of parameter optimization, from micro to global scale.

References:

EFAS (2013), “European Flood Awareness System”, http://www.efas.eu/. [Online. Last accessed 05-Dec-2013]
van Der Knijff, J. M., J. Younis, and A. P. J. De Roo (2010), LISFLOOD: a GIS-based distributed model for river basin scale water balance and flood simulation, International Journal of Geographical Information Science, 24(2), 189–212, doi:10.1080/13658810802549154.
Zambrano-Bigiarini, M., and R. Rojas (2013), A model-independent Particle Swarm Optimisation software for model calibration, Environmental Modelling & Software, 43, 5–25, doi:10.1016/j.envsoft.2013.01.004.
Zambrano-Bigiarini, M. (2013), hydroTSM: Time series management, analysis and interpolation for hydrological modelling. R package version 0.4-1. http://CRAN.R-project.org/package=hydroTSM
Zambrano-Bigiarini, M. (2013), hydroGOF: Goodness-of-fit functions for comparison of simulated and observed hydrological time series. R package version 0.3-7. http://CRAN.R-project.org/package=hydroGOF