ERFEG Seminar Fall 2008

  • I am going to talk a little bit about my past research, titled "Automating regional descriptive statistic computations for environmental modeling."
  • This is the same chart as in Chuck’s talk, showing a comparison of low streamflow regression models constructed with three different sets of explanatory variables. My talk focuses on these digitally derived watershed characteristics.
  • Low streamflow regression models generally take this form. Q7,10 is a 7-day 10-year streamflow statistic, the betas are model parameters, and the X’s are watershed characteristics such as topography, climate, and soil information. The models are constructed by first deriving the Xi’s from raster datasets using the ArcGIS zonal statistics tool, and then inputting Q7,10 and the potential X’s into a statistical software package, SAS. We input a large number of Xi’s as potential explanatory variables, and SAS picks the Xi’s that best estimate Q7,10.
  • This is how the tool works. Here is a watershed layer; each polygon represents a watershed boundary.
  • Then, this layer is overlaid on top of a raster dataset.
  • The tool takes the cells that fall within each watershed, calculates descriptive statistics of those cell values, and stores the results in a table. In this table, each row represents a watershed boundary and the columns hold the descriptive statistics for this raster dataset. When you process another raster dataset, ideally the results would be appended to the same table, because eventually we want one table to input to SAS. But the zonal statistics tool can’t do that. Instead, a separate table is created for each raster dataset.
  • This is a problem with the zonal statistics tool. What you need to do is merge the tables created for the individual raster datasets into one table. This can be done by simply copying and pasting columns, but there is another problem. The columns in these tables have the same names, such as mean or standard deviation, but in the merged table the column names must identify the raster dataset they came from, like mean of elevation, standard deviation of precipitation, and so on. So you also need to change the column names. When there are only 10 raster datasets, the tables can be merged manually with relative ease.
  • But in our studies, we use many more raster datasets. Here, in my master’s thesis, about fourteen hundred raster datasets were used, and I had three different watershed layers, each with 35 watersheds, so I needed to merge more than four thousand tables. In Chuck’s talk today, there were 28 raster datasets, and 112 tables needed to be merged. In my paper here, again about fourteen hundred tables, and in this paper, 162 tables. So, for the first one, you need to manually copy and paste columns about 4,000 times, and change the column names about 4,000 times.
  • So manual operation is very tedious, time-consuming, and prone to human error. Motivated by these problems, we decided to develop a custom ArcGIS toolset.
  • Here is the user interface of the tool. Actually, this is just one tool in the GIS toolset we developed, named Arc Watershed Classification. In this toolset, most of the GIS operations for our research are customized and integrated. I am only showing this one tool today. In this window, you specify parameter files and other inputs to the tool. Then you press OK, and everything is done automatically.
  • Here is a case study. It uses the same study region as Chuck’s talk, with 144 watersheds.
  • We used the Hydro1K DEM,
  • slope derived from the DEM,
  • 13 raster datasets from a dataset called PRISM, representing monthly and yearly precipitation,
  • 12 raster datasets of soil classification from a dataset called STATSGO,
  • and land cover from the National Land Cover Dataset (NLCD).
  • Using these raster datasets, we ran the developed tool and created a watershed characteristics database. This table can be input to SAS to construct regression equations.
  • The developed tool saved at least 95% of the manual labor time. The GIS toolset is versatile and can aid in a wide variety of environmental studies: the polygons don’t need to be watershed boundaries; they can be any boundaries, such as state, county, or town, and any raster dataset can be processed.
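
The workflow described in these notes (per-zone statistics for each raster, written into a single table with raster-specific column names) can be sketched in a few lines of Python. This is only an illustration on tiny in-memory grids, not the code of the developed toolset: the function names `zonal_stats` and `build_table`, the raster names `ELEV` and `PRECIP`, and all of the data are made up, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def zonal_stats(zones, raster):
    """Return {zone_id: {"MEAN": ..., "STD": ...}} for one raster.

    zones and raster are lists of rows with the same shape. Zone id 0 is
    treated as "outside every watershed" and skipped (an assumption of
    this sketch).
    """
    cells_by_zone = {}
    for zone_row, value_row in zip(zones, raster):
        for zone_id, value in zip(zone_row, value_row):
            if zone_id != 0:
                cells_by_zone.setdefault(zone_id, []).append(value)
    return {
        zone_id: {"MEAN": mean(cells), "STD": pstdev(cells)}
        for zone_id, cells in cells_by_zone.items()
    }


def build_table(zones, rasters):
    """Collect the statistics of several rasters into one row per zone.

    Column names are prefixed with the raster name (e.g. ELEV_MEAN) so that
    columns coming from different rasters stay distinguishable.
    """
    table = {}
    for name, raster in rasters.items():
        for zone_id, values in zonal_stats(zones, raster).items():
            row = table.setdefault(zone_id, {})
            for stat, value in values.items():
                row[f"{name}_{stat}"] = value
    return table


# Hypothetical 3x4 grid containing two watersheds (ids 1 and 2).
zones = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [0, 0, 2, 2],
]
rasters = {
    "ELEV": [
        [10.0, 20.0, 30.0, 40.0],
        [30.0, 40.0, 50.0, 60.0],
        [0.0, 0.0, 70.0, 80.0],
    ],
    "PRECIP": [
        [1.0, 1.0, 2.0, 2.0],
        [1.0, 1.0, 2.0, 2.0],
        [9.0, 9.0, 2.0, 2.0],
    ],
}

table = build_table(zones, rasters)
for zone_id in sorted(table):
    print(zone_id, table[zone_id])
```

Each row of `table` corresponds to one watershed, and adding another raster only adds columns, which is the single-table layout the notes say is wanted for input to SAS.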

    1. 1. Automating regional descriptive statistic computations for environmental modeling Satoshi Hirabayashi Environmental Resources Engineering SUNY College of Environmental Science and Forestry, Syracuse, NY USA
    2. 2. Outline
         • Background
         • Introduction to Problems
         • Methods
         • Case Study
         • Conclusions
    3. 3. Low Streamflow Regional Regression (Kroll et al., 2004) (Background)
         [Chart: % standard error of regression models built with three sets of explanatory variables: USGS; USGS and Digital; USGS, Digital, and Hydrogeology. Entire US, 29 regions, 930 HCDN sites. Focus of this talk is marked on the chart.]
    4. 4. Low Streamflow Regional Regression (Background)
         Q_7,10: 7-day 10-year streamflow statistic
         β_i: Model parameter to be estimated
         X_i: Watershed characteristics
         Model Construction Process
         • Derive X_i’s from raster datasets with the ArcGIS zonal statistics tool.
         • Input Q_7,10 and X_i’s into SAS.
         • SAS picks the X_i’s that best estimate Q_7,10.
    5. 5. Zonal Statistics Tool Background
    6. 6. Zonal Statistics Tool Background
    7. 7. Zonal Statistics Tool Background
    8. 8. Problems with Zonal Statistics Tool Introduction to Problems SAS
    9. 9. Problems with Multiple Raster Datasets (Introduction to Problems)

         Research                    | # of Raster Datasets | # of Watersheds (# of Layers) | # of Tables
         Hirabayashi, 2005           | 1,466                | 35 (3)                        | 4,398
         Kroll, 2007                 | 28                   | 144 (4)                       | 112
         Hirabayashi and Kroll, 2007 | 1,466                | 35 (1)                        | 1,466
         Hirabayashi and Kroll, 2008 | 54                   | 106 (3)                       | 162

    10. 10. Problems in Manual Operations (Introduction to Problems)
         • Tedious
         • Time-consuming
         • Erroneous
         Develop a custom ArcGIS toolset.
    11. 11. Automated Explanatory Variables Extraction (Methods)
         [Flow diagram labels: Watershed Boundaries; Parameter File; Batch Descriptive Statistic Calculation; Batch Output Table Creation; Output Table; Log File; Developed tool. File formats shown: ESRI GRID/TIFF/IMAGINE raster, geodatabase/shapefile, dBASE table, ASCII text file. Raster themes: weather, soil, elevation, etc.]
    12. 12. User Interface (ArcWC) Methods
    13. 13. Watershed Characteristic Extraction Case Study
    14. 14. Watershed Characteristic Extraction (Case Study). Raster Dataset (# Data): Hydro1K DEM (1)
    15. 15. Watershed Characteristic Extraction (Case Study). Raster Dataset (# Data): Hydro1K DEM (1), Slope (1)
    16. 16. Watershed Characteristic Extraction (Case Study). Raster Dataset (# Data): Hydro1K DEM (1), Slope (1), PRISM (13)
    17. 17. Watershed Characteristic Extraction (Case Study). Raster Dataset (# Data): Hydro1K DEM (1), Slope (1), PRISM (13), STATSGO (12)
    18. 18. Watershed Characteristic Extraction (Case Study). Raster Dataset (# Data): Hydro1K DEM (1), Slope (1), PRISM (13), STATSGO (12), NLCD (1)
    19. 19. Watershed Characteristic Extraction (Case Study). Raster Dataset (# Data): Hydro1K DEM (1), Slope (1), PRISM (13), STATSGO (12), NLCD (1). Output: Watershed characteristics database
    20. 20. Conclusions
         • Saved at least 95% of the labor time.
         • GIS toolset is versatile and can aid in a wide variety of environmental studies.
         • Available for free download at www.esf.edu/erfeg/kroll.
    21. 21. Questions?
    22. 22. Gauging Site Relocation Methods
    23. 23. Gauging Site Relocation Methods
    24. 24. Unnested Watershed Identification Methods
    25. 25. Unnested Watershed Identification Methods
    26. 26. Unnested Watershed Identification Methods
    27. 27. Unnested Watershed Identification Methods
    28. 28. Batch STATSGO Processing Methods
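
The table counts on slide 9 are consistent with one zonal statistics table per raster dataset per watershed layer. A quick check in Python, with the figures copied from that slide (the variable and function names are only illustrative):

```python
# (study, number of raster datasets, number of watershed layers, tables on slide 9)
studies = [
    ("Hirabayashi, 2005", 1466, 3, 4398),
    ("Kroll, 2007", 28, 4, 112),
    ("Hirabayashi and Kroll, 2007", 1466, 1, 1466),
    ("Hirabayashi and Kroll, 2008", 54, 3, 162),
]


def table_count(rasters, layers):
    """One output table per raster dataset per watershed layer."""
    return rasters * layers


for study, rasters, layers, reported in studies:
    computed = table_count(rasters, layers)
    print(f"{study}: {rasters} x {layers} = {computed} (slide reports {reported})")
```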
