M.S. Capstone Seminar
  • Good morning. Thank you for coming to my capstone seminar. My name is Satoshi Hirabayashi. I will talk about my research: examining the impact of raster datasets on flood and low streamflow regional regression models using custom GIS applications.
  • Here is the outline of today’s talk.
  • First is the abstract.
  • Here is my research title. It is quite long, so I will break it down to make it easier to follow. First, I use many raster datasets in this study. Because there are so many data, I customized GIS operations to process them easily. The outputs of the GIS application are then used as inputs to hydrologic modeling of flood and low streamflow. Finally, I examine the impact of the raster datasets on those models.
  • Introduction.
  • The goal is to estimate or predict the frequency and magnitude of extreme hydrologic events: floods and low streamflows. Common statistics exist for both. The common flood statistic is the 100-year flood, Q100, the annual maximum streamflow that is exceeded on average once every 100 years. The common low streamflow statistic is the 7-day, 10-year low streamflow, Q7,10, the annual minimum 7-day average streamflow that is exceeded on average in 9 out of every 10 years; in other words, a 7-day low flow that occurs once every 10 years on average. River sites are either gauged or ungauged. At gauged sites, streamflow is measured and recorded, so flood and low streamflow statistics can typically be estimated with a frequency analysis. At ungauged sites, streamflow records are not available, and one common way to estimate flood and low streamflow statistics is a regional regression model. This is my interest here: predicting flood and low streamflow at ungauged sites with regional regression models.
  • A regional regression model is an equation relating Q and the Xs. Q is the response variable, in this study Q100 or Q7,10. The Xs are explanatory variables, namely watershed characteristics. At ungauged sites, watershed characteristics can be measured, but streamflow data are not available, so you cannot calculate Q. You have the Xs but not Q, which means you cannot determine the relationship there. What you can do is choose gauged sites in the same region as the ungauged sites. At gauged sites, watershed characteristics can be measured and Q can be calculated from streamflow records. You have both Q and the Xs, so you can determine the relationship between them. Once you have developed a regression model, you can apply it to predict the flood or low streamflow statistics at ungauged sites, because they are in the same region.
  • A regional regression model is a mathematical equation. Here Q is a flood or low streamflow statistic and the Xs are watershed characteristics. Alpha, beta, and gamma are model parameters to be estimated. Taking the logarithm of both sides gives a log-linear model. There is one response variable and two or more explanatory variables, so this relationship is analyzed with multiple linear regression. I used the ordinary least squares regression procedure to estimate the model parameters alpha, beta, and gamma.
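The log-linear fit described above can be sketched in a few lines. This is a minimal illustration, not the code used in the study: it assumes the power-law form Q = alpha * X1^beta * X2^gamma (consistent with "taking the logarithm gives a log-linear model"), and the function names `solve_linear`, `fit_log_linear`, and `predict` are hypothetical. It solves the ordinary least squares normal equations directly in plain Python.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_log_linear(q, xs):
    """OLS fit of ln Q = ln(alpha) + b1 ln X1 + b2 ln X2 + ...

    q  : flow statistic at each gauged site (all values > 0)
    xs : one row of watershed characteristics per site (all values > 0)
    Returns (alpha, [b1, b2, ...]).
    """
    design = [[1.0] + [math.log(v) for v in row] for row in xs]
    y = [math.log(v) for v in q]
    p = len(design[0])
    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    coef = solve_linear(xtx, xty)
    return math.exp(coef[0]), coef[1:]


def predict(alpha, exponents, row):
    """Apply the fitted model at an ungauged site."""
    q = alpha
    for x, b in zip(row, exponents):
        q *= x ** b
    return q
```

The model is fitted on gauged sites, where both Q and the Xs are known, and `predict` is then called with the characteristics of an ungauged site in the same region.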
  • Now I will introduce some past studies on flood and low streamflow regional regression models. Traditionally, flood regional regression models work relatively well, with standard errors of 30-60 %. Low streamflow regional regression models, however, have performed poorly, with standard errors of more than 100 %. Several reasons for this poor performance have been pointed out; here are two. First, the watershed characteristics were traditionally derived manually from paper maps, so the data quality was probably insufficient. Second, important watershed characteristics may not have been used in model development, which seems to be due to insufficient data quantity.
  • My advisor, Dr. Chuck Kroll, took countermeasures for data quality and quantity. The two primary data involved in regional regression model development are streamflow records and watershed characteristics. For streamflow records, Kroll et al. employed data from the USGS’s Hydro-Climatic Data Network (HCDN), whose records meet certain measurement accuracy criteria, so the quality is good. They also employed more than 1,500 sites with an average record length of 44 years. For the watershed characteristics, they used 1,465 raster datasets and derived the characteristics digitally. As a result, they concluded that low streamflow regional regression models were improved throughout the United States.
  • I will introduce Kroll et al.’s processes. There are two major processes: watershed boundary delineation and watershed characteristics derivation. From a DEM and the gauging site location, the watershed boundary of the area of interest can be delineated. The delineated boundary is then overlaid on a raster, such as climate data, and the spatial average is calculated. This average becomes a watershed characteristic. They didn’t use a GIS package because there were so many data that processing would have taken too long with the available computer power. So they developed programs in FORTRAN.
  • Kroll et al.’s study motivated this study. They used one DEM with approximately 1 km horizontal resolution. DEMs with finer resolution are now available, and using them can enhance the quality of the watershed boundaries. They used two climate datasets, one at 49 km resolution and the other at 4 km, with 1,440 and 13 rasters, respectively. The soil data have 1 km resolution and comprise 12 rasters. If I use finer resolution, the quality of the watershed characteristics can be improved. And if I use more data, the quantity of watershed characteristics can be increased.
  • Nowadays, people use a GIS package to process geospatial data. GIS is very useful for deriving input parameters for hydrologic models. But what if there is a large amount of data? In this study, almost fifteen hundred datasets need to be processed. Can you do that manually? I don’t want to, because it is tedious, time-consuming, and prone to human error.
  • My first objective is to develop a GIS system that derives watershed characteristics efficiently and effectively. Second, as I said, I will use DEMs with finer horizontal resolution to delineate watershed boundaries, and examine the impact of the horizontal resolution on flood and low streamflow regional regression models. Third, I will use raster data with finer resolution, as well as new raster data, to derive watershed characteristics, and examine the impact of these new raster datasets on the models. Finally, I will identify which of these watershed characteristics are the most important to include in flood and low streamflow regional regression models.
  • This map shows the standard error of the regional regression models developed by Kroll et al. They generally improved low streamflow model performance across the country, but these areas are still not very good: the standard error was more than 100 %. The models in these regions may be improved with finer-resolution DEMs and more explanatory variables.
  • Flood regional regression models across the country were developed by the USGS. The performance of these flood models is relatively good. This map shows the number of explanatory variables in the models. In this region, only one variable is used in the models. For instance, in Tennessee, only drainage area is used. Based on these results, I chose the study region. Models in these regions may be improved with more explanatory variables.
  • In this study I employ 36 USGS gauged sites in the Tennessee, Kentucky, and North Carolina region. They are on unregulated streams, meaning there is no addition or withdrawal of water and no effect of regulation such as reservoir releases. Streamflow data for these sites are recorded in the USGS National Water Information System (NWIS), and each site has more than 10 years of streamflow records.
  • This list presents the data employed. General watershed characteristics for the study region were obtained from the USGS. The DEMs I used are the Hydro1k DEM, the 1-degree DEM, and the National Elevation Dataset (NED) DEM, with horizontal resolutions of 1 km, 85 m, and 30 m, respectively. The soil data are the USDA’s STATSGO, and the climate data are PRISM. The soil and climate data are the same as Kroll et al.’s, but the resolution of the climate data is much finer: they used 49 km, whereas this study uses 4 km. I employed hydrology and hydrogeology data from the USGS Kansas Water Science Center. Finally, as an initial investigation into methods to quantify hydrologic information from remotely sensed imagery that may aid in the prediction of extreme hydrologic events, I used raw data obtained by the MODIS (Moderate Resolution Imaging Spectroradiometer) sensor onboard NASA's Terra satellite.
  • This figure shows the overall GIS processes in this study. There are three main tasks: watershed boundary delineation, pre-processing, and spatial statistics calculation. Watershed boundaries are delineated with the Arc Hydro tools.
  • The basic idea is that water flows from points of higher elevation to points of lower elevation. So, using the elevation information in the DEM and the outlet point of the stream where the gauging site is located, the watershed boundary for each gauging site can be delineated. The first step is to fill sinks in the DEM: if a cell is surrounded by higher-elevation cells, water is trapped in that cell and cannot flow. The fill sinks function modifies the elevation values to eliminate this problem. The second step is to determine the flow direction. From the flow direction grid, the watershed boundary is delineated. I obtained three watershed boundary sets, denoted WS1k, WS85, and WS30 after the horizontal resolution of the DEM used.
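The delineation idea above can be illustrated on a toy grid. This is only a sketch and not Arc Hydro's implementation: it assumes the common D8 rule (each cell drains to the neighbor with the steepest downward slope) and assumes the DEM has already had its sinks filled. The names `d8_flow_direction` and `delineate_watershed` are hypothetical.

```python
import math
from collections import deque

# Offsets of the eight neighbors of a grid cell (row, column).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem):
    """Map each cell to the neighbor it drains to (None if no neighbor is lower)."""
    rows, cols = len(dem), len(dem[0])
    downstream = {}
    for r in range(rows):
        for c in range(cols):
            best, best_slope = None, 0.0
            for dr, dc in NEIGHBORS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Drop divided by distance (1 for edges, sqrt(2) for diagonals).
                    slope = (dem[r][c] - dem[nr][nc]) / math.hypot(dr, dc)
                    if slope > best_slope:
                        best, best_slope = (nr, nc), slope
            downstream[(r, c)] = best
    return downstream


def delineate_watershed(downstream, outlet):
    """Collect every cell whose flow path passes through the outlet cell."""
    upstream = {}
    for cell, target in downstream.items():
        if target is not None:
            upstream.setdefault(target, []).append(cell)
    basin, queue = {outlet}, deque([outlet])
    while queue:
        cell = queue.popleft()
        for up in upstream.get(cell, []):
            if up not in basin:
                basin.add(up)
                queue.append(up)
    return basin
```

Here the outlet cell plays the role of the gauging site, and the returned set of cells corresponds to the watershed polygon for that site.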
  • The STATSGO toolset creates raster datasets from the original polygon data and attribute tables.
  • The STATSGO toolset I developed takes the original STATSGO data as input and outputs the rasters used for the watershed characteristics derivation. The original STATSGO data for each state are one polygon dataset and two attribute tables, the soil component table and the soil layer table. These are fed into the STATSGO toolset. Soil characteristics were weighted according to component percentage and layer thickness, then the weighted values were joined to the polygons, and a raster representing the weighted soil characteristic was created. These steps were performed for each state, and the state rasters were then merged into one raster. Rasters representing the maximum and minimum of these soil characteristics were created.
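The two-level weighting can be sketched as follows. The talk does not give the exact formula, so this assumes a thickness-weighted average over the layers of each soil component, followed by a percentage-weighted average over the components of a map unit. The data layout and the function names are hypothetical.

```python
def thickness_weighted(layers):
    """Average a soil property over layers, weighting by layer thickness.

    layers: list of (thickness, value) pairs for one soil component.
    """
    total = sum(thickness for thickness, _ in layers)
    return sum(thickness * value for thickness, value in layers) / total


def map_unit_value(components):
    """Average component values, weighting by component percentage.

    components: list of (percent, layers) pairs for one map unit polygon.
    """
    total = sum(percent for percent, _ in components)
    return sum(percent * thickness_weighted(layers)
               for percent, layers in components) / total


# Hypothetical map unit: two components covering 60 % and 40 % of the polygon.
example_unit = [
    (60, [(10, 0.2), (30, 0.1)]),  # two layers: (thickness, property value)
    (40, [(20, 0.3)]),
]
weighted_value = map_unit_value(example_unit)
```

The single value per map unit is what would then be joined to the polygon and rasterized.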
  • The raster toolset pre-processes the hydrologic, hydrogeologic, MODIS, and PRISM raster datasets.
  • The raster toolset I developed takes the original PRISM data as input and outputs the rasters used for the watershed characteristics derivation. The original PRISM data are rasters in ArcInfo ASCII GRID format. To be used in the derivation, they need to be converted to ESRI GRID format. Since we employed more than fourteen hundred datasets, processing them all manually would be very tedious. So I developed a batch algorithm in the raster toolset that needs only a few user interactions. The physical paths to the rasters are listed in a raster list file; the batch process reads this file, retrieves each raster, and converts it from ASCII GRID to ESRI GRID format. The projection of these rasters also needs to be converted to the same projection as the watershed boundaries. Again, this can be performed in a batch process. The original PRISM data cover the conterminous United States, which is too large for this study and would slow down the watershed characteristics derivation, so I developed a clipping function in the raster toolset. This again runs as a batch process. As a result of these batch processes, clipped PRISM raster datasets were prepared. And they include …
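A batch loop of this kind can be sketched in plain Python. This is not the toolset itself: format conversion to ESRI GRID and reprojection need a GIS library and are left out. The sketch only reads a raster list file, parses each ArcInfo ASCII GRID, and clips it to a bounding box. It assumes one path per line in the list file, one grid row per text line, and an `xllcorner`/`yllcorner` header; all function names are hypothetical.

```python
def read_raster_list(text):
    """Return raster paths listed one per line, skipping blanks and '#' comments."""
    lines = (ln.strip() for ln in text.splitlines())
    return [ln for ln in lines if ln and not ln.startswith("#")]


def parse_ascii_grid(text):
    """Parse ArcInfo ASCII GRID text into (header dict, rows of floats)."""
    header, rows = {}, []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0][0].isalpha():  # header lines start with a keyword
            header[parts[0].lower()] = float(parts[1])
        else:
            rows.append([float(v) for v in parts])
    return header, rows


def clip_grid(header, rows, xmin, ymin, xmax, ymax):
    """Keep the cells whose centers fall inside the bounding box."""
    cs = header["cellsize"]
    x0, y0 = header["xllcorner"], header["yllcorner"]
    top = y0 + len(rows) * cs  # rows are stored from north to south
    keep_r = [i for i in range(len(rows)) if ymin <= top - (i + 0.5) * cs <= ymax]
    keep_c = [j for j in range(len(rows[0])) if xmin <= x0 + (j + 0.5) * cs <= xmax]
    if not keep_r or not keep_c:
        raise ValueError("bounding box does not overlap the raster")
    clipped = [[rows[i][j] for j in keep_c] for i in keep_r]
    new_header = dict(header, ncols=len(keep_c), nrows=len(keep_r),
                      xllcorner=x0 + keep_c[0] * cs,
                      yllcorner=top - (keep_r[-1] + 1) * cs)
    return new_header, clipped


def batch_clip(list_path, box):
    """Clip every raster named in the list file; collect failures instead of stopping."""
    results, failures = {}, []
    with open(list_path) as f:
        paths = read_raster_list(f.read())
    for path in paths:
        try:
            with open(path) as f:
                header, rows = parse_ascii_grid(f.read())
            results[path] = clip_grid(header, rows, *box)
        except (OSError, ValueError, KeyError) as err:
            failures.append((path, str(err)))
    return results, failures
```

Collecting failures rather than raising keeps one bad file from aborting a run over hundreds of rasters, which is the point of batching.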
  • The hydrology and hydrogeology data are provided in ESRI GRID format, so there is no need to convert the format. Projection conversion and clipping are still necessary, and they can be done with the batch process in the raster toolset. The result is four raster datasets, representing base flow index, groundwater recharge, infiltration excess overland flow, and saturation excess overland flow.
  • MODIS data are originally in Hierarchical Data Format (HDF). They are first processed in ENVI and exported as rasters in ESRI GRID format. Then, using the batch process I described, projection conversion and clipping were performed. The prepared rasters are calibrated radiance for March 6, 2000, in the typical wet season of the study region, and for Sep. 6, Oct. 3, and so on, in the typical dry season. My expectation is that the wet-season data have some impact on the flood models and the dry-season data have some impact on the low streamflow models.
  • The table toolset creates a table in which the watershed characteristics will be stored.
  • The table toolset I developed takes the gauging sites and a table list file as inputs and creates the initial watershed characteristic tables, in which the watershed characteristics for each watershed are stored.
  • Now all of the raster data and the table are ready. Next, the paths to the raster datasets are listed in a raster list file. The spatial statistics toolset computes the spatial statistics and saves them in the table.
  • The paths to the raster datasets and the types of statistics to be computed are listed in a raster list file. You can specify mean, maximum, minimum, standard deviation, and so on. The toolset also takes the watershed boundary data that will be overlaid on the rasters. The spatial statistics are then computed in a batch process and stored in the table.
  • Now I will summarize the GIS processes. The watershed boundaries were delineated from DEMs with 1 km, 85 m, and 30 m horizontal resolution. The raster data were pre-processed with the developed toolsets. Then the watershed boundaries were overlaid on the raster data, and the spatial statistics were computed and saved in the watershed characteristic tables for each watershed boundary set.
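The overlay step amounts to a zonal statistic: select the raster cells inside a watershed and summarize them. Below is a minimal sketch, assuming the watershed boundary has already been rasterized to a boolean mask on the same grid as the raster. The function name and the choice of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def zonal_statistics(raster, mask, stats=("mean", "min", "max", "std"), nodata=None):
    """Summarize the raster cells that fall inside a watershed mask.

    raster : list of rows of numbers
    mask   : list of rows of booleans, True where the cell is in the watershed
    stats  : names of the statistics to compute
    nodata : value marking missing cells, which are skipped
    """
    values = [v
              for raster_row, mask_row in zip(raster, mask)
              for v, inside in zip(raster_row, mask_row)
              if inside and v != nodata]
    if not values:
        return {name: None for name in stats}
    funcs = {"mean": mean, "min": min, "max": max, "std": pstdev}
    return {name: funcs[name](values) for name in stats}
```

Calling this once per raster in the list and per watershed, and writing each result into the row for that watershed, fills the watershed characteristic table.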
  • In addition to the watershed characteristics explained so far, I added 4 topography variables computed from the DEMs. I also added 1 hydrology variable, the surface flow index (SFI), which is computed as 100 - base flow index (BFI). Base flow is the component of streamflow contributed by groundwater discharge. BFI is the average percentage of base flow in annual streamflow, and SFI is the average percentage of surface flow in annual streamflow. I also summarized the 40-year monthly records of precipitation and of maximum and minimum temperature. A 40-year record from January to December gives 480 variables in total. These are divided into four seasonal windows and averaged. Then they are fitted to a certain distribution and percentiles are computed, resulting in 36 variables. So the 480 monthly precipitation variables were summarized into 36 variables. Maximum and minimum temperature were summarized in the same way.
  • The watershed characteristic table is a big table, so we categorized its variables into 4 classes.
  • Then, from categories A, B, C, and D, I made five combinations of explanatory variables. For example, the models developed with only the USGS traditional watershed characteristics (A) give a baseline performance. If I then add the topography, soil, and climate data and build the models again, these models may be better than the models with only A, and so on.
  • For each watershed boundary set, the potential explanatory variable combinations were used to develop the models. The criteria for model construction are as follows. For the number of explanatory variables, I developed models with 2, 3, and 4 explanatory variables. To determine which explanatory variables enter the model, I employed the stepwise regression procedure with a 5 % significance level to enter and stay in the model. Model parameters should be significant at the 5 % significance level.
  • This chart presents the standard errors of all the flood regional regression models: the model with WS1k and the A variable combination, the A+B variable combination, and so on. Within a single watershed boundary set, the models with the ALL variable combination were the best, so I use these results to compare the DEM effects. Then I will use the WS1k results to compare the effects of the explanatory variables.
  • Same
  • These are the flood regional regression models developed with the ALL potential explanatory variable combination. They are almost the same across the watershed boundary sets.
  • These are the low streamflow regional regression models developed with the ALL potential explanatory variable combination. They are almost the same as well.
  • Why does this happen? Now I will show you how the delineated watershed boundaries actually differ. This is the WS1k polygon.
  • Now I overlay the WS85 polygon.
  • Then I overlay the WS30 polygon.
  • Now, let’s take a look at the impact of the variable combination on flood model performance. The models developed with only A are the worst. The A+D combination performs equally poorly. A+D is the traditional watershed characteristics plus MODIS data, so the MODIS data did not contribute to the flood models. Adding B, C, or B, C, and D improved the models. Now let’s take a closer look at the explanatory variables in the models. Here are the model equations. For A, A+D, and A+C, only two explanatory variables entered the model while satisfying the criteria. For A+B, up to four variables entered. Note that the variables entered the model in this order, so the 2-variable model contains DA and DTMIN75, the 3-variable model contains DA, DTMIN75, and FOREST, and so on. Which category does each variable belong to: A, B, C, or D? Drainage area from A entered all models. Adding just the SFI, the surface flow index, improves the model a lot.
  • This is for the low streamflow models. It shows the same trends as the flood models except for the A+D combination. A is the worst, and the other combinations are better. For the flood models, A and A+D were exactly the same, but here the A+D combination produced better models. Now let’s look at the model equations and the categories of the explanatory variables. Drainage area from A entered all models.
  • This is a typical hydrograph of a storm event. The sources of streamflow are roughly divided into two flows: base flow (or groundwater discharge) and surface flow. The base flow index is the long-term average percentage of base flow in annual streamflow. Usually streamflow records are needed to compute BFI. But a new approach was taken here: BFI values were computed for more than eight thousand gauged sites and then interpolated across the conterminous US.
  • Both Q100 and Q7,10 can be computed from streamflow records using a log-Pearson type 3 distribution. First, the annual maximum flow for each year is calculated, the natural log is taken, and a histogram is made. It is assumed that this distribution fits a log-Pearson type 3 distribution. Q100 is this point, which represents the 99th percentile of the distribution. The calculation of Q7,10 is similar. First, the annual minimum flow over 7 consecutive days is calculated for each year, the natural log is taken, and the values are plotted in a histogram. Again it is assumed that this distribution fits a log-Pearson type 3 distribution. Q7,10 is this point, which represents the 10th percentile of the distribution.
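The percentile calculation can be sketched with the standard library alone. This is an illustration under stated assumptions, not the procedure used in the study: it fits the log-Pearson type 3 distribution by the method of moments on natural logs and approximates the frequency factor with the Wilson-Hilferty formula, which is reasonable for moderate skew. The function names are hypothetical, and at least three distinct values are required.

```python
import math
from statistics import NormalDist, mean, stdev


def sample_skew(values):
    """Sample skewness coefficient with the usual small-sample correction."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)


def lp3_quantile(flows, non_exceedance_prob):
    """Quantile of a log-Pearson type 3 distribution fitted by moments.

    flows: annual series (annual maxima for floods, annual 7-day minima for low flow)
    non_exceedance_prob: 0.99 for Q100, 0.10 for Q7,10
    """
    logs = [math.log(q) for q in flows]
    m, s, g = mean(logs), stdev(logs), sample_skew(logs)
    z = NormalDist().inv_cdf(non_exceedance_prob)
    if abs(g) < 1e-6:
        k = z  # zero skew reduces to the lognormal case
    else:
        # Wilson-Hilferty approximation of the Pearson type 3 frequency factor
        k = (2.0 / g) * ((1.0 + g * z / 6.0 - g * g / 36.0) ** 3 - 1.0)
    return math.exp(m + k * s)


def annual_min_7day(daily_flows):
    """Smallest 7-consecutive-day average flow within one year of daily data."""
    means = [sum(daily_flows[i:i + 7]) / 7.0 for i in range(len(daily_flows) - 6)]
    return min(means)
```

For Q100, `lp3_quantile` would be called on the annual maxima with probability 0.99; for Q7,10, on the yearly outputs of `annual_min_7day` with probability 0.10.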
  • Now let’s take a look at how the DEM horizontal resolution affects the derived watershed characteristics. Here is an example. These polygons are the watershed boundaries in WS1k, WS85, and WS30. For example, they are overlaid on the 30-year average January precipitation raster, and the grid cells within each polygon are averaged. The differences are very small.
  • Of interest is how the derived watershed characteristics change with the horizontal resolution of both the DEM and the raster datasets. This chart shows the average percent difference in derived watershed characteristics between each pair of the three watershed boundary sets (WS1k, WS85, and WS30). It indicates that with MODIS data, which have the finest resolution, the average difference between WS1k and WS30 is the largest, but it is just 3.1 %. The other cases produce smaller differences. Therefore, there are only small differences in the watershed characteristics, and that leads to the small differences in the performance of the developed models.
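The comparison metric can be sketched as below. The talk does not state the exact formula, so this assumes the absolute difference relative to the first boundary set, expressed in percent and averaged over sites. The function name is hypothetical.

```python
def average_percent_difference(base, other):
    """Mean absolute percent difference between two sets of watershed characteristics.

    base, other: values of one characteristic at the same sites, derived with
    two different watershed boundary sets (for example WS1k and WS30).
    Sites where the base value is zero are skipped to avoid division by zero.
    """
    diffs = [abs(o - b) / abs(b) * 100.0 for b, o in zip(base, other) if b != 0]
    return sum(diffs) / len(diffs)
```

A small value, such as the 3.1 % reported above for MODIS data, means the choice of DEM resolution barely changes the inputs to the regression.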

Transcript

  • 1. Examining the Impact of Raster Datasets on Flood and Low Streamflow Regional Regression Models Using Custom GIS Applications Satoshi Hirabayashi College of Environmental Science and Forestry State University of New York
  • 2. Outline
    • Abstract
    • Introduction
    • Objectives
    • Methods
    • Results
    • Conclusions
    Outline
  • 4. Abstract
    • Title: Examining the Impact of Raster Datasets on Flood and Low Streamflow Regional Regression Models Using Custom GIS Applications
      • Raster Datasets
      • Flood and Low Streamflow Regional Regression Models: Q = f(X1, X2, X3, …)
      • Custom GIS Applications
  • 6. Flood and Low Streamflow Estimation Introduction
    • Estimate/predict frequency and magnitude of flood and low streamflow
    • Common flood and low streamflow statistics
      • 100-year flood: Q100
      • 7-day, 10-year low streamflow: Q7,10
    • Gauged river sites
      • Historic streamflow records are available
      • Typical method is a frequency analysis
    • Ungauged river sites
      • Historic streamflow records are NOT available
      • One method is a regional regression model
  • 7. Regional Regression Model (Introduction)
    • Regional regression models: Q = f(X1, X2, X3, …)
      • Q: Response variable (Q100, Q7,10)
      • X: Explanatory variable (Watershed characteristics)
    • Gauged sites
      • Xs: available
      • Q: available
      • Develop regional regression model
    • Ungauged sites
      • Xs: available
      • Q: NOT available
      • Apply model
  • 8. Regional Regression Model (Introduction)
    • Parameters may be estimated using the ordinary least squares regression procedure
    • The general form of regional regression models:
      • Q100: Flood statistic
      • Q7,10: Low streamflow statistic
    Qd,T: d-day, T-year streamflow statistics; α, β, γ: model parameters to be estimated
    • By taking logarithm of both sides of the equation, a log-linear model is obtained:
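The equations on this slide did not survive in the transcript. A form consistent with the speaker notes (parameters alpha, beta, gamma and a log-linear model after taking logarithms) and with the single-variable Tennessee equation quoted later in the deck would be the following; the exact number of terms and the absence of an error term are assumptions.

```latex
% Assumed general (power-law) form of the regional regression model
Q_{d,T} = \alpha \, X_1^{\beta} \, X_2^{\gamma} \cdots

% Log-linear form obtained by taking the logarithm of both sides
\ln Q_{d,T} = \ln \alpha + \beta \ln X_1 + \gamma \ln X_2 + \cdots
```

In the second form the model is linear in the unknowns ln α, β, and γ, which is why ordinary least squares can be used to estimate them.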
  • 9. Past Studies (Introduction)
    • Flood regional regression models
      • Relatively good performance
      • Standard error 30-60 %
    • Low streamflow regional regression models
      • Traditionally poor performance
      • Standard error > 100 %
    • Potential reasons for poor performance
      • Manually derived watershed characteristics: insufficient data quality.
      • Exclusion of important watershed characteristics: insufficient data quantity.
    Data quality and quantity are focuses of this study.
  • 10. Kroll et al.’s (2004) Study (Introduction)
    • Took countermeasures for data quality and quantity.
    • Streamflow Records
      • Quality: USGS’s Hydro-Climatic Data Network (HCDN)
      • Quantity: 1,545 sites, 44-year record length (average)
    • Watershed Characteristics
      • Quality: Digitally derived
      • Quantity: 1,465 raster datasets
    • Low streamflow regional regression models were improved.
  • 11. Kroll et al.’s Processes (Introduction)
    • Watershed boundary delineation: DEM and Gauging Site give the Watershed Boundary
    • Watershed characteristics derivation: Watershed Boundary and Raster give the Watershed Characteristics (average)
  • 12. Motivation for This Study (Introduction)
    • Enhancement of data quality and quantity
    • Data: horizontal resolution, number of data
      • DEM: 1 km, 1
      • Raster (Climate): 49 km, 1,440
      • Raster (Climate): 4 km, 13
      • Raster (Soil): 1 km, 12
    • Quality of watershed boundary may be enhanced by using DEMs with finer horizontal resolution.
    • Quality and quantity of watershed characteristics may be enhanced by using raster data with finer resolution and more raster data.
  • 13. Motivation for This Study (Introduction)
    • Geographic Information System (GIS) is useful to derive input parameters to hydrologic models
    • But… what if there is a lot of data?
    • Lots of data with manual operation is
      • Tedious
      • Time-consuming
      • Error-prone
      • Not transferable
      • Not reproducible
    • Solution: customize/automate GIS processes.
  • 18. Objectives
    • 1. Develop a custom GIS application
    • 2. Examine impact of horizontal resolution of DEMs
    • 3. Examine impact of new raster datasets
    • 4. Examine the most important watershed characteristics
  • 20. Low Streamflow Regional Regression Models Areas with worst performance (Kroll et al., 2004) Methods
    • Areas with worst performances
      • KS, NE, & OK
      • AL, TN, & KY
      • NC & SC
      • CA
    • May be improved with
      • DEMs with finer resolution
      • more explanatory variables
  • 21. Flood Regional Regression Models Areas with few explanatory variables (Jennings et al. 1994) Methods
    • Only one variable used in models
      • TN, NC, SC, & GA
      • e.g. TN: Q100 = 1,253 DA^0.670
    • May be improved with
      • more explanatory variables
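As a small worked example, the Tennessee equation on this slide can be evaluated directly. The slide does not state the units of drainage area or discharge, so none are assumed here; the function name is hypothetical.

```python
def q100_tennessee(drainage_area):
    """Single-variable flood model from the slide: Q100 = 1,253 * DA^0.670."""
    return 1253.0 * drainage_area ** 0.670


# The exponent is below 1, so Q100 grows more slowly than drainage area:
# doubling DA multiplies Q100 by 2 ** 0.670, not by 2.
ratio = q100_tennessee(200.0) / q100_tennessee(100.0)
```

With drainage area as the only explanatory variable, two watersheds of equal size get the same predicted flood regardless of soil, climate, or topography, which is the gap the additional explanatory variables are meant to fill.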
  • 22. Study Area
    • 36 USGS gauging sites
    • Unregulated streams
    • USGS National Water Information System (NWIS) streamflow records
    • More than 10 years of streamflow records
      • 1905 to 2003
      • record length: 36 years (average)
    Methods
  • 23. Data (Methods)
    • Type: Data Source, Resolution
      • General watershed characteristics: USGS personal acquisition, n/a
      • Soil: USDA State Soil Geographic (STATSGO), 1 km
      • DEM: USGS Hydro1k, 1 km
      • DEM: USGS 1-degree, 85 m
      • DEM: USGS National Elevation Dataset (NED), 30 m
      • Climate: Spatial Climate Analysis Service Parameter-elevation Regressions on Independent Slopes Model (PRISM), 4 km
      • Remote sensed: NASA Moderate Resolution Imaging Spectroradiometer (MODIS), 250 m
      • Hydrology + Hydrogeology: USGS Kansas Water Science Center, 5 km/1 km
  • 24. Overview of GIS processes (Methods)
    • [Flow diagram with three stages: watershed boundary delineation, pre-process, spatial statistic calculation]
    • Toolsets: Arc Hydro, STATSGO toolset, Raster toolset, Table toolset, Spatial statistic toolset
    • Inputs: gauging site, DEM, STATSGO polygon and tables, PRISM raster, MODIS raster, hydro raster, raster list, table list
    • Outputs: watershed boundaries, rasters, table, log
  • 25. Watershed Boundary Delineation (Methods)
    • Input
      • Gauging site
      • DEM
        • Hydro1k (1 km)
        • 1-degree (85 m)
        • NED (30 m)
    • Arc Hydro toolset
      • Fill sinks in DEM
      • Determine flow direction from DEM
      • Delineate watershed boundaries
    • Output: 3 watershed boundary sets
      • WS1k
      • WS85
      • WS30
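The flow direction and delineation steps can be illustrated with a small sketch. It assumes a D8-style rule (each cell drains to its steepest downslope neighbour), which is a common choice but is not stated on the slide, and it skips sink filling entirely. The function names and the toy DEM are hypothetical.

```python
import math

# Offsets (row, col) of the eight neighbouring cells.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem, cell_size=1.0):
    """Return, per cell, the offset of the steepest downslope neighbour.

    Cells with no lower neighbour (pits or flats) get None. No sink filling
    is done here; a real workflow would fill sinks first.
    """
    rows, cols = len(dem), len(dem[0])
    directions = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    distance = cell_size * math.hypot(dr, dc)
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        directions[r][c] = (dr, dc)
    return directions


def delineate_watershed(directions, outlet):
    """Collect every cell whose flow path reaches the outlet cell."""
    rows, cols = len(directions), len(directions[0])
    watershed = {outlet}
    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                d = directions[r][c]
                if (r, c) in watershed or d is None:
                    continue
                if (r + d[0], c + d[1]) in watershed:
                    watershed.add((r, c))
                    changed = True
    return watershed


if __name__ == "__main__":
    toy_dem = [[9, 8, 9],
               [8, 5, 8],
               [7, 2, 7]]
    flow = d8_flow_direction(toy_dem)
    print(sorted(delineate_watershed(flow, outlet=(2, 1))))
```

Running the same two functions on DEMs of different cell size is the kind of comparison behind the WS1k, WS85, and WS30 boundary sets.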
  • 26. Overview of GIS processes (Methods): flow diagram repeated from slide 24
  • 27. Pre-process (Soil: STATSGO) (Methods)
    • Input: for each state
      • 1 polygon (soil unit polygon)
      • 2 attribute tables (soil layer table, soil component table)
    • STATSGO toolset
      • Weight soil data by percentage
      • Weight soil data by thickness
      • Join soil data to polygons
      • Create state raster data
      • Merge state raster data
    • Output: 12 raster data, max. and min. of
      • Available water capacity
      • Moist bulk density
      • Organic matter content
      • Permeability
      • Depth to bedrock
      • Depth to water table
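A minimal sketch of the two weighting steps, assuming a soil unit is made of components (each with an areal percentage) and each component is made of layers (each with a thickness). The order of the two averages and all names below are assumptions; the slide only lists "weight by percentage" and "weight by thickness".

```python
def thickness_weighted(layers):
    """Average a soil property over layers, weighted by layer thickness.

    layers: list of (thickness, value) pairs for one soil component.
    """
    total_thickness = sum(thickness for thickness, _ in layers)
    return sum(thickness * value for thickness, value in layers) / total_thickness


def percent_weighted(components):
    """Average over components, weighted by each component's areal percentage.

    components: list of (percent, layers) pairs for one soil unit polygon.
    """
    total_percent = sum(percent for percent, _ in components)
    weighted_sum = sum(percent * thickness_weighted(layers)
                       for percent, layers in components)
    return weighted_sum / total_percent


if __name__ == "__main__":
    # Hypothetical soil unit: two components, values in arbitrary units.
    soil_unit = [
        (60, [(10, 2.0), (30, 4.0)]),  # 60 % of the unit, two layers
        (40, [(20, 1.0)]),             # 40 % of the unit, one layer
    ]
    print(percent_weighted(soil_unit))
```

The resulting single value per soil unit is what would then be joined to the polygon and rasterized.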
  • 28. Overview of GIS processes (Methods): flow diagram repeated from slide 24
  • 29. Pre-process (Climate: PRISM) (Methods)
    • Input: PRISM raster (ArcInfo ASCII GRID raster), 1,453 raster data
      • 30-year average monthly precipitation
      • 30-year average annual precipitation
      • 40-year monthly precipitation
      • 40-year monthly maximum temperature
      • 40-year monthly minimum temperature
    • Input: raster list file
      • Path to raster data
      • 1,453 raster data
    • Raster toolset: batch process
      • Read raster list file
      • Retrieve data
      • Convert format
      • Convert projection
      • Clip for study area
    • Output: raster
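The batch pattern above (read a list file, then run the same chain of steps on every raster) can be sketched as follows. The original toolset was built with ArcObjects and VBA, so this Python version is only an illustration; the list-file format (one path per line, `#` for comments) and both function names are assumptions, and the actual GIS steps are left as caller-supplied functions.

```python
from pathlib import Path


def read_raster_list(list_file):
    """Return the raster paths in a list file, skipping blanks and comments."""
    lines = Path(list_file).read_text().splitlines()
    return [line.strip() for line in lines
            if line.strip() and not line.lstrip().startswith("#")]


def run_batch(raster_paths, steps):
    """Apply each step, in order, to every raster and keep a simple log.

    steps: list of callables (e.g. convert format, convert projection, clip).
    Each step takes the previous step's result and returns its own.
    A failure on one raster is logged and the batch continues.
    """
    log = []
    for path in raster_paths:
        try:
            item = path
            for step in steps:
                item = step(item)
            log.append((path, "ok"))
        except Exception as exc:  # keep going so one bad file does not stop the batch
            log.append((path, f"failed: {exc}"))
    return log
```

Keeping the steps as plain functions means the same driver could serve the PRISM, hydrology, and MODIS batches, with only the step list changing.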
  • 30. Pre-process (Hydrology and Hydrogeology) (Methods)
    • Input: hydrology/hydrogeology raster, 4 raster data
      • Base flow index
      • Groundwater recharge
      • Infiltration excess overland flow
      • Saturation excess overland flow
    • Input: raster list file
      • Path to raster data
      • 4 raster data
    • Raster toolset: batch process
      • Read raster list file
      • Retrieve data
      • Convert projection
      • Clip for study area
    • Output: hydrology/hydrogeology raster
  • 31. Pre-process (Remote sensed: MODIS) (Methods)
    • Input: MODIS raster, 6 raster data
    • calibrated radiance on
      • 03/06/2000 Wet season
      • 09/06/2000
      • 10/03/2000
      • 10/10/2000 Dry season
      • 10/12/2000
      • 10/23/2000
    • Input: raster list file
      • Path to raster data
      • 6 raster data
    • Raster toolset: batch process
      • Read raster list file
      • Retrieve data
      • Convert projection
      • Clip for study area
    • Output: MODIS raster
  • 32. Overview of GIS processes (Methods): flow diagram repeated from slide 24
  • 33. Pre-process (Watershed Characteristic Table) (Methods)
    • Input
      • Gauging site
      • Table list file
        • Path to table
    • Table toolset: batch process
      • Read table list file
      • Create initial table
      • Assign site ID
    • Output: watershed characteristic table
  • 34. Overview of GIS processes (Methods): flow diagram repeated from slide 24
  • 35. Spatial Statistic Calculation (Methods)
    • Input
      • Watershed boundary
      • Raster list file
        • Path to raster data
        • Statistic type to be computed
    • Spatial statistic toolset: batch process
      • Read raster list file
      • Retrieve data
      • Overlay watershed boundaries
      • Calculate statistics
      • Save in a table
    • Output: watershed characteristic table
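The overlay-and-calculate step is a zonal statistic: only raster cells inside a watershed contribute to the value stored in the table. A minimal sketch, assuming the watershed has already been rasterized to a boolean mask on the same grid as the data raster; the function name, the supported statistic types, and the nodata handling are all assumptions.

```python
from statistics import mean


def zonal_statistic(raster, mask, statistic="mean", nodata=None):
    """Compute one statistic over the raster cells that fall inside the mask.

    raster: 2-D list of cell values.
    mask:   2-D list of booleans, True where the cell lies in the watershed.
    """
    values = [value
              for raster_row, mask_row in zip(raster, mask)
              for value, inside in zip(raster_row, mask_row)
              if inside and value != nodata]
    if not values:
        raise ValueError("no valid cells inside the watershed")
    functions = {"mean": mean, "min": min, "max": max}
    return functions[statistic](values)


if __name__ == "__main__":
    precipitation = [[1, 2, 3],
                     [4, 5, 6],
                     [7, 8, -9999]]
    watershed = [[False, True, True],
                 [False, True, True],
                 [False, False, True]]
    print(zonal_statistic(precipitation, watershed, "mean", nodata=-9999))
```

Repeating this for every raster in the list and every watershed yields one row per gauging site in the watershed characteristic table.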
  • 36. Watershed Characteristics Tables (Methods)
    • Developed for three watershed boundary sets (WS1k, WS85, and WS30)
    • One watershed characteristic table per boundary set
  • 37. Watershed Characteristics Post-process Methods
    • Added 4 topography variables computed from DEM
      • Drainage area
      • 2 Slopes
      • Gauging site elevation
    • Added 1 hydrology variable computed from base flow index (BFI)
      • Surface flow index (SFI) = 100 – BFI
    • Summarized 40-year monthly precipitation, max. and min. temperature
      • Averaged in 4 seasonal periods
      • 9 different percentiles
      • 480 variables summarized to 36 variables
  • 38. Watershed Characteristics Categorization (Methods)
    • Watershed characteristic table split into four categories
      • A: USGS traditional watershed characteristics (34)
      • B: Topography (4), Soil (12), Climate (121)
      • C: Hydrology (3) + Hydrogeology (2)
      • D: Remote sensed (6)
  • 39. Potential Explanatory Variable Combination (Methods)
    • Category
      • A. USGS traditional watershed characteristics
      • B. Topography + Soil + Climate
      • C. Hydrology + Hydrogeology
      • D. Remote sensed
    • Potential explanatory variable combination: # of potential explanatory variables
      • A: 34
      • A + B: 171
      • A + C: 39
      • A + D: 40
      • ALL: 182
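The combination sizes follow directly from the category counts on slide 38, which the short check below reproduces. The dictionary and function names are hypothetical.

```python
# Variables per category, from the categorization slide.
CATEGORY_COUNTS = {
    "A": 34,            # USGS traditional watershed characteristics
    "B": 4 + 12 + 121,  # topography + soil + climate
    "C": 3 + 2,         # hydrology + hydrogeology
    "D": 6,             # remote sensed
}


def combination_size(*categories):
    """Number of potential explanatory variables in a category combination."""
    return sum(CATEGORY_COUNTS[name] for name in categories)


if __name__ == "__main__":
    for combo in (("A",), ("A", "B"), ("A", "C"), ("A", "D"), ("A", "B", "C", "D")):
        print(" + ".join(combo), "=", combination_size(*combo))
```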
  • 40. Model Construction (Methods)
    • Flood and low streamflow regional regression models developed for each combination of
      • Watershed boundary: WS1k, WS85, WS30
      • Potential explanatory variable combination: A (34), A + B (171), A + C (39), A + D (40), ALL (182)
      • # of explanatory variables
        • 2: Q = f(X1, X2)
        • 3: Q = f(X1, X2, X3)
        • 4: Q = f(X1, X2, X3, X4)
    • Criteria
      • Stepwise regression
      • 5 % significance level
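A simplified sketch of the variable-selection idea. It assumes the models are log-linear (ln Q regressed on the logs of the explanatory variables), and it uses greedy forward selection ranked by residual sum of squares. This is not the stepwise procedure with a 5 % significance test named on the slide: there is no entry or removal test here, only a fixed number of variables. All names and the synthetic data are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def fit_log_linear(flows, predictors):
    """Least-squares fit of ln Q = b0 + sum(b_i * ln X_i).

    predictors: list of columns, one per explanatory variable.
    Returns (coefficients, residual sum of squares in log space).
    """
    y = [math.log(q) for q in flows]
    design = [[1.0] + [math.log(col[i]) for col in predictors]
              for i in range(len(y))]
    p = len(design[0])
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve_linear_system(xtx, xty)
    rss = sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
              for row, yi in zip(design, y))
    return beta, rss


def forward_select(flows, candidates, max_variables):
    """Greedily add the candidate that most reduces the residual sum of squares.

    candidates: dict mapping variable name -> column of values.
    """
    chosen = []
    while len(chosen) < max_variables:
        remaining = [name for name in candidates if name not in chosen]
        if not remaining:
            break
        best = min(remaining,
                   key=lambda name: fit_log_linear(
                       flows, [candidates[c] for c in chosen + [name]])[1])
        chosen.append(best)
    return chosen
```

Calling `forward_select(flows, candidates, 2)`, then with 3 and 4, mirrors the 2-, 3-, and 4-variable model sizes listed above.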
  • 41. Outline
    • Abstract
    • Introduction
    • Objectives
    • Methods
    • Results
    • Conclusions
  • 42. Developed Models: Flood regional regression models (Results)
    • [Figure: standard error (%), axis 23 to 37, by explanatory variable combination (A, A+B, A+C, A+D, ALL) for WS1k, WS85, and WS30; series for 2, 3, and 4 variables; annotated for DEM comparison and explanatory variable comparison]
  • 43. Developed Models: Low streamflow regional regression models (Results)
    • [Figure: standard error (%), axis 20 to 110, by explanatory variable combination (A, A+B, A+C, A+D, ALL) for WS1k, WS85, and WS30; series for 2, 3, and 4 variables; annotated for DEM comparison and explanatory variable comparison]
  • 44. DEM Impact: Flood regional regression models (with the ALL potential explanatory variable combination) (Results)
    • [Figure: standard error (%), axis 23 to 31, by watershed boundary (WS1k, WS85, WS30); series for 2, 3, and 4 variables; annotated "Almost same performance"]
    • Horizontal resolution of DEMs has almost no impact on performance of flood regional regression models
  • 45. DEM Impact: Low streamflow regional regression models (with the ALL potential explanatory variable combination) (Results)
    • [Figure: standard error (%), axis 20 to 60, by watershed boundary (WS1k, WS85, WS30); series for 2, 3, and 4 variables; annotated "Almost same performance"]
    • Horizontal resolution of DEMs has almost no impact on performance of low streamflow regional regression models
  • 46. Delineated Watershed Boundary: WS1k (Results)
    • Flint River near Chase, AL; site number 03575000; drainage area 342 mi²
  • 47. Delineated Watershed Boundary: WS1k & WS85 (Results)
    • Flint River near Chase, AL; site number 03575000; drainage area 342 mi²
  • 48. Delineated Watershed Boundary: WS1k & WS85 & WS30 (Results)
    • Flint River near Chase, AL; site number 03575000; drainage area 342 mi²
    • Delineated watershed shapes are similar, with differences occurring only at the boundaries
  • 49. Variable Investigation: Flood regional regression models (with the WS1k) (Results)
    • A+C
      • $Q_{100} = 0.91\,DA^{0.88}\,SFI^{1.06}$
    • ALL
      • $Q_{100} = -3.48\,DA^{0.91}\,SFI^{1.39}\,PMAR^{1.08}$
    • Drainage area and surface flow index are the most important explanatory variables for flood regional regression models
    • [Figure: standard error (%), axis 23 to 37, by explanatory variable combination (A, A+B, A+C, A+D, ALL); series for 2, 3, and 4 variables; "Best performance" marked twice]
  • 50. Variable Investigation: Low streamflow regional regression models (with the WS1k) (Results)
    • A+C
      • $Q_{7,10} = -26.30\,DA^{0.96}\,BFI^{3.45}\,THICK^{2.76}$
    • ALL
      • $Q_{7,10} = -27.16\,DA^{1.02}\,BFI^{3.06}\,RDL^{2.72}\,AWCH^{-1.58}$
    • A+D
      • $Q_{7,10} = -6.21\,DA^{0.92}\,M0906^{0.98}\,HYGRP^{-3.99}$
    • Drainage area and base flow index are the most important explanatory variables for low streamflow regional regression models
    • [Figure: standard error (%), axis 20 to 120, by explanatory variable combination (A, A+B, A+C, A+D, ALL); series for 2, 3, and 4 variables; "Best performance" marked twice]
  • 51. Important Explanatory Variables: Base flow index (BFI) & surface flow index (SFI) (Results)
    • Sources of streamflow
      • Base flow
      • Surface flow
    • Base flow index (BFI)
      • long-term average percentage of base flow in annual streamflow
      • BFI was calculated for 8,249 gauged sites
      • Interpolated for conterminous U.S.
    • Surface flow index (SFI)
      • long-term average percentage of surface flow in annual streamflow
      • SFI = 100 - BFI
    • [Figure: hydrograph of discharge against time, split into base flow and surface flow]
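A small worked example of the two indices. It assumes the base flow component has already been separated from total streamflow, and it computes the index as the ratio of long-term totals, which is one reading of "long-term average percentage"; averaging yearly ratios would be another. Function names and the sample values are hypothetical.

```python
def base_flow_index(base_flow, total_flow):
    """Percentage of total streamflow volume that is base flow.

    base_flow, total_flow: equal-length series (e.g. annual volumes).
    """
    if len(base_flow) != len(total_flow):
        raise ValueError("series must have the same length")
    return 100.0 * sum(base_flow) / sum(total_flow)


def surface_flow_index(bfi):
    """SFI = 100 - BFI, as defined on the slide."""
    return 100.0 - bfi


if __name__ == "__main__":
    base = [2.0, 3.0, 5.0]     # hypothetical base flow volumes
    total = [4.0, 6.0, 10.0]   # hypothetical total streamflow volumes
    bfi = base_flow_index(base, total)
    print(bfi, surface_flow_index(bfi))
```

Because the two indices sum to 100, only one of them carries independent information; the flood models use SFI and the low streamflow models use BFI.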
  • 52. Outline
    • Abstract
    • Introduction
    • Objectives
    • Methods
    • Results
    • Conclusions
  • 53-54. Conclusions
    • Objective 1. Develop a custom GIS application
      • Custom GIS application was developed with ArcObjects and VBA
      • Approximately 1,500 raster datasets were efficiently processed
    • Objective 2. Examine impact of horizontal resolution of DEMs
      • No significant impacts on the model performance
  • 55-56. Conclusions
    • Objective 3. Examine impact of new raster datasets
      • Models were improved with the inclusion of newly derived watershed characteristics
    • Objective 4. Examine the most important watershed characteristics
      • The inclusion of hydrologic variables, for flood models, and hydrogeologic variables, for low streamflow models, greatly improved the models.
  • 57. References
    • Jennings, M.E., W.O. Thomas, Jr., and H.C. Riggs (1994), Nationwide Summary of U.S. Geological Survey Regional Regression Equations for Estimating Magnitude and Frequency of Floods for Ungaged Sites, 1993, U.S. Geological Survey Water-Resources Investigations Report 94-4002, Reston, VA.
    • Kroll, C.N., J.G. Luz, T.B. Allen, and R.M. Vogel (2004), Developing a watershed characteristics database to improve low streamflow prediction, Journal of Hydrologic Engineering, March/April 2004, 116-125.
  • 58. Acknowledgement
    • Advisor
      • Chuck N. Kroll
    • Committee
      • Lee P. Herrington
      • Lindi J. Quackenbush
      • Yongwei Sheng
    • Colleagues
      • Adão Mantose
      • Zhenxing Zhang
      • Emiko Ochiai
  • 59. Outline END
  • 60. Q 100 and Q 7,10 Calculation (Methods)
    • Log-Pearson III distribution
    • 100-year flood ($Q_{100}$)
      • [Figure: frequency histogram of ln(annual maximum streamflow), with $Q_{100}$ marked at the 99% point (8.86)]
    • 7-day, 10-year low streamflow ($Q_{7,10}$)
      • [Figure: frequency histogram of ln(annual 7-day minimum streamflow), with $Q_{7,10}$ marked at the 10% point]
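A hedged sketch of a Log-Pearson III quantile estimate at a gauged site. It fits by moments of the natural logs of the flows and approximates the frequency factor with the Wilson-Hilferty formula; both choices are assumptions, since the slide names only the distribution. The function name and sample data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def log_pearson3_quantile(flows, non_exceedance):
    """Estimate a flow quantile from a Log-Pearson III fit (method of moments).

    flows:          annual series (e.g. annual maxima or annual 7-day minima).
    non_exceedance: probability that the flow is not exceeded in a year,
                    e.g. 0.99 for Q100 or 0.10 for Q7,10.
    """
    logs = [math.log(q) for q in flows]
    n = len(logs)
    m, s = mean(logs), stdev(logs)
    # Sample skew coefficient of the log flows.
    g = n * sum((x - m) ** 3 for x in logs) / ((n - 1) * (n - 2) * s ** 3)
    z = NormalDist().inv_cdf(non_exceedance)
    if abs(g) < 1e-6:
        k = z  # zero skew: reduces to the log-normal case
    else:
        # Wilson-Hilferty approximation of the Pearson III frequency factor.
        k = (2.0 / g) * ((1.0 + g * z / 6.0 - g * g / 36.0) ** 3 - 1.0)
    return math.exp(m + k * s)


if __name__ == "__main__":
    # Hypothetical annual maximum flows, not data from the study.
    annual_maxima = [1200.0, 1850.0, 2400.0, 3100.0, 2750.0, 1600.0, 4200.0, 2050.0]
    print(f"Q100 estimate: {log_pearson3_quantile(annual_maxima, 0.99):,.0f}")
```

The same function covers both statistics: pass annual maxima with 0.99 for the 100-year flood, or annual 7-day minima with 0.10 for the 7-day, 10-year low flow.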
  • 61. Watershed Characteristics Post-process Methods
    • Added 4 topography variables computed from DEM
      • Drainage area
      • 2 Slopes
      • Gauging site elevation
    • Added 1 hydrology variable computed from base flow index (BFI)
      • Surface flow index (SFI) = 100 – BFI
    • Summarized 40-year monthly precipitation, max. and min. temperature
      • 40-year monthly series, Jan to Dec: 480 variables
      • Averaged in 4 seasonal periods: Jun to Aug, Sep to Nov, Dec to Mar, Apr to May
      • 1st, 5th, 10th, 25th, 50th, 75th, 90th, 95th, and 99th percentiles of each 40-year seasonal series: 36 variables
      • [Figure: frequency distribution of average climate value]
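The reduction from 480 monthly values to 36 variables (4 seasons times 9 percentiles) can be sketched as below. Two simplifications are assumed: the Dec to Mar season uses the December of the same calendar year rather than the preceding one, and percentiles use linear interpolation. Names are hypothetical.

```python
from statistics import mean, quantiles

# Month numbers (1 = Jan) in each seasonal period from the slide.
SEASONS = {
    "JunAug": (6, 7, 8),
    "SepNov": (9, 10, 11),
    "DecMar": (12, 1, 2, 3),
    "AprMay": (4, 5),
}
PERCENTILES = (1, 5, 10, 25, 50, 75, 90, 95, 99)


def summarize_climate(monthly):
    """Reduce a multi-year monthly series to seasonal percentile variables.

    monthly: list of years, each a list of 12 monthly values (Jan..Dec).
    Returns a dict with one entry per (season, percentile) pair.
    """
    summary = {}
    for season, months in SEASONS.items():
        # One seasonal average per year.
        yearly_means = [mean(year[m - 1] for m in months) for year in monthly]
        # 99 cut points; index p - 1 is the p-th percentile.
        cuts = quantiles(yearly_means, n=100, method="inclusive")
        for p in PERCENTILES:
            summary[f"{season}_P{p}"] = cuts[p - 1]
    return summary


if __name__ == "__main__":
    # Synthetic 40-year record: value = year index + month index.
    record = [[float(y + m) for m in range(12)] for y in range(40)]
    variables = summarize_climate(record)
    print(len(variables), variables["JunAug_P50"])
```

Applied separately to precipitation, maximum temperature, and minimum temperature, this gives 36 variables for each climate series.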
  • 62. DEM Impact on Derived Watershed Characteristics (Results)
    • Example: 30-year average January precipitation
      • Watershed averages of 136.3 mm, 136.5 mm, and 136.8 mm (WS1k, WS85, WS30)
    • Differences among watershed characteristics are very small
  • 63. Watershed Characteristic Difference (Results)
    • [Figure: average difference of derived watershed characteristic (%), axis 0.0 to 3.5, by horizontal resolution of raster data (4 km PRISM, 1 km STATSGO, 250 m MODIS); series WS85-WS30, WS1k-WS85, WS1k-WS30]
    • Small differences in watershed characteristics
    • Small differences in the performance of developed models
  • 64. Variable Investigation: Flood Regional Regression Models (with the WS1k) (Results)
    • A, A+D
      • $Q_{100} = 2.92\,DA^{0.96}\,CLAY^{0.59}$
    • A+B
      • $Q_{100} = -20.84\,DA^{0.95}\,DTMIN75^{4.72}\,FOREST^{0.59}\,THICK^{1.20}$
    • A+C
      • $Q_{100} = 0.91\,DA^{0.88}\,SFI^{1.06}$
    • ALL
      • $Q_{100} = -3.48\,DA^{0.91}\,SFI^{1.39}\,PMAR^{1.08}$
    • Drainage area and surface flow index are the most important explanatory variables for flood regional regression models
    • [Figure: standard error (%), axis 23 to 37, by explanatory variable combination (A, A+B, A+C, A+D, ALL); series for 2, 3, and 4 variables]
    • Categories: A. USGS traditional watershed characteristics; B. Topography + Soil + Climate; C. Hydrology + Hydrogeology; D. Remote sensed
  • 65. Variable Investigation: Low Streamflow Regional Regression Models (with the WS1k) (Results)
    • A
      • $Q_{7,10} = -29.60\,DA^{0.91}\,ELEV10^{1.91}\,THICK^{3.53}\,CROP^{0.20}$
    • A+B
      • $Q_{7,10} = -4.35\,DA^{0.88}\,BTMIN99^{-20.79}\,CTMAX99^{21.44}\,WDH^{-5.53}$
    • A+C
      • $Q_{7,10} = -26.30\,DA^{0.96}\,BFI^{3.45}\,THICK^{2.76}$
    • A+D
      • $Q_{7,10} = -6.21\,DA^{0.92}\,M0906^{0.98}\,HYGRP^{-3.99}$
    • ALL
      • $Q_{7,10} = -27.16\,DA^{1.02}\,BFI^{3.06}\,RDL^{2.72}\,AWCH^{-1.58}$
    • Drainage area and base flow index are the most important explanatory variables for low streamflow regional regression models
    • [Figure: standard error (%), axis 20 to 120, by explanatory variable combination (A, A+B, A+C, A+D, ALL); series for 2, 3, and 4 variables]
    • Categories: A. USGS traditional watershed characteristics; B. Topography + Soil + Climate; C. Hydrology + Hydrogeology; D. Remote sensed