LAND COVER, POPULATION ESTIMATES, AND STATE BOUNDARIES: A COMPARISON OF UNCERTAINTY AMONG GIS DATASETS

Transcript

  • 1. LAND COVER, POPULATION ESTIMATES, AND STATE BOUNDARIES: A COMPARISON OF UNCERTAINTY AMONG DATASETS Kim Boggio March 2014
  • 2. GIS offers various techniques for visualizing land cover, population estimates, and state boundaries. The underlying data comes from different sources, and each dataset has its advantages and disadvantages. This document evaluates the uncertainty associated with the datasets used to present land cover, population, and state boundaries in GIS. The study is an outgrowth of work done for the Open Space Institute (OSI) to identify forested lands adjacent to urban and suburban areas that would be appropriate for acquisition.

    Some background on the Open Space Institute:
    • The Open Space Institute (OSI) protects scenic, natural, and historic landscapes to ensure public enjoyment, conserve habitats, and sustain community character.
    • OSI achieves its goals through land acquisition, conservation easements, regional loan programs, fiscal sponsorship, creative partnerships, and analytical research. Through its New York land program, OSI has protected more than 100,000 acres in the State of New York by direct acquisition and conservation easements.
    • Through the Conservation Finance Program, which provides low-cost bridge loans, OSI has assisted in the protection of an additional 1.6 million acres across the East Coast.
    • The Research Program influences land use policy and practice through research, communication, and training.

    METHODS
    The task in the OSI project was to identify land adjacent to urban and suburban areas for open space acquisition. The solution was to use land cover, impervious cover, federal lands, wilderness areas, and LANDSCAN datasets to identify those areas. Working with the various raster and vector data made it obvious that the datasets differ widely in scale, coordinate system, data classification, and underlying attribute tables. These differences introduce uncertainty that depends on the dataset used. The data used in the OSI project is analyzed below to see how closely the raster and vector layers align.
  • 3. The land cover rasters used in the study were the 30m grid from the National Land Cover Database (NLCD) and the 200m land cover grid from National Atlas (see appendix A for the respective websites). The population raster layers were the 30m Impervious layer from NLCD and LANDSCAN (0.00833 × 0.00833 decimal degree cells, approximately 1010m × 488m) from Oak Ridge National Laboratory. State boundary shapefiles were downloaded from National Atlas, Tiger Data (US Census Bureau), and ESRI.

    Land cover data was compared for similarities in classification and for visual clarity at different map scales over an equal extent. Land cover was also compared using 500 random points in a relatively small area at the head of the Chesapeake Bay. Population data was analyzed for classification methods and for visual clarity at different map scales over an equal extent. Finally, the state boundary shapefiles were examined for how closely they follow the Delmarva Peninsula coastline, which reveals how closely the boundary files match each other.

    LANDCOVER RESULTS
    The NLCD classifications used to analyze land cover are shown in Figure 1.
    Figure 1: NLCD Land cover classifications used to analyze the data
  • 4. LAND COVER AREA BY CLASS ANALYSIS
    Table 1 shows the land classes, the cell counts, and the area by class for 30m and 200m NLCD data in National Land Cover boundary zone 13. Cell counts are much lower for the 200m data than for the 30m data, as expected: a 200m cell covers 4 ha, while a 30m cell covers 0.09 ha. Table 1 also makes clear that the 30m and 200m datasets represent the land cover classes differently. For instance, class 43 (mixed forest) covers 164,000,000 m² (16,400 ha) in the 30m data and 1,940,280,000 m² (194,028 ha) in the 200m data. There are statistically significant differences between 30m and 200m data for urban, forested, and herbaceous wetland cover classes, as shown in Tables 2 and 3.

    Table 1: 200m & 30m Land cover area analysis
    CLASS | DESCRIPTION | 30m COUNT | 30m AREA (m²) | 200m COUNT | 200m AREA (m²) | AREA DIFFERENCE %
    11 | OPEN WATER | 1,014,628 | 865,176,000 | 24,611 | 984,440,000 | 13.8%
    21 | LOW INTENSITY RESIDENTIAL | 2,043,829 | 1,742,780,000 | 49,179 | 1,967,160,000 | 12.9%
    22 | HIGH INTENSITY RESIDENTIAL | 1,859,436 | 1,585,550,000 | 10,389 | 415,560,000 | 73.8%
    23 | COMMERCIAL/ INDUSTRIAL/ TRANSPORTATION | 921,654 | 785,897,000 | 15,954 | 638,160,000 | 18.8%
    24 | HIGH INTENSITY URBAN | 431,715 | 368,124,000 | - | - | N/A
    31 | BARE ROCK/ SAND/ CLAY | 349,813 | 298,286,000 | - | - | N/A
    32 | QUARRIES/ STRIP MINES | - | - | 2,455 | 98,200,000 | N/A
    33 | TRANSITIONAL | - | - | 1,890 | 75,600,000 | N/A
    41 | DECIDUOUS FOREST | 7,470,356 | 6,369,990,000 | 160,357 | 6,414,280,000 | 0.7%
    42 | EVERGREEN FOREST | 1,235,803 | 1,053,770,000 | 24,979 | 999,160,000 | 5.2%
    43 | MIXED FOREST | 192,967 | 164,543,000 | 48,507 | 1,940,280,000 | 1079.2%
    51 | SHRUBLAND | - | - | - | - | 100.0%
    61 | ORCHARDS/ VINEYARDS | - | - | - | - | N/A
    71 | GRASSLAND/ HERBACEOUS | - | - | - | - | N/A
    81 | PASTURE/ HAY | 6,504,092 | 5,546,050,000 | 186,453 | 7,458,120,000 | 34.5%
    82 | ROW CROPS | 5,075,837 | 4,328,180,000 | 59,044 | 2,361,760,000 | 45.4%
    83 | SMALL GRAINS | - | - | - | - | N/A
    85 | URBAN/ RECREATIONAL GRASSES | - | - | 4,666 | 186,640,000 | N/A
    90 | EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS | 1,343,873 | 1,145,920,000 | - | - | N/A
    91 | WOODY WETLANDS | - | - | 18,548 | 741,920,000 | N/A
    92 | EMERGENT HERBACEOUS WETLANDS | - | - | 9,062 | 362,480,000 | N/A
    95 | EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS | 455,573 | 388,468,000 | - | - | N/A
    TOTALS | | 28,899,576 | 24,642,734,000 | 616,094 | 24,643,760,000 | 0.0%

    Table 2: General land cover class totals
  • 5. LAND CLASS | 30m CELL COUNT | 30m AREA (m²) | 200m CELL COUNT | 200m AREA (m²) | % DIFFERENCE
    FOREST LAND TOTAL | 8,899,126 | 7,588,303,000 | 233,843 | 9,353,720,000 | 23.3%
    RESIDENTIAL TOTAL | 5,256,634 | 4,482,351,000 | 75,522 | 3,020,880,000 | 32.6%
    FARMLAND TOTAL | 11,579,929 | 9,874,230,000 | 245,497 | 9,819,880,000 | 0.6%
    EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS | 1,799,446 | 1,534,388,000 | 27,610 | 1,104,400,000 | 28.0%

    Table 3: General land cover class statistical analysis
    Hypotheses: H0: φ1 = φ2 vs. HA: φ1 ≠ φ2
    Test statistic: Z = (p1 − φ2) / √(φ1(1 − φ)/n)
    LAND CLASS | p1 | φ2 | SE | p1 − φ2 | Z | SIGNIFICANCE
    FOREST LAND (CLASSES 41 – 43) | 0.308 | 0.380 | 0.017 | -0.072 | -4.26 | P < .01
    RESIDENTIAL (CLASSES 21 – 24) | 0.182 | 0.123 | 0.008 | 0.059 | 7.72 | P < .01
    FARMLAND (CLASSES 61 – 83) | 0.401 | 0.398 | 0.017 | 0.002 | 0.13 | N.S.
    EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS (CLASSES 90 – 95) | 0.062 | 0.045 | 0.003 | 0.017 | 5.71 | P < .01

    LAND COVER RANDOM POINT ANALYSIS
    The random point analysis was performed on a relatively small area at the head of the Chesapeake Bay. ArcMap generated 500 random points, and the land cover class was read at the same geographic coordinates in both the 30m and 200m land cover, as depicted in Figure 2 below. The results were similar to the NLCD area by class analysis: there are statistically significant differences between 30m and 200m data for urban, forested, and herbaceous wetland cover classes, as shown in Tables 5 and 6. The random point analysis provides an accurate cell-to-cell comparison of the two resolutions of NLCD land cover data.
    Figure 2: 200M & 30M land cover with random points
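The area and percent-difference columns of Tables 1 and 2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the percent difference is taken relative to the 30m area, which is the convention that reproduces the figures quoted in Table 1. The 30m areas are used as reported in the table rather than recomputed from cell counts.

```python
# Minimal sketch of the area arithmetic behind Tables 1 and 2.
# Assumption: percent difference = |200m area - 30m area| / 30m area.
# Function names are hypothetical and not part of any GIS package.

def cell_area_m2(cell_size_m):
    """Area of one square raster cell in square metres."""
    return cell_size_m ** 2


def cell_area_ha(cell_size_m):
    """Area of one square raster cell in hectares (1 ha = 10,000 m²)."""
    return cell_area_m2(cell_size_m) / 10_000


def area_from_count(cell_count, cell_size_m):
    """Total area in m² covered by cell_count cells."""
    return cell_count * cell_area_m2(cell_size_m)


def percent_difference(area_30m, area_200m):
    """Absolute difference between the two areas as a percentage of the 30m area."""
    return abs(area_200m - area_30m) / area_30m * 100


if __name__ == "__main__":
    # Open water (class 11) and mixed forest (class 43), values from Table 1.
    print(cell_area_ha(200), cell_area_ha(30))
    print(area_from_count(24_611, 200))
    print(round(percent_difference(865_176_000, 984_440_000), 1))
    print(round(percent_difference(164_543_000, 1_940_280_000), 1))
```

Keeping the denominator explicit matters here, because the same absolute gap gives a very different percentage depending on which resolution is treated as the baseline.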
  • 6. Table 4: 200M & 30M random point NLCD classification analysis
    CLASS | DESCRIPTION | POINT COUNT 30m | POINT COUNT 200m
    11 | OPEN WATER | 12 | 12
    21 | LOW INTENSITY RESIDENTIAL | 31 | 48
    22 | HIGH INTENSITY RESIDENTIAL | 38 | 8
    23 | COMMERCIAL/ INDUSTRIAL/ TRANSPORTATION | 19 | 8
    24 | HIGH INTENSITY URBAN | 2 | -
    31 | BARE ROCK/ SAND/ CLAY | 4 | -
    32 | QUARRIES/ STRIP MINES | - | 3
    33 | TRANSITIONAL | - | 1
    41 | DECIDUOUS FOREST | 127 | 116
    42 | EVERGREEN FOREST | 23 | 15
    43 | MIXED FOREST | 2 | 53
    51 | SHRUBLAND | - | -
    61 | ORCHARDS/ VINEYARDS | - | -
    71 | GRASSLAND/ HERBACEOUS | - | -
    81 | PASTURE/ HAY | 125 | 153
    82 | ROW CROPS | 91 | 62
    83 | SMALL GRAINS | - | -
    85 | URBAN/ RECREATIONAL GRASSES | - | 1
    90 | EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS | 22 | -
    91 | WOODY WETLANDS | - | 14
    92 | EMERGENT HERBACEOUS WETLANDS | - | 5
    95 | EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS | 4 | -
    TOTALS | | 500 | 499
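The random point comparison tallied in Table 4 was produced in ArcMap. The idea behind it can be sketched in plain Python: draw points over a shared extent, read the class of the cell containing each point in both rasters, and count classes per raster. Everything in this sketch is an assumption made for illustration: the tiny synthetic grids, the 600 m extent, the function names, and the use of Python's `random` module in place of ArcMap's point generator.

```python
import random
from collections import Counter


def sample_class(grid, cell_size, x, y):
    """Return the class code of the cell containing point (x, y).

    grid is a list of rows with row 0 at the northern edge; the extent's
    lower-left corner is at (0, 0) and coordinates are in metres.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    col = min(int(x // cell_size), n_cols - 1)
    row_from_south = min(int(y // cell_size), n_rows - 1)
    return grid[n_rows - 1 - row_from_south][col]


def compare_at_random_points(fine, fine_size, coarse, coarse_size, n_points, seed=0):
    """Sample both grids at the same random points and count classes per grid."""
    rng = random.Random(seed)
    width = len(coarse[0]) * coarse_size
    height = len(coarse) * coarse_size
    fine_counts, coarse_counts = Counter(), Counter()
    for _ in range(n_points):
        x = rng.random() * width
        y = rng.random() * height
        fine_counts[sample_class(fine, fine_size, x, y)] += 1
        coarse_counts[sample_class(coarse, coarse_size, x, y)] += 1
    return fine_counts, coarse_counts


# Hypothetical 600 m x 600 m test area: a 20 x 20 grid of 30 m cells
# (deciduous forest, 41, on the west half; pasture, 81, on the east half)
# and a 3 x 3 grid of 200 m cells whose middle column is generalized to
# mixed forest (43).
fine_grid = [[41 if col < 10 else 81 for col in range(20)] for _ in range(20)]
coarse_grid = [[41, 43, 81] for _ in range(3)]

if __name__ == "__main__":
    fine_counts, coarse_counts = compare_at_random_points(
        fine_grid, 30, coarse_grid, 200, n_points=500
    )
    print("30m counts: ", dict(fine_counts))
    print("200m counts:", dict(coarse_counts))
```

Because both rasters are read at identical coordinates, any difference in the two tallies comes from how each resolution classifies the same ground, not from where the points fell.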
  • 7. Table 5: General land cover class random point totals
    LAND CLASS | POINT COUNT 30m | POINT COUNT 200m | % DIFFERENCE
    FOREST LAND TOTAL | 152 | 184 | 17.4%
    RESIDENTIAL TOTAL | 90 | 64 | 40.6%
    FARMLAND TOTAL | 216 | 215 | 0.5%
    EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS | 26 | 19 | 36.8%

    Table 6: General land cover class random point statistical analysis
    Test statistic: Z = (p1 − φ2) / √(φ1(1 − φ)/n)
    LAND CLASS | p1 | φ2 | SE | p1 − φ2 | Z | SIGNIFICANCE
    FOREST LAND (CLASSES 41 - 43) | 0.304 | 0.369 | 0.017 | -0.065 | -3.89 | P < .01
    RESIDENTIAL (CLASSES 21 - 24) | 0.180 | 0.128 | 0.008 | 0.052 | 6.48 | P < .01
    FARMLAND (CLASSES 61 - 83) | 0.432 | 0.431 | 0.018 | 0.001 | 0.06 | N.S.
    EMERGENT HERBACEOUS WETLANDS/ WOODY WETLANDS (CLASSES 90 - 95) | 0.052 | 0.038 | 0.003 | 0.014 | 5.32 | P < .01

    LAND COVER RESOLUTION ANALYSIS
    Figures 3 and 4 show the same area at 1:75,000 scale. The 30m map still provides sharp delineations of land cover and boundaries; the 200m map of the same area shows a blur of cells. The 200m map may be appropriate for macro view analysis, but it is inappropriate for detailed analysis of small areas. The 200m map falls apart when zoomed in beyond about 1:1,000,000, whereas the 30m map is still useful at 1:50,000.
    Figure 3: 30M LAND COVER 1:75,000
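For readers who want to rerun a significance check on the point counts in Table 5, the sketch below uses the textbook pooled two-proportion z-test. This is a swapped-in standard technique, not necessarily the calculation behind Tables 3 and 6: the tables do not fully specify their standard-error convention or sample size, so the output should not be expected to match the tabulated Z values. The function name is hypothetical.

```python
import math


def two_proportion_z(count_1, n_1, count_2, n_2):
    """Pooled two-proportion z-test (assumed convention, see lead-in).

    Returns (p1, p2, standard error, z, two-sided p-value).
    """
    p1 = count_1 / n_1
    p2 = count_2 / n_2
    # Pooled proportion under the null hypothesis that both proportions are equal.
    pooled = (count_1 + count_2) / (n_1 + n_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, se, z, p_value


if __name__ == "__main__":
    # Forest land point counts from Table 5: 152 of 500 (30m), 184 of 499 (200m).
    p1, p2, se, z, p_value = two_proportion_z(152, 500, 184, 499)
    print(f"p1={p1:.3f} p2={p2:.3f} SE={se:.4f} Z={z:.2f} p={p_value:.4f}")
```

The proportions returned for the forest counts agree with the p1 and φ2 columns of Table 6; the standard error and Z depend on the convention chosen, which is why that choice is worth stating alongside any reported result.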
  • 8. Figure 4: 200M LAND COVER 1:75,000
    LANDSCAN RESULTS
  • 9. LANDSCAN CLASSES
    The LANDSCAN database provides population values for cells with an area of 54 ha at 40° latitude. The screenshot below shows that LANDSCAN symbology has population class breaks at 5, 25, 50, 100, 500, 2,500, 5,000 and 130,000 people per cell, so each cell has an actual population number associated with it (see Table 8). The NLCD Impervious dataset gives only a relative indication of population, with no population values in the attribute table. However, the Impervious cells are 30m and may be useful for cursory analysis.
    Figure 5: LANDSCAN Symbology
    Table 8: LANDSCAN zonal statistics by NLCD class
  • 10. Table 8 columns: CLASS, DESCRIPTION, COUNT, AREA (m²), MEAN, STD, SUM, NO. PEOPLE/ HA
  • 11. CLASS | DESCRIPTION | COUNT | AREA (m²) | MEAN | STD | SUM | NO. PEOPLE/ HA
    11 | OPEN WATER | 732 | 747,490,000 | 83 | 361 | 60,810 | 0.8
    21 | LOW INTENSITY RESIDENTIAL | 1,386 | 1,415,330,000 | 491 | 646 | 680,973 | 4.8
    22 | HIGH INTENSITY RESIDENTIAL | 1,299 | 1,326,490,000 | 662 | 1,052 | 860,134 | 6.5
    23 | COMMERCIAL/ INDUSTRIAL/ TRANSPORTATION | 659 | 672,945,000 | 1,171 | 1,532 | 771,940 | 11.5
    24 | DEVELOPED/ HIGH INTENSITY | 326 | 332,899,000 | 1,658 | 2,114 | 540,591 | 16.2
    31 | BARE ROCK/ SAND/ CLAY | 201 | 205,253,000 | 141 | 246 | 28,388 | 1.4
    41 | DECIDUOUS FOREST | 4,384 | 4,476,770,000 | 121 | 296 | 532,410 | 1.2
    42 | EVERGREEN FOREST | 229 | 233,846,000 | 133 | 258 | 30,446 | 1.3
    43 | MIXED FOREST | 48 | 49,015,700 | 60 | 180 | 2,862 | 0.6
    81 | PASTURE/ HAY | 4,703 | 4,802,520,000 | 113 | 263 | 530,065 | 1.1
    82 | ROW CROPS | 3,390 | 3,461,740,000 | 118 | 320 | 398,785 | 1.2
    90 | EMERGENT HERBACEOUS/ WOODY WETLANDS | 463 | 472,798,000 | 230 | 559 | 106,560 | 2.3
    95 | EMERGENT HERBACEOUS/ WOODY WETLANDS | 276 | 281,841,000 | 120 | 709 | 33,092 | 1.2
    TOTALS | | 18,096 | 18,478,937,700 | 5,102 | 8,535 | 4,577,056 |

    POPULATION ESTIMATE RESOLUTION ANALYSIS
    Similar to the coarse land cover dataset, the large-cell LANDSCAN map falls apart when zoomed in beyond about 1:1,000,000. The NLCD Impervious layer aligns nicely with the LANDSCAN layer (Figure 6); however, only LANDSCAN provides actual population estimates.
    Figure 6: LANDSCAN & NLCD IMPERVIOUS 1:1,500,000
    Figure 7: LANDSCAN & NLCD IMPERVIOUS 1:50,000
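The NO. PEOPLE/ HA column of Table 8 follows from the SUM and AREA columns: the zonal population sum divided by the zone area in hectares. A minimal sketch, with a hypothetical function name and a few rows copied from the table:

```python
def people_per_hectare(population_sum, area_m2):
    """Population density in people per hectare (1 ha = 10,000 m²)."""
    return population_sum / (area_m2 / 10_000)


# (class description, zonal population SUM, zone AREA in m²), values from Table 8.
rows = [
    ("OPEN WATER", 60_810, 747_490_000),
    ("LOW INTENSITY RESIDENTIAL", 680_973, 1_415_330_000),
    ("COMMERCIAL/ INDUSTRIAL/ TRANSPORTATION", 771_940, 672_945_000),
    ("DEVELOPED/ HIGH INTENSITY", 540_591, 332_899_000),
]

if __name__ == "__main__":
    for description, population_sum, area_m2 in rows:
        density = people_per_hectare(population_sum, area_m2)
        print(f"{description}: {density:.1f} people/ha")
```

The same function can be applied to any zone for which a zonal sum and area are available, which is what makes LANDSCAN usable for per-class density comparisons.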
  • 12. STATE BOUNDARY ANALYSIS
    Three state boundary shapefiles were used in this analysis: National Atlas, Tiger Data, and ESRI. All three shapefiles were compared to the 30m land cover layer. The National Atlas data appeared to provide the most accurate boundaries, followed by Tiger Data and ESRI (see Figure 8). The ESRI boundaries were all-encompassing and provided little detail when zoomed in to a small area. However, the ESRI attribute table provides detailed information for each state, such as area, population, and number of households, that would be useful for demographic studies. The National Atlas and Tiger Data attribute tables only provide information about the state boundary polygon shapes.
  • 13. Figure 8: State boundaries
    STATE BOUNDARIES – NATIONAL ATLAS, TIGER DATA, ESRI SHAPEFILES 1:75,000
    RED BOUNDARY – NATIONAL ATLAS STATE SHAPEFILE
    BLUE BOUNDARY – TIGER STATE SHAPEFILE
    MAGENTA BOUNDARY – ESRI STATE SHAPEFILE

    CONCLUSIONS
    The following recommendations are made in order to reduce uncertainty in the Open Space Institute project:

    LAND COVER
    • The 30m NLCD land cover rasters provide much more visual detail when zoomed in to small areas.
    • There are statistically significant differences in how the NLCD classes are represented in 30m and 200m NLCD data.
    • 30m land cover data is appropriate for most analyses; 200m data may be used for a “macro” analysis.

    POPULATION ESTIMATES
  • 14. • The NLCD Impervious layer may be used for cursory analysis; LANDSCAN is appropriate for detailed analysis because it includes actual population data for each cell.

    STATE BOUNDARIES
    • The National Atlas state boundary shapefile is more accurate than Tiger Data and ESRI, and should be used for most analyses.
    • The ESRI state boundary shapefile is appropriate for demographic studies.

    DATA SOURCES
    • http://www.mrlc.gov/nlcd_multizone_map.php
      Multi-Resolution Land Characteristics Consortium (MRLC), which includes the National Land Cover Database (NLCD) multi-zone download site. NLCD 2001 includes 21 classes of Land Cover, Percent Tree Canopy, and Urban Imperviousness at 30m cell resolution. The Urban Imperviousness layer aligns nicely with the LANDSCAN data.
    • http://www.epa.gov/mrlc/nlcd-2001.html
      The EPA site for NLCD data.
    • http://www.ornl.gov/sci/landscan/
      The LANDSCAN dataset is a worldwide population database compiled on a 30" × 30" latitude/longitude grid. Census counts are distributed among cells based on proximity to roads, slope, land cover, nighttime lights, and other information.
    • http://eros.usgs.gov/products/elevation.html
      The USGS site provides Digital Elevation Models (DEM) at 30m resolution. Seamless 7.5-minute quads can be downloaded from this site, which also gives access to the National Map Seamless Server.
    • http://www.nationalatlas.gov/
      Access to the National Atlas Seamless Server. The OSI map includes state and county boundaries and Federal land locations downloaded from this site. National Atlas also has 200m resolution land cover maps.
    • http://www.census.gov/geo/www/tiger/
      The Census Bureau is home to the Tiger Data shapefiles.