Terra Populus is an NSF-funded DataNet project that seeks to lower the barriers for conducting human-environment interactions research. TerraPop provides access to hundreds of census and survey microdata samples, area-level data describing geographic units, and environmental data, commonly stored as raster data, describing land use, land cover, and climate. The data access system adds value to these data by supporting transformations across microdata, area-level data, and raster data. Users may select variables of interest from any of the three formats and obtain output in their desired format. This presentation will provide an overview of the data available in the TerraPop data access system and the system’s transformation functionality, as well as a demonstration of the data access system.
Terra Populus: Integrated Data on Population and Environment
2. TerraPop Goals
Lower barriers to conducting interdisciplinary human-environment interactions research by making data with different formats from different scientific domains easily interoperable.
Provide an organizational and technical framework to preserve, integrate, disseminate, and analyze global-scale spatiotemporal data describing population and the environment.
3. TerraPop in Context
Collaborating Organizations
• Data integration expertise
• Large census and survey data collections & expertise
• Institutional foundation
• Human-environment interactions research expertise
• Environmentally-oriented data collections & expertise
• Data preservation and sustainability expertise
• Social science data collections & expertise
• Major producers and distributors of data on both humans and their environment
• Major producers of tools for integrating and transforming data across formats
• Leaders in preservation and sustainability
4. Background
Sustainable Digital Data Access and Preservation Network (DataNet)
Provide reliable digital preservation, access, integration, and analysis
Anticipate and adapt to technological change and user needs
Engage with frontiers of computer/information science and CI
Serve as component elements of interoperable data preservation and access network
7. TerraPop in Context
DataNet Cyberinfrastructure
Curated population and environment data collection
Exposed through DataONE, SEAD
Extracts exportable to DFC
Integration services
Potentially available through DFC, SEAD
Open source components and API
8. Source Data
• Two domains: population & environment
• Three data structures
• Microdata
• Area-level data
• Rasters
9. Making disparate data formats interoperable
Microdata: Characteristics of individuals and households
Area-level data: Characteristics of places defined by boundaries
Raster data: Values tied to spatial coordinates
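As a rough illustration of the three structures, the sketch below represents each one with plain Python containers. Every name and value here (`Person`, `geo_id`, the sample grid, `value_at`) is a hypothetical stand-in chosen for illustration, not TerraPop's actual schema or API.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Microdata: one record per individual (hypothetical fields)."""
    age: int
    sex: str
    geo_id: str  # assumed code of the geographic unit the person lives in


# Area-level data: one record per place defined by a boundary.
area_level = {
    "A01": {"population": 1200},
    "A02": {"population": 800},
}

# Raster data: a grid of cell values tied to spatial coordinates.
# The origin and cell size locate each cell; values are land cover labels.
raster = {
    "origin": (0.0, 2.0),  # x, y of the upper-left corner
    "cell_size": 1.0,
    "cells": [
        ["Forest", "Grassland"],
        ["Forest", "Forest"],
    ],
}


def value_at(grid, x, y):
    """Return the cell value covering point (x, y); no bounds checking."""
    ox, oy = grid["origin"]
    size = grid["cell_size"]
    col = int((x - ox) // size)
    row = int((oy - y) // size)  # rows count downward from the top edge
    return grid["cells"][row][col]
```

The point of the sketch is only that the three structures key their values differently: by person, by bounded place, and by coordinate.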
11. Location-Based Integration
Individuals and households with their environmental and social context
Microdata, Area-level data, Rasters
Age  Sex  Landcover
36   M    Forest
34   F    Forest
11   M    Forest
8    M    Forest
42   M    Grassland
39   F    Grassland
15   F    Grassland
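The slide's example, in which person records carry a land cover value, can be sketched as a join on a shared geographic unit. The ages, sexes, and land cover labels come from the slide; the `geo_id` codes, the assignment of people to units, and the `attach_context` helper are assumptions made for illustration.

```python
# Microdata rows from the slide; geo_id is a hypothetical unit code.
people = [
    {"age": 36, "sex": "M", "geo_id": "U1"},
    {"age": 34, "sex": "F", "geo_id": "U1"},
    {"age": 11, "sex": "M", "geo_id": "U1"},
    {"age": 8, "sex": "M", "geo_id": "U1"},
    {"age": 42, "sex": "M", "geo_id": "U2"},
    {"age": 39, "sex": "F", "geo_id": "U2"},
    {"age": 15, "sex": "F", "geo_id": "U2"},
]

# Area-level summary of a land cover raster (assumed already computed).
area_landcover = {"U1": "Forest", "U2": "Grassland"}


def attach_context(records, area_attrs, name):
    """Copy each record and add the area attribute for its geographic unit.

    Records whose geo_id has no entry in area_attrs get None.
    """
    return [{**r, name: area_attrs.get(r["geo_id"])} for r in records]


enriched = attach_context(people, area_landcover, "landcover")
```

Location is the only link between the two sources, which is why the geographic unit code has to be present, and consistent, on both sides.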
20. Preservation
Data producers have no preservation plan
GLI crops data
Previous versions of data difficult or impossible to find
MODIS Land Cover Collection 4 superseded by Collection 5, but Collection 4 is unavailable
35. Creation
Historical subnational GIS data
Matched to census data
Aligned with most recent GIS data available for a given country
Area-level data
Tabulated from census microdata
Obtained from census agencies as digital files, PDFs, or HTML tables
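Tabulating area-level data from microdata amounts to counting records by geographic unit and category. A minimal sketch follows, assuming hypothetical field names (`geo_id`, `sex`) and sample records; it does not reflect how TerraPop itself performs the tabulation.

```python
from collections import Counter

# Hypothetical microdata records, each tagged with a geographic unit code.
records = [
    {"geo_id": "U1", "sex": "M"},
    {"geo_id": "U1", "sex": "F"},
    {"geo_id": "U1", "sex": "M"},
    {"geo_id": "U1", "sex": "M"},
    {"geo_id": "U2", "sex": "M"},
    {"geo_id": "U2", "sex": "F"},
    {"geo_id": "U2", "sex": "F"},
]


def tabulate(rows, area_key, var):
    """Count rows per (area, category) and nest the counts by area."""
    counts = Counter((r[area_key], r[var]) for r in rows)
    table = {}
    for (area, value), n in counts.items():
        table.setdefault(area, {})[value] = n
    return table


by_area = tabulate(records, "geo_id", "sex")
```

Real census samples usually carry person weights, so a production tabulation would sum weights rather than count rows; the sketch omits that.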
44. Beta Raster data
Global Landscapes Initiative (GLI)
Yield and harvested area for 175 crops
Global Land Cover 2000 (GLC2000)
Land cover data, circa 2000, derived from the VEGETATION instrument on the SPOT 4 satellite
WorldClim
Climate data describing temperature, precipitation, and bioclimatic variables, created from weather station data collected from approximately 1950-2000
45. New Raster Data
MODIS Land Cover Type (MCD12Q1)
Yearly land cover data derived from the MODIS Terra and Aqua satellites, available for 2001 - 2012
500 meter spatial resolution
Available in five land cover classifications
IGBP
University of Maryland
LAI/fPAR
Net Primary Productivity
Plant Functional Type
Now available on our staging site
46. Project Status
Currently in project year 4
Prepping a rollout of new data, but you can preview it at http://beta2.terrapop.org
Prepping a new UI for summer 2015
Always creating new data!
Editor's Notes
DataONE – University of New Mexico, UC Santa Barbara, Oak Ridge National Laboratory, and many more
DC – Johns Hopkins, National Snow and Ice Data Center, UIUC
SEAD – University of Michigan, Indiana University, Rensselaer, UIUC, ICPSR
DFC – UNC Chapel Hill, U of South Carolina, Drexel, Ocean Observatories Initiative
TerraPop – Minnesota Population Center, ICPSR, Institute on the Environment (UMN), CIESIN (Columbia)
Integration across domains, formats hinges on geography
Users get any type of data in format useful to them
Requires boundary files, boundaries harmonized over time
----- Meeting Notes (4/28/15 09:34) -----
Have to download tiles of data and piece them together - just identifying the tiles is time consuming!
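Piecing tiles together can be sketched as follows, assuming every tile has the same shape and is keyed by a hypothetical (tile row, tile column) index on a complete rectangular grid. Real tiled products also need georeferencing and nodata handling, which this sketch leaves out.

```python
def mosaic(tiles):
    """Stitch equally sized 2D tiles into one grid.

    tiles maps (tile_row, tile_col) -> list of rows; the tile grid is
    assumed to be complete (no missing tiles).
    """
    n_tile_rows = max(r for r, _ in tiles) + 1
    n_tile_cols = max(c for _, c in tiles) + 1
    tile_height = len(next(iter(tiles.values())))
    grid = []
    for tr in range(n_tile_rows):
        for i in range(tile_height):
            line = []
            for tc in range(n_tile_cols):
                # Append row i of each tile in this tile row, left to right.
                line.extend(tiles[(tr, tc)][i])
            grid.append(line)
    return grid


# Two hypothetical 2x2 tiles sitting side by side.
example = mosaic({
    (0, 0): [[1, 2], [3, 4]],
    (0, 1): [[5, 6], [7, 8]],
})
```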
2001 Census of Population – Croatia
Multiple geographic levels in the table. It reports total figures and female totals only, so you have to subtract the females from the totals to get the males. Multiple data types (counts, percentages) embedded in table. Multiple dimensions embedded in table.
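The subtraction step for tables that publish only totals and female counts can be sketched like this. The area names and figures are made-up placeholders, not values from the Croatian census, and `derive_males` is a hypothetical helper.

```python
def derive_males(rows):
    """Derive male counts as total minus female for each area row."""
    result = []
    for row in rows:
        males = row["total"] - row["female"]
        if males < 0:
            # A negative result signals a transcription error in the source.
            raise ValueError(f"female exceeds total for {row['area']}")
        result.append(
            {"area": row["area"], "male": males, "female": row["female"]}
        )
    return result


# Placeholder figures for illustration only.
published = [
    {"area": "Area A", "total": 1000, "female": 520},
    {"area": "Area B", "total": 750, "female": 375},
]
derived = derive_males(published)
```

The check for negative values is a cheap guard that catches many errors introduced when tables are keyed in from PDFs or HTML.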
Laos Census of Population, 2005, embedded new variables (urban/rural with and without roads) into the population by sex and province table.
We are converting those HTML and PDF tables to machine-readable data files for end users.
This is the site that the Institute on the Environment uses to disseminate its Global Landscapes Initiative crop data.
This is their PDF metadata and technical documentation file that you can download. It doesn't follow any metadata standard and isn't machine readable.
If you download one of their GeoTIFFs from EarthStat, this is the metadata file that comes with it.
This is the metadata file that we provide through TerraPop.
During the last few years, we've heard a lot about big data and the data deluge. But, there are still some "deserts" in the deluge - key datasets that just don't exist.
Subnational administrative boundaries, particularly historical admin boundaries, are one of those datasets. So, we have started creating these boundary files.
1974 census of population and housing in Liberia – district map, with codes and names
1993 census of population – Gabon – province and department boundaries
TerraPop contains four different types of data: census and survey microdata, area-level data describing the characteristics of geographic entities, raster data describing land cover, land use, and climate, and GIS boundary files delineating first and second administrative levels.
For area-level data, NEXT SLIDE
Here you can see the countries we’ve completed and those that are in progress. We will continue to fill in this map as we work in more and more countries during the next two years!
Finished GIS boundary construction for 14 additional countries
Have an additional 35 countries in progress