FME Data Transformation for the Geographic Support System Initiative



Published in: Technology

  1. FME Data Transformation for the Geographic Support System Initiative
     Jay E. Spurlin, Software Architect and Development Manager for the GSS-I Feature Source Evaluation software system
     April 8, 2013
  2. U.S. Census Bureau
     • The Census Bureau serves as the leading source of quality data about the nation's people and economy. We honor privacy, protect confidentiality, share our expertise globally, and conduct our work openly. We are guided on this mission by our strong and capable workforce, our readiness to innovate, and our abiding commitment to our customers.
  3. Geography Division
     • The Geography Division plans, coordinates, and administers all geographic and cartographic activities needed to facilitate the Census Bureau's statistical programs throughout the US and its territories. We manage the Census Bureau's programs to continuously update features, boundaries, and geographic entities in TIGER and the Master Address File (MAF). We also conduct research into geographic concepts, methods, and standards needed to facilitate the Census Bureau's data collection and dissemination programs.
  4. GSS-I
     • In support of the 2020 Decennial Census, the Census Bureau is evaluating which areas should be targeted for a traditional, on-the-ground address canvassing operation and in which areas a traditional canvassing operation is not necessary.
     • The task the Census Bureau is undertaking is determining how to decide which areas should be considered for targeting
        – GEO has evaluated the MAF/TIGER database and assigned quality indicators to each of the census tracts
        – A Targeted Address Canvassing strategy has been developed that contains an inventory of criteria for evaluation
  5. GSS-I
     • The Geographic Partnership program is now underway.
        – GEO is receiving both address and spatial data from invited partners
           • This data is at the state, county, and local level.
           • The data is being evaluated and integrated with the MAF/TIGER database.
           • The next step is to determine what level of feedback we can give to the partners about their data.
     • GEO is also working with statisticians on predictive modeling to help determine where to target.
     • The combination of the evaluation of the current MAF/TIGER database, the partner data, and the predictive modeling will contribute to the recommendation on which areas of the country should be considered for targeting.
  6. The Geographic Partnership Program
     • A partner provides a set of source files
     • The source files are moved inside the Census firewall via a secure web-exchange module
     • The content inventory of the files undergoes initial verification
     • The files are preserved, as supplied, for later reference
     • A more detailed content assessment is done, including verification that the files meet the minimum guidelines for content and metadata
     • The files are prepared for automated processing, including re-projection and mapping to a standardized schema
     • A series of (mostly) automated checks is run, which provides metrics about the data in the files
     • An interactive review is conducted, in which the files and their associated metrics are reviewed and a decision is made on how to capture any new data
     • Any data that are not useful for updating the MAF/TIGER database are removed from the files
     • Features or addresses are added or modified, using an automated conflate and review process – or – an interactive update process
  7. Feature Source Evaluation Software
     • A number of MAF/TIGER spatial layers will be extracted for the extent of the partner entity
     • An analyst will use the supplied data and metadata to map the provided source schema to a standardized schema, and the supplied road centerline file will be converted to an ArcSDE layer, re-projected, and the name and MTFCC mappings applied
     • The feature names in the source file will be standardized to the parsed, MAF/TIGER naming conventions
     • The standardized feature names will be checked to see if any contain illegal characters or prohibited or generic names
     • A topological check will be run, to gauge the topological stability of the source file
     • A completeness / change detection check will be run to attempt to identify areas in the source file that contain features not found in MAF/TIGER
     • A comparison will be run between the universe of feature names in the source file and the universe of feature names found in MAF/TIGER within the extent of the entity
     • All intersections that meet the requirements for CE95 assessment will be identified
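
The name-universe comparison in the last-but-one bullet can be pictured as a set comparison. The following is a minimal sketch, not the workspace described in the slides: the function names are hypothetical, and `normalize` is only a stand-in (uppercase, collapsed whitespace) for the real MAF/TIGER name standardization, whose rules the slides do not give.

```python
def normalize(name):
    """Stand-in normalization (assumption): uppercase and collapse whitespace.
    The actual MAF/TIGER standardization rules are not described in the slides."""
    return " ".join(name.upper().split())


def compare_name_universes(source_names, maftiger_names):
    """Compare two collections of feature names as sets of normalized names."""
    source = {normalize(n) for n in source_names}
    reference = {normalize(n) for n in maftiger_names}
    return {
        "shared": sorted(source & reference),
        "source_only": sorted(source - reference),
        "maftiger_only": sorted(reference - source),
    }


# Illustrative names only; they are not taken from any partner file.
result = compare_name_universes(["Main St", "oak  ave"], ["MAIN ST", "Elm Rd"])
```

Names that appear only on the source side are candidates for review as possible new or renamed features; names only on the MAF/TIGER side may point to gaps in the partner file.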
  8. Previous FME Technology Architecture
     • FME Workspaces were developed using FME Workbench 2012 on desktop workstations, running 32-bit Windows XP Service Pack 3
     • FME Server 2012 (FME Engine only), on batch servers running Red Hat Enterprise Linux 5 connected to a SAN (Storage Area Network)
     [Diagram labels: Linux Batch Server; Cronacle job-queueing system; Perl and shell scripts; FME Server (command line invocation of FME Engines); Oracle Run-Time Client; MAF/TIGER (Oracle Database); Shapefiles on SAN]
  9. New FME Technology Architecture
     • FME Workspaces are developed using FME Workbench 2012 SP3 on desktop workstations, running 32-bit Windows XP Service Pack 3
     • FME Server 2012 SP3 (FME Server Console), on batch servers running Red Hat Enterprise Linux 5
     • FME Server 2012 SP3, on Windows server, with SAN (Storage Area Network) disk(s) mounted via Samba
     [Diagram labels: Linux Batch Server; Cronacle job-queueing system; Perl and shell scripts; FME Server Console (remote job submission to FME Server); Oracle Run-Time Client; Windows Web Server; ArcGIS for Server; FME Server (full installation); ArcSDE Geodatabase; MAF/TIGER (Oracle Database); Shapefiles on SAN]
  10. Cross-walking (Transmogrification)
  11. Topology Check
     • The Topology Check workspace compiles a number of topology and tolerance based metrics:
        – Gaps – endpoints within 5 meters of any line segment
        – Overshoots – line segments extending less than 5 meters beyond an intersection
        – Tiny Features – features with a total length less than 5 meters
        – Floating Features – features or connected sets of features that are not connected to the rest of the road network
        – Exact Duplicates – features whose geometry and name are identical to another feature
        – Coincident – features whose geometry overlaps with another feature
        – Crossing – features that cross but do not intersect at a node
        – Multi-part – features that consist of multiple geometry parts
        – Cutbacks – features containing angles less than 25 degrees
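
Two of the metrics above, Tiny Features and Cutbacks, are simple enough to sketch outside FME. The following is an illustration only, not the Topology Check workspace itself. It assumes each feature is a polyline given as a list of (x, y) tuples in a projected coordinate system with units of meters, and it reads "angles" as the angle between the two segments that meet at an interior vertex. All function names are hypothetical.

```python
import math

TINY_LENGTH_M = 5.0        # threshold from the slide
CUTBACK_ANGLE_DEG = 25.0   # threshold from the slide


def line_length(line):
    """Total length of a polyline given as [(x, y), ...] in meters."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def vertex_angles(line):
    """Angle in degrees between the two segments meeting at each interior vertex."""
    angles = []
    for a, b, c in zip(line, line[1:], line[2:]):
        v1 = (a[0] - b[0], a[1] - b[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            continue  # skip repeated vertices
        cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos)))))
    return angles


def is_tiny(line):
    return line_length(line) < TINY_LENGTH_M


def has_cutback(line):
    return any(angle < CUTBACK_ANGLE_DEG for angle in vertex_angles(line))


# Illustrative geometries only.
short_stub = [(0, 0), (3, 0)]
straight = [(0, 0), (10, 0), (20, 0)]
hairpin = [(0, 0), (10, 0), (0, 1)]
```

A straight run has a vertex angle of 180 degrees, so only sharp reversals such as `hairpin` fall under the 25 degree threshold.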
  12. Completeness / Change Detection Check
     • The MAF/TIGER road centerline features and the feature source file road centerline features will be compared using an FME workspace.
     • The MAF/TIGER features will be Buffered to a distance of 15 meters, then “overlaid” with the source file features.
     • Any source file feature parts that fall outside of the Buffer areas will be chained together, and the total length of difference (and of each part) will be reported as an evaluation metric.
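
The buffer-and-overlay comparison can be approximated in plain Python to show what the metric measures. This sketch is not the FME workspace and swaps in a simpler technique: instead of building buffer polygons, it cuts the source line into short pieces and tests whether each piece's midpoint lies more than 15 meters from the nearest reference segment, then chains consecutive outside pieces. Coordinates are assumed to be projected and in meters; the function names and the 1 meter `step` are assumptions.

```python
import math

BUFFER_M = 15.0  # buffer distance from the slide


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.dist(p, a)
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def distance_to_lines(p, lines):
    """Shortest distance from p to any segment of any reference polyline."""
    return min(point_segment_distance(p, a, b)
               for line in lines for a, b in zip(line, line[1:]))


def densify(line, step):
    """Yield (start, end) pieces no longer than step along a polyline."""
    for a, b in zip(line, line[1:]):
        n = max(1, math.ceil(math.dist(a, b) / step))
        for i in range(n):
            t0, t1 = i / n, (i + 1) / n
            yield ((a[0] + t0 * (b[0] - a[0]), a[1] + t0 * (b[1] - a[1])),
                   (a[0] + t1 * (b[0] - a[0]), a[1] + t1 * (b[1] - a[1])))


def unmatched_part_lengths(source_line, reference_lines, buffer_m=BUFFER_M, step=1.0):
    """Lengths of chained source parts lying outside the buffer distance."""
    parts, current = [], 0.0
    for start, end in densify(source_line, step):
        mid = ((start[0] + end[0]) / 2, (start[1] + end[1]) / 2)
        if distance_to_lines(mid, reference_lines) > buffer_m:
            current += math.dist(start, end)
        elif current > 0:
            parts.append(current)
            current = 0.0
    if current > 0:
        parts.append(current)
    return parts


# Illustrative geometries only: one reference road and two source roads.
reference = [[(0, 0), (100, 0)]]
parallel = [(0, 5), (100, 5)]
branching = [(0, 5), (50, 5), (50, 105)]
```

The sum of the returned list corresponds to the total length of difference, and each entry to the length of one chained part. The result is only as precise as `step`, which is the price of avoiding a true polygon overlay.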
  13. CE95 Qualifying Intersection Identification
     • Qualifying intersections must meet the following criteria:
        – Must consist of three roads (a “T” intersection) or four roads (an “X” intersection)
        – Must consist of only secondary roads or local roads
        – Must meet at 90 or 180 degree angles, with a 15 degree plus/minus tolerance
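
The three criteria above can be expressed as a small predicate. This is a sketch under stated assumptions, not the workspace's logic: each intersection is assumed to be described by its legs as (bearing in degrees, road class) pairs, the class labels "secondary" and "local" are placeholders for whatever MTFCC values the real system uses, and the angle rule is read as applying to the angle between each pair of neighboring legs around the node. All names are hypothetical.

```python
TOLERANCE_DEG = 15.0                     # tolerance from the slide
ALLOWED_CLASSES = {"secondary", "local"}  # placeholder labels (assumption)


def adjacent_angles(bearings):
    """Angles between neighboring legs, going once around the node."""
    ordered = sorted(b % 360 for b in bearings)
    count = len(ordered)
    return [(ordered[(i + 1) % count] - ordered[i]) % 360 for i in range(count)]


def near_right_or_straight(angle):
    """True if the angle is within tolerance of 90 or 180 degrees."""
    return any(abs(angle - target) <= TOLERANCE_DEG for target in (90.0, 180.0))


def qualifies_for_ce95(legs):
    """legs: list of (bearing_deg, road_class) pairs for one intersection."""
    if len(legs) not in (3, 4):
        return False
    if any(road_class not in ALLOWED_CLASSES for _, road_class in legs):
        return False
    return all(near_right_or_straight(a)
               for a in adjacent_angles([bearing for bearing, _ in legs]))


# Illustrative intersections only.
t_junction = [(0, "local"), (90, "local"), (180, "secondary")]
x_junction = [(0, "local"), (90, "local"), (180, "local"), (270, "local")]
skewed_t = [(5, "local"), (100, "local"), (185, "local")]
y_junction = [(0, "local"), (120, "local"), (240, "local")]
with_primary = [(0, "primary"), (90, "local"), (180, "local")]
```

A "T" yields neighboring angles of roughly 90, 90, and 180 degrees and an "X" yields four angles near 90, which is why both targets appear in the tolerance test.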
  14. Thank You! Questions?
     For more information: Jay E. Spurlin, U.S. Census Bureau