EcoInformatics FRS Presentation - Discussion 20101206

Discussion Notes: Presentation to Ecoinformatics International Technical Collaboration Partnership

International Web Meeting - Linked Open Data and Environmental Information

Day 1 – December 6, 2010
Geospatial Topic – Dave Smith

Published in: Education

Transcript of "EcoInformatics FRS Presentation - Discussion 20101206"

  1. Ecoinformatics International Technical Collaboration Partnership
International Web Meeting - Linked Open Data and Environmental Information
Day 1 – December 6, 2010
Geospatial Topic – Dave Smith

December 6, 2010
Dave Smith
USEPA/OEI/OIC/IESD/ISSB
smith.davidg@epa.gov
202-566-0797

Document Change History

Revision  Date       Author          Description
1.0       12/6/2010  David G. Smith  Initial Version

FRS as a Linked Open Data Pilot - Background

EPA maintains a database of facilities, which is aggregated from a variety of sources: 32 federal databases (mostly EPA, along with a few others such as the Energy Information Administration) and 57 state and tribal databases. Information about facilities is conflated from these sources and includes the facility name and geographic location (including the spatial feature type, such as point or polygon, latitude, longitude, coordinate reference system, and collection metadata), physical and mailing addresses, points of contact, activities conducted at the given location (via North American Industry Classification System, NAICS, codes and those of its predecessor, the Standard Industrial Classification, SIC), and any associated program identifiers, permit numbers, and other related items.

This in turn serves as a geospatial foundation for some of EPA's reporting and mapping tools and capabilities, such as Envirofacts and MyEnvironment, allowing parametric data and reports from a variety of programs to be linked to facilities.

Currently this integration is done via traditional means, i.e., Relational Database Management System queries. Web services and APIs are limited, so integration opportunity is generally limited to what we can do within the Agency.
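To make the shape of a conflated facility record more concrete, the following is a minimal Python sketch. It is not the actual FRS schema: the class names, field names, the `link_reports` helper, and the sample identifiers are all hypothetical and chosen only to mirror the attributes listed in the background above.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    feature_type: str            # e.g. "point" or "polygon"
    latitude: float
    longitude: float
    crs: str                     # coordinate reference system, e.g. "NAD83"
    collection_method: str = ""  # collection metadata (free text in this sketch)


@dataclass
class Facility:
    registry_id: str             # hypothetical identifier, not a real FRS ID
    name: str
    location: Location
    naics_codes: list[str] = field(default_factory=list)
    sic_codes: list[str] = field(default_factory=list)
    # Maps a source program system to the identifier that system uses.
    program_ids: dict[str, str] = field(default_factory=dict)


def link_reports(facility, reports):
    """Return the program reports whose (program, program_id) matches the facility."""
    return [
        r for r in reports
        if facility.program_ids.get(r["program"]) == r["program_id"]
    ]


# Made-up sample data for illustration only.
plant = Facility(
    registry_id="FAC-0001",
    name="Example Plant",
    location=Location("point", 38.9, -77.0, "NAD83"),
    program_ids={"PROGRAM_A": "A-123"},
)
reports = [
    {"program": "PROGRAM_A", "program_id": "A-123", "summary": "annual report"},
    {"program": "PROGRAM_B", "program_id": "B-999", "summary": "unrelated"},
]
linked = link_reports(plant, reports)
```

The point of the sketch is the `program_ids` mapping: it is the piece that lets parametric data from separate programs be joined back to a single facility.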
  2. EcoInformatics – Geospatial Discussion    November 11, 2010    December 6, 2010

Opportunity

Via Linked Open Data approaches, there is opportunity and potential for publishing this facilities data framework to allow analysis across other agencies as well, such as Occupational Safety and Health Administration (OSHA) or Mine Safety and Health Administration (MSHA) enforcement histories, offshore platforms using Bureau of Ocean Energy Management, Regulation and Enforcement (BOEMRE) data, and other types of cross-cutting, government-wide approaches, as more Linked Open Data assets become available.

Initial Efforts

EPA is still in the planning stages. We have published some initial FRS data as RDF via Data.gov; however, we are now working to iteratively refine our LOD publishing approach through a "cookbook" approach, which we hope to be able to apply to a number of EPA datasets and which will establish a framework of consistent methodologies and approaches for publishing Linked Open Data agencywide. Part of this will be to leverage existing agency investments in metadata, data dictionaries, terminologies and ontologies, toward further contextualizing agency data assets.

For FRS, we hope to contextualize the various facets of the data, e.g. corporate/organizational entity, points of contact, activities and other aspects.

Geospatial Enablement

There are multiple aspects to geo-enablement via Linked Open Data. One is how to represent the features in a manner that works for mapping, such as points, lines, polygons and associated topologies, and the associated coordinates, along with metadata describing such things as coordinate reference systems and locational accuracy estimates.

For the geospatial feature component of FRS, we hope to look at current OGC standards and efforts, such as the GeoSemantics SWG, as well as emergent GeoSPARQL efforts, and to collaborate with the Spatial Ontology Community of Practice (SOCOP).
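As a rough illustration of what a point feature might look like once published as Linked Open Data, the sketch below serializes one facility as N-Triples using only the Python standard library. It assumes the W3C Basic Geo vocabulary (`geo:lat`, `geo:long`, `geo:SpatialThing`), which is defined for WGS84; coordinates held in another datum such as NAD83 would need a transformation or a different vocabulary. The `example.org` URI base, the function names, and the sample values are hypothetical and do not reflect the RDF that EPA actually published.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
GEO = "http://www.w3.org/2003/01/geo/wgs84_pos#"      # W3C Basic Geo (WGS84)
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"
BASE = "http://example.org/frs/facility/"             # hypothetical URI base


def literal(value, datatype=None):
    """Format a value as an N-Triples literal, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    if datatype:
        return f'"{escaped}"^^<{datatype}>'
    return f'"{escaped}"'


def facility_triples(registry_id, name, lat, lon):
    """Return N-Triples lines describing one facility as a point feature."""
    subject = f"<{BASE}{registry_id}>"
    return [
        f"{subject} <{RDF_TYPE}> <{GEO}SpatialThing> .",
        f"{subject} <{RDFS_LABEL}> {literal(name)} .",
        f"{subject} <{GEO}lat> {literal(lat, XSD_DECIMAL)} .",
        f"{subject} <{GEO}long> {literal(lon, XSD_DECIMAL)} .",
    ]


# Made-up facility for illustration only.
triples = facility_triples("FAC-0001", "Example Plant", 38.9, -77.0)
print("\n".join(triples))
```

A point with latitude and longitude is the simplest case. Lines, polygons, topologies, and explicit coordinate reference system metadata are where richer approaches such as GeoSPARQL would come in.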
We will need to delve into the most efficacious means of representing features, such as GeoRSS, along with current coordinate reference systems (e.g. NAD83), toward interoperability and geospatial analysis.

Another aspect of this deals with the geography of interest: relating the facility attribute ontology to the surrounding terrain ontology to contextualize the facility. For example, if we are dealing with a mining facility, can one relate the facility interest with other datasets such as geology, stratigraphy, and other mining-related data?

These may require some tuning in how we collect and model data. For example, most of our data has historically been program-specific, with some of these subtler nuances currently only reachable through imperfect derivation, based on things like NAICS code.

  3. Next Steps

We hope to collaborate with our counterparts in other agencies on best practices and lessons learned. In the case of EPA's Facility Registry System, there are direct, tangible, and implementable pieces which we can put into motion, and there is opportunity to develop a more robust Linked Open Data approach, an effort which has already kicked off.
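The "imperfect derivation" from NAICS codes mentioned under Geospatial Enablement could look something like the sketch below. It relies on the fact that NAICS sector 21 covers mining, quarrying, and oil and gas extraction; the function name and the sample codes are illustrative only, and the check shows why such derivation is imperfect: it says nothing about facilities whose mining-related activity is recorded under another sector.

```python
MINING_SECTOR_PREFIX = "21"  # NAICS sector 21: Mining, Quarrying, and Oil and Gas Extraction


def is_probably_mining(naics_codes):
    """Guess whether a facility is mining-related from its NAICS codes alone."""
    return any(
        code.strip().startswith(MINING_SECTOR_PREFIX) for code in naics_codes
    )


# Illustrative codes only; not drawn from any real facility record.
mining_guess = is_probably_mining(["212210"])
other_guess = is_probably_mining(["325199"])
```

A facility flagged this way could then be related to geology or stratigraphy datasets, but a modeled attribute in the facility ontology would be more reliable than a prefix match.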
