EcoInformatics FRS Presentation - Discussion 20101206
Discussion Notes: Presentation to Ecoinformatics International Technical Collaboration Partnership

International Web Meeting - Linked Open Data and Environmental Information

Day 1 – December 6, 2010
Geospatial Topic – Dave Smith

Published in: Education

  • 1. Ecoinformatics International Technical Collaboration Partnership
       International Web Meeting - Linked Open Data and Environmental Information
       Day 1 – December 6, 2010
       Geospatial Topic – Dave Smith

       December 6, 2010
       Dave Smith
       USEPA/OEI/OIC/IESD/ISSB
       202-566-0797

       Document Change History

       Revision   Date        Author            Description
       1.0        12/6/2010   David G. Smith    Initial Version

       FRS as a Linked Open Data Pilot - Background

       EPA maintains a database of facilities, which is aggregated from a variety of sources: 32 federal databases (mostly EPA, along with a few others such as the Energy Information Administration) and 57 state and tribal databases. Information about facilities is conflated from these sources, including facility name and geographic location (spatial feature type such as point or polygon, latitude, longitude, coordinate reference system, and collection metadata), physical and mailing address, points of contact, activities conducted at the given location (via North American Industry Classification System - NAICS codes and its predecessor, Standard Industrial Classification - SIC codes), and any associated program identifiers, permit numbers, and other related items.

       This in turn serves as a geospatial foundation piece for some of EPA’s reporting and mapping tools and capabilities, such as Envirofacts, MyEnvironment and other tools, allowing parametric data and reports from a variety of programs to be linked to facilities.

       Currently this integration is done by traditional means, i.e. Relational Database Management System queries. Web services and APIs are limited, so integration opportunities are generally limited to what we can do within the Agency.
  • 2. EcoInformatics – Geospatial Discussion November 11, 2010 December 6, 2010

       Opportunity

       Via Linked Open Data approaches, there is opportunity and potential for publishing this facilities data framework to allow analysis across other agencies as well, such as Occupational Safety and Health Administration - OSHA or Mine Safety and Health Administration - MSHA enforcement histories, offshore platforms using Bureau of Ocean Energy Management, Regulation and Enforcement - BOEMRE data, and other types of cross-cutting, government-wide approaches, as more Linked Open Data assets become available.

       Initial Efforts

       EPA is still in the planning stages. We have published some initial FRS data as RDF via, however we are now working to iteratively refine our LOD publishing approach through a “cookbook” approach, which we hope to be able to apply to a number of EPA datasets and which will establish a framework of consistent methodologies and approaches for publishing Linked Open Data agencywide. Part of this will be to leverage existing agency investments in metadata, data dictionaries, terminologies and ontologies, toward further contextualizing agency data assets.

       For FRS, we hope to contextualize the various facets of the data, e.g. corporate/organizational entity, points of contact, activities and other aspects.

       Geospatial Enablement

       There are multiple aspects to geo-enablement via Linked Open Data. One is how to represent the features in a manner that works for mapping, such as points, lines, polygons and associated topologies, and the associated coordinates, along with metadata describing such things as coordinate reference systems and locational accuracy estimates.

       For the geospatial feature component of FRS, we hope to look at current OGC standards and efforts, such as the GeoSemantics SWG, as well as emergent GeoSPARQL efforts, and to collaborate with the Spatial Ontology Community of Practice (SOCOP). We will need to delve into the most efficacious means of representing features, such as GeoRSS, along with current coordinate reference systems (e.g. NAD83), toward interoperability and geospatial analysis.

       Another aspect deals with the geography of interest: relating the facility attribute ontology to the surrounding terrain ontology for context. For example, if we are dealing with a mining facility, can one relate the facility interest to other datasets such as geology, stratigraphy, and other mining-related data?

       This may require some tuning in how we collect and model data. For example, most of our data has historically been program-specific, and some of these subtler nuances are currently reachable only through imperfect derivation from things like NAICS codes.

       Next Steps
  • 3. We hope to collaborate with our counterparts in other agencies on best practices and lessons learned – in the case of EPA’s Facility Registry System, there are direct, tangible, and implementable pieces which we can put into motion, and there is opportunity to develop a more robust Linked Open Data approach, an effort which has already kicked off.
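
As an illustration of the Geospatial Enablement discussion, the following is a minimal Python sketch of how a single facility record might be written out as N-Triples, with the coordinate reference system carried alongside the coordinates. It is not EPA's publishing approach: the `http://example.org/frs/` namespace, the predicate names, the `Facility` fields and the sample values are all hypothetical placeholders. Only the `rdfs:label` and `xsd:double` URIs are standard. Custom predicates are used for latitude and longitude here because a point-only vocabulary tied to one datum would not record a CRS such as NAD83.

```python
from dataclasses import dataclass
from urllib.parse import quote

# ASSUMPTION: placeholder namespace for illustration, not EPA's actual URI scheme.
BASE = "http://example.org/frs/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


@dataclass(frozen=True)
class Facility:
    """Hypothetical subset of the facility attributes described in the notes."""
    registry_id: str
    name: str
    latitude: float
    longitude: float
    crs: str                       # e.g. "NAD83"
    naics_codes: tuple[str, ...] = ()


def _literal(text: str) -> str:
    """Quote a string as an N-Triples literal, escaping special characters."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def _vocab(term: str) -> str:
    """Build a predicate URI in the hypothetical vocabulary."""
    return f"<{BASE}vocab/{term}>"


def to_ntriples(facility: Facility) -> str:
    """Serialize one facility as N-Triples, one triple per line."""
    subject = f"<{BASE}facility/{quote(facility.registry_id, safe='')}>"
    pairs = [
        (f"<{RDFS_LABEL}>", _literal(facility.name)),
        (_vocab("latitude"), f"{_literal(repr(facility.latitude))}^^<{XSD_DOUBLE}>"),
        (_vocab("longitude"), f"{_literal(repr(facility.longitude))}^^<{XSD_DOUBLE}>"),
        (_vocab("coordinateReferenceSystem"), _literal(facility.crs)),
    ]
    # One triple per activity code, so a facility can carry several.
    pairs += [(_vocab("naicsCode"), _literal(code)) for code in facility.naics_codes]
    return "\n".join(f"{subject} {p} {o} ." for p, o in pairs) + "\n"


if __name__ == "__main__":
    # Sample values are invented for the example.
    sample = Facility(
        registry_id="EXAMPLE-001",
        name="Example Mining Facility",
        latitude=38.8951,
        longitude=-77.0364,
        crs="NAD83",
        naics_codes=("212111",),
    )
    print(to_ntriples(sample), end="")
```

Keeping the CRS as an explicit statement on the facility is one simple option. A fuller treatment would likely model the geometry as its own resource, for example along the lines of the GeoSPARQL work mentioned in the notes, so that polygons and locational accuracy metadata can be attached to it.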