EPA OEI Linked Data Process



EPA OEI Linked Data Process presentation - 2012.

Published in: Technology, Education


  1. Publishing EPA Data as Linked Data. A brief by Michael Pendleton, EPA Office of Environmental Information, pendleton.michael@epa.gov
  2. What is driving us? “We’re moving from managing documents to managing discrete pieces of open data and content which can be tagged, shared, secured, mashed up and presented in the way that is most useful for the consumer of that information.” -- Report on Digital Government: Building a 21st Century Platform to Better Serve the American People
  3. Goal: Make Open Data, Content, and Web APIs the New Default
  4. Linked Data: What’s It All About?
       • Speak the Language of the Web
       • Just as you surf web pages, linked data lets you surf data.
       • SOAP was about making the web try to work like applications; REST was about making applications work like the web.
       • Linked Data is about making your DATA work like the web.
     Slide credit: David G. Smith, U.S. Environmental Protection Agency, Aug 16, 2011 presentation
  5. RDF is a lingua franca for data exchange
  6. Linked Data Basics
       • Tim Berners-Lee: 5-Star model for publishing data
     Slide credit: David G. Smith, U.S. Environmental Protection Agency
  7. • Linked Data is about publishing and consuming data using international data standards
     • Based on a 20-year-old idea (the Web)
     • A system of linked information systems
  8. Global requirements
       • Comprehensively link legislation & regulations for more effective government
       • Explain context, source, version & publication date with the data itself
       • We need global standards for metadata
  9. The mission of the Government Linked Data (GLD) Working Group is to provide standards and other information which help governments around the world publish their data as effective and usable Linked Data using Semantic Web technologies.
  10. • Best Practices
      • Vocabulary Guidance
      • Community Building
  11. US EPA publishes lots of CSV files ...
  12. And now, Linked Open Data ...
       • A proof-of-concept launched in 2011 with 5 Star Linked Data
       • Publication of 1.3M facilities (FRS) and the substances (SRS) regulated by the EPA
       • TRI program links to 25 years of data on major polluters
       • Additional pilots in 2012 incorporating EPA and anonymized electronic medical records (EMR) data from Sentara Healthcare
       • 5 Star Linked Open Data to be hosted & accessible on an EPA production Web site in summer 2012
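The move from CSV files to Linked Data on slides 11 and 12 can be pictured with a small sketch. It is not the EPA's actual conversion tooling: the column names, the `example.org` URI base and vocabulary namespace, and the sample facility row are all assumptions made up for illustration. It only shows the general idea of turning each CSV row into RDF statements (here serialized as N-Triples).

```python
import csv
import io

# Hypothetical namespaces; a real deployment would use the publisher's own URIs.
BASE = "http://example.org/id/facility/"
VOCAB = "http://example.org/def/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text: str) -> list[str]:
    """Turn each CSV row into N-Triples lines describing one facility.

    Assumes (hypothetically) columns named facility_id, name and state,
    and that facility_id contains only URI-safe characters.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['facility_id']}>"
        name = escape_literal(row["name"])
        state = escape_literal(row["state"])
        triples.append(f'{subject} <{RDFS_LABEL}> "{name}" .')
        triples.append(f'{subject} <{VOCAB}state> "{state}" .')
    return triples


# Made-up sample row, not real FRS data.
SAMPLE = "facility_id,name,state\n110000000001,Example Plant,VA\n"

if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE):
        print(line)
```

Each row becomes a resource with its own URI, which is what lets other datasets link to it rather than copy it.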
  13. Increase re-use by publishing Linked Data
       • Empower users to create their own views of data to satisfy different applications
       • Build a community around the data in which users help each other to curate and connect as needed
       • Skip the supermodel - Leave data in the multiple “best of breed” systems; wrap and expose on the Web of Data
  14. There is a Process: Identify, Model, Name, Describe, Convert, Publish, Maintain
  15. 7 steps to publishing Linked Data
       • Identify a dataset others are likely to want to re-use
       • Modeling
           • Onsite modeling session (half day)
           • Linked Data modeling supported by experts
           • Validate the model with data owners/stewards
       • Publish data on the Web (opendata.epa.gov) per Best Practices
       • Produce automated scripts to maintain current data
       • Announce Linked Open Data sets *
       • Review usage reports to support relevance & user feedback
     * Pending EPA Systems Security Plan approval
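The slide does not say what the "automated scripts to maintain current data" look like. One plausible building block, sketched here purely as an assumption, is a comparison of two snapshots of a source dataset so that only added, changed, or removed records need to be republished. The function name and the record layout (a dict keyed by record identifier) are hypothetical.

```python
def diff_snapshots(old: dict[str, dict], new: dict[str, dict]) -> dict[str, list[str]]:
    """Compare two snapshots keyed by record ID.

    Returns the IDs that were added, changed, or removed between
    the old and the new snapshot, each list sorted for stable output.
    """
    added = sorted(key for key in new if key not in old)
    removed = sorted(key for key in old if key not in new)
    changed = sorted(key for key in new if key in old and new[key] != old[key])
    return {"added": added, "changed": changed, "removed": removed}


if __name__ == "__main__":
    # Made-up records for illustration only.
    yesterday = {"A1": {"name": "Plant A"}, "B2": {"name": "Plant B"}}
    today = {"A1": {"name": "Plant A (renamed)"}, "C3": {"name": "Plant C"}}
    print(diff_snapshots(yesterday, today))
```

A scheduled job could feed the resulting ID lists into whatever conversion and publishing steps the dataset already uses.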
  16. Open Data Platforms
       • We’re using Callimachus, a Web platform for data-driven applications based on Linked Data principles.
       • It is hosted on Amazon EC2 and we have 24x7x365 data & application support.
       • There are other data platforms; we selected this one because it is fully W3C standards compliant, with no vendor “lock in”.
       • It’s Open Source (Apache 2.0)
  17. Recommendations
       • Linked Data promotes goals of transparency & economic development during times of fiscal austerity
           • Publish in a reusable format (RDF family of standards)
           • Use OPEN rather than proprietary data formats
           • Define a URI Policy and Strategy
           • Use the best practices and vocabularies that already exist -- don’t recreate the wheel
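The recommendation to "define a URI Policy and Strategy" is easier to picture with a concrete pattern. The sketch below assumes one common convention, `<base>/id/<kind>/<local-id>`; the deck does not state which pattern the EPA adopted, and the `example.org` base and both function names are hypothetical.

```python
import re
import unicodedata
from urllib.parse import quote

# Hypothetical base; a real policy would name the publisher's own domain.
BASE = "http://example.org"


def slugify(text: str) -> str:
    """Lowercase, strip accents, and join alphanumeric runs with hyphens."""
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def mint_uri(kind: str, local_id: str) -> str:
    """Build an identifier URI of the assumed form BASE/id/<kind>/<local_id>."""
    cleaned = local_id.strip()
    if not cleaned:
        raise ValueError("local_id must not be empty")
    # Percent-encode the local ID so any character is safe inside the path.
    return f"{BASE}/id/{slugify(kind)}/{quote(cleaned, safe='')}"


if __name__ == "__main__":
    print(mint_uri("Facility", "110000000001"))
    print(mint_uri("Chemical Substance", "a b"))
```

Writing the pattern down as a single function is one way to keep every dataset minting identifiers the same way, so that URIs stay stable as the underlying systems change.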
  18. Publishing Linked Data will require continual nurturing but the rewards are worth it
  19. Resources
       • VisibleGovernment.ca Website http://visiblegovernment.ca
       • Hack, Mash and Peer: Crowdsourcing Government Transparency, Jerry Brito, George Mason University, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1023485
       • Blog on UK Environment Agency Water Quality, see http://data.southampton.ac.uk/datasets.html
       • Southampton Open Data Service, see http://data.southampton.ac.uk/datasets.html
       • Blog post on Clean Energy data from Reegle, see http://blog.semantic-web.at/2012/04/13/reegle-info-linked-open-energy-data-cloud/
       • Blog post on Publishing Linked Open Data in Tight Economic Times, 30-Jan-2012, http://3roundstones.com/2012/01/30/publishing-linked-open-data-makes-good-sense-in-tight-economic-times/
       • Blog post on HealthData.gov from US Health & Human Services, 4-June-2012, http://www.healthdata.gov/blog/welcome-new-healthdatagov
       • Blog post on US HHS Domain Challenge 1: Metadata, 2-June-2012, http://www.healthdata.gov/blog/domain-challenge-1-metadata
  20. Coming soon ...
       • Best Practices for Publishing Linked Data (Editor’s Draft 20-Apr-2012), see https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html
       • Linked Data Cookbook, see http://www.w3.org/2011/gld/wiki/Linked_Data_Cookbook
       • Linked Data Directory, see http://dir.w3.org
       • Attend the 2012 International Open Government Data Conference co-sponsored by data.gov & The World Bank, 10-12 July 2012, Washington DC, see http://www.data.gov/communities/conference
  21. This work is Copyright © 2011-2012 3 Round Stones Inc. It is licensed under the Creative Commons Attribution 3.0 Unported License. Full details at: http://creativecommons.org/licenses/by/3.0/
     You are free:
       • to Share: to copy, distribute and transmit the work
       • to Remix: to adapt the work
     Under the following conditions:
       • Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
       • Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
  22. Credits
       • VisibleGovernment.ca: Jennifer Bell, http://www.slideshare.net/jenniferbell (CC-BY-SA)
       • 1-5 Star Linked Data image: http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
       • LOD Cloud Diagrams: Richard Cyganiak, Anja Jentzsch, http://lod-cloud.net/ (CC-BY-SA)
       • Book covers © their respective owners and used under Fair Use for educational purposes
     © 2012 Bernadette Hyland, released under a CC-BY-SA license