Put Your Desktop in the Cloud in Support of the Open Government Directive and Data.gov/semantic
Proposal for Presentation / Interactive Tutorial
2010 Office of Environmental Information National Symposium, May 11 – 13, 2010
By Brand Niemann
December 24, 2009
Overview
Proposal: Session Objectives; Key Audiences; Session Format; Key Questions to be Addressed; Session Participants; AV and Other Requirements.
Tutorial Materials: Background; EPA Enterprise Architecture (Land and Water); EPA Ontology Standard (Faceted Search and Desktop Versions); MyAirQuality (iPhone App developed by NOAA).
Proposal: Session Objectives Attendees will learn the three steps to Create an Open Government Webpage; Create an Open Government Dashboard; and Publish Three or More Data Sets. The three steps are to create a set of good URLs for your content, implement those in a tool that supports Web 2.0 standards, and use a tool that converts that Web 2.0 content to Web 3.0 standards. Examples will include: High-quality government data tables; Peer-reviewed scientific and statistical databases and models; the National Information Exchange Model; A “Government Data Facts for the Citizen Newspaper”; and more!
Proposal: Key Audiences and Session Format Key Audiences: All EPA Employees that already do or want to do the following: Telework, Collaboration, Information Sharing, Semantic Web, Cloud Computing, and Mobile Apps. Session Format: Single Presentation / Interactive Tutorial
Proposal: Key Questions to be Addressed and Session Participants Key Questions to be Addressed: How can my EPA Desktop look and function like my iPhone interface with the App Store, and accomplish the Office of Environmental Information’s 7 FY2010 Priorities? Session Participants: Brand Niemann, Senior Enterprise Architect, OEI/OTOP/MISD/ITSPB, niemann.brand@epa.gov
Proposal: AV and Other Requirements The session materials are ready now and will be updated by the April 10th deadline.
Tutorial Materials: Background Imagine that your desktop worked like your mobile device (e.g. iPhone): all your files and functions were in the network cloud or downloaded to the device / desktop. I issued that challenge to the Washington Semantic Web Meetup earlier this week: Partly Cloudy With a Chance of Semantics. See http://semweb.meetup.com/31/calendar/11944383/
Tutorial Materials: Background A free wiki (Deki Express) that was a "fork" of MediaWiki evolved into a platform (web services with a wiki interface) and then into a Cloud Computing Internet Operating System Desktop (see migration to Amazon EC2 and recent MindTouch announcement). It helped our community produce many semantic applications (e.g. previous page with matrix of apps), each organized by topic, subtopic, data table, and data elements with well-defined URLs in support of Data.gov (see especially http://federaldata.wik.is).
Tutorial Materials: Background A semantic publishing environment that supports use on Mobile Apps (e.g. iPhone) and Linked Open Data through MindTouch Extensions (e.g. App Catalog), conversion of the MySQL database to an RDF triple-store (e.g. DBpedia), and use with spreadsheet tools (e.g. Cambridge Semantics). Now that Google and other search engines are reorienting rankings to favor inclusion of semantics and RDFa, this becomes a very strong argument for Linked Open Data for the government. See Creating Linked Data - Parts I-V.
Tutorial Materials: Background Creating Linked Data:
Creating Linked Data - Part I: Analysing and Modelling http://www.jenitennison.com/blog/node/135
Creating Linked Data - Part II: Defining URIs http://www.jenitennison.com/blog/node/136
Creating Linked Data - Part III: Defining Concept Schemes http://www.jenitennison.com/blog/node/137
Creating Linked Data - Part IV: Developing RDF Schemas http://www.jenitennison.com/blog/node/138
Creating Linked Data - Part V: Finishing Touches http://www.jenitennison.com/blog/node/139
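As a rough illustration of the database-to-triples conversion mentioned in the background, the sketch below turns rows of a relational table into N-Triples lines. It is only a minimal sketch under stated assumptions: the in-memory SQLite database stands in for the MySQL database, and the base URI http://example.gov/data, the facility table, and its columns are all hypothetical. It is not the MindTouch or DBpedia conversion process, and a production converter would also handle datatypes and full literal escaping.

```python
import sqlite3
from urllib.parse import quote

# Hypothetical base URI, used only for illustration.
BASE = "http://example.gov/data"


def rows_to_ntriples(conn, table, key_column):
    """Render every row of `table` as N-Triples lines.

    Each row becomes a subject URI built from the table name and the
    value of `key_column`; every other non-null column becomes a
    predicate URI with a plain string literal as the object.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    lines = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"<{BASE}/{quote(table)}/{quote(str(record[key_column]))}>"
        for column, value in record.items():
            if column == key_column or value is None:
                continue
            predicate = f"<{BASE}/{quote(table)}#{quote(column)}>"
            # Escape backslashes and double quotes only (simplified).
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} {predicate} "{literal}" .')
    return lines


def demo():
    """Build a tiny hypothetical table and convert it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facility (id TEXT, name TEXT, state TEXT)")
    conn.execute("INSERT INTO facility VALUES ('F1', 'Example Plant', 'VA')")
    return rows_to_ntriples(conn, "facility", "id")


if __name__ == "__main__":
    for line in demo():
        print(line)
```

The design choice worth noting is that the subject and predicate URIs are derived mechanically from the table name, key, and column names, which is why a well-planned URL scheme matters before any conversion is attempted.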
Tutorial Materials: Background EPA’s Office of Environmental Information FY2010 Priorities:
1. Improve Enterprise Data Management. The Semantic Web with Linked Open Data does this.
2. Enhance the Toxic Release Inventory. This is one of the apps now and pilots for data.gov/semantic.
3. Use of Web 2.0 for Collaboration. This is Web 2.0 for collaboration!
4. Refocus Enterprise Content Management. Organize with URIs and RDF metadata.
5. Improve Enterprise Desktop Support. Put the desktop in the cloud!
6. Support the Mobile Workforce. Mobile apps do that.
7. Strengthen WAN Infrastructure. Improves cloud computing bandwidth.
See http://epa.gov/oei/
Tutorial Materials: Background Request for comments on the Data.gov Concept of Operations: I think what is missing are: Standard file formats and data standards for the data posted at data.gov that will ease interoperability. Without these, it will be challenging to have mashups that mean anything.  I realize that in the geospatial realm, that may have been addressed, but in other, non-geospatial areas, documents and flat data, this could be a real problem.  If we are ever going to compare across agencies, we will need common terms or numbers for land entities (facilities/properties/sites/etc.), parent companies, etc. Source: Lisa Jenkins, Architect, Web Infrastructure Liaison, System Manager, GIS Contact, Internal Consultant, U.S. EPA/OSWER/OPM/IMDQS, December 18, 2009.
Tutorial Materials: Background I can show you how to solve this problem for your excellent OSWER Segment content using the following methodology:
1. Construct a good system of URLs for your content (e.g. EPA Enterprise Architecture, Segments, OSWER, Topics, Subtopics, Data Tables, and Data Elements).
2. Implement that in a tool that supports Web 2.0 Standards (e.g. PHP, REST, DREAM).
3. Implement 1 and 2 in a tool that supports conversion of Web 2.0 content formats to Web 3.0 content formats (RDF, OWL, RIF).
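Step 1 of this methodology can be sketched in a few lines of Python. This is a minimal sketch, not the wiki tool's own URL logic: the base address http://example.gov and the helper names slug and build_url are hypothetical, and the underscore-for-space convention is simply modeled on the wik.is page addresses cited elsewhere in these slides.

```python
from urllib.parse import quote


def slug(label):
    """Turn a human-readable label into one URL path segment.

    Spaces become underscores and every other reserved character is
    percent-encoded.
    """
    return quote(label.strip().replace(" ", "_"), safe="")


def build_url(base, *levels):
    """Join a base address and a sequence of hierarchy levels
    (e.g. architecture, segment, topic, subtopic, data table,
    data element) into one URL."""
    return "/".join([base.rstrip("/")] + [slug(level) for level in levels])


if __name__ == "__main__":
    # Hypothetical base address; the levels follow the example in step 1.
    print(build_url(
        "http://example.gov",
        "EPA Enterprise Architecture",
        "Segments",
        "OSWER",
        "Topics",
    ))
```

Because every level of the hierarchy maps to exactly one path segment, the same labels always produce the same address, which is what makes the URLs usable later as identifiers in RDF.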
Tutorial Materials: Background Data.gov needs better definition of and organization of data sets. Source: Jose Manuel Alonzo, W3C e-Gov Special Interest Group, December 22, 2009. Reply: This is the way I look at this: well-constructed data tables consist of a combination of data elements that subject matter experts / statisticians agree make sense together (not apples and oranges, as we say), and databases consist of multiple data tables that make sense together. Even better, databases have an ontology that relates the tables and all their data elements. This is what I have been recommending for Data.gov for some time now. See http://epaontology.wik.is/2008_Report_on_the_Environment_Ontology
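The structure described in this reply (a database made of data tables, each made of data elements, with relationships across them) might be modeled as follows. This is only a sketch under assumptions: the class names, the two example tables, and their element names are hypothetical and are not taken from any EPA data set. Finding elements shared by more than one table is a deliberately simple stand-in for what an ontology would express more richly.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataTable:
    """A data table: a named set of data elements that belong together."""
    name: str
    elements: List[str]


@dataclass
class Database:
    """A database: multiple data tables that make sense together."""
    name: str
    tables: List[DataTable] = field(default_factory=list)

    def shared_elements(self) -> Dict[str, List[str]]:
        """Map each data element used by more than one table to the
        names of the tables that use it."""
        seen: Dict[str, List[str]] = {}
        for table in self.tables:
            for element in table.elements:
                seen.setdefault(element, []).append(table.name)
        return {e: names for e, names in seen.items() if len(names) > 1}


def example_database() -> Database:
    """Build a small hypothetical database for illustration."""
    return Database(
        name="Example Environmental Indicators",
        tables=[
            DataTable("Air Emissions", ["Facility ID", "Year", "Pollutant", "Tons"]),
            DataTable("Water Discharges", ["Facility ID", "Year", "Parameter", "Kilograms"]),
        ],
    )


if __name__ == "__main__":
    for element, tables in example_database().shared_elements().items():
        print(element, "is shared by", ", ".join(tables))
```

Shared elements such as a common facility identifier are the points where tables can be joined or compared, which is the practical payoff of agreeing on data elements across tables and agencies.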
Tutorial Materials: Background Data.gov Evolution to the Semantic Web Discussion, December 10, 2009: Participants:
Professor James Hendler, Tetherless World Chair, Rensselaer Polytechnic Institute (co-inventor of the Semantic Web). See http://data-gov.tw.rpi.edu/wiki/The_Data-gov_Wiki
Marion Royal, GSA, Data.gov Program Manager.
George Thomas, Senior Enterprise Architect, HHS.
Brand Niemann, Senior Enterprise Architect, US EPA (EPA’s acting CIO, Linda Travers, co-chairs Data.gov with Sonny Bhagowalia, DoI CIO). See http://semanticommunity.net
Tutorial Materials: Background Data.gov Evolution to the Semantic Web Discussion: My action items list:
1. Marion Royal will establish a listserv for further discussions by a small group initially.
2. Start with data.gov/semantic (instead of semantic.data.gov, a separate web site) with links to partners doing Semantic Web applications with government data (e.g. Jim Hendler, the W3C eGov Interest Group, our Semantic Community.Net, etc.).
3. Get government data stewards and subject matter experts to work with the partners to create more applications (e.g. Jim Hendler's graduate students need help from government people who know their data and what they want to do with it).
4. Work with CIOs like Chris Kemp (NASA Ames CIO) who are helping scientists put large datasets in the Nebula Cloud this coming year as part of the "Year of Cloud Computing Pilots" (Peter Mell's prediction) to make these part of data.gov/semantic.
5. Evolve NIEM to the Web and especially the Semantic Web (Donna Roy said she would welcome this help).
6. Invite Tim Berners-Lee to look at data.gov/semantic in the future; hopefully he will engage his followers (e.g. 15,000 on Twitter) to support it as well.
Tutorial Materials: Background Open Government Directive: December 8, 2009, by OMB Director Peter Orszag: Specific actions to implement the principles of transparency, participation, and collaboration. Establishes deadlines for action.
1. Publish Government Information Online.
2. Improve the Quality of Government Information.
3. Create and Institutionalize a Culture of Open Government.
4. Create an Enabling Policy Framework for Open Government.
Within 60 days, each agency shall create an Open Government Webpage at http://www.agency.gov/open
Within 60 days, the Federal CIO and CTO shall create an Open Government Dashboard at http://www.whitehouse.gov/open
Tutorial Materials: Background Open Government Directive: As part of “Put Your Desktop in the Cloud to Support the Open Government Directive and Data.gov/semantic”, I believe that each government employee should: Create an Open Government Webpage; Create an Open Government Dashboard; and Publish Three or More Data Sets.
Tutorial Materials: Background Open Government Directive:
My Open Government Webpages are: http://epaenterprisearchitecture.wik.is and http://semanticommunity.net
My Open Government Dashboard (actually Performance Appraisal and Recognition System – PARS) is: http://semanticommunity.wik.is/@api/deki/files/1680/=BrandNiemannPARS2009.doc
My Three or More Published Data Sets are:
CIA Fact Book (266 world entities): http://web-services.gov (see Example of Field Searching and Data Architecture started in 1999) and http://federaldata.wik.is (in process).
EPA Report on the Environment Indicators (over 200 data sets): http://epaontology.wik.is/2008_Report_on_the_Environment_Ontology
Census Bureau Annual Statistical Abstract (about 1500 data sets): http://federaldata.wik.is/Statistical_Abstract_of_the_United_States%3a_2009
Tutorial Materials: Background Cloud Computing at EPA - Salesforce has worked on and supported a number of EPA projects over the last 4 years: There are currently over 125 users across multiple EPA programs leveraging our Cloud Computing Software as a Service and Platform as a Service solutions. EPA is using our Cloud solutions to support a number of partnership programs with thousands of partner organizations, primarily focused on driving environmental improvements/results. Our Cloud Computing model has worked very well for these programs, considering many of them have users spread out across multiple geographic locations. For example, the Green Power Partnership program uses real-time reports and Dashboards in Salesforce to track purchased kilowatt hours of Green Power across the partners in the program. The program also uses our application to track correspondences, outreach, partner intake from the web, and more. Furthermore, the EPA’s Salesforce applications are available to the users via their Blackberry devices, providing real-time mobile access. Some of the programs that we are supporting have replaced Oracle/Upshot with Salesforce.com for improved functionality and cost savings.
Examples of EPA Offices/Programs Using SFDC include:
Pesticide Partnership Program
State & Local Clean Energy Program
Climate Leaders Program
Combined Heat & Power Partnership
Green Power Partnership
Water Efficiency Program
National Vehicle and Fuel Emissions Program (ramped down last year)
Source: Mark Cerniglia, Sr. Account Executive, Salesforce.com, December 10, 2009.
Tutorial Materials: EPA Enterprise Architecture Segments: Information (1&2) & Data (3&4) Architecture
1. Topics: Land (OSWER); Water (Water)
2. Subtopics: Example (in process); Example (in process)
3. Tables: Example (in process); Example (in process)
4. Data Elements: Example (in process); Example (in process)
Tutorial Materials: EPA Enterprise Architecture OSWER has a table of contents with topics and subtopics. OSWER has an inventory of data elements and some examples of their interrelationships (ontology-like schematic diagrams). The EPA Report on the Environment has a section on Land which has tables and data elements (ontology).
Tutorial Materials: EPA Enterprise Architecture EAWG members, I encourage and welcome comments against the draft Target Architecture for OW's Water Quality Management Segment.  Your collective knowledge within your respective Segments and EA experience will greatly assist in advancing OW's draft TA.  I strongly encourage Program Segment leads to think of "your" needs from OW as you read the document.  There are numerous OW program examples in terms of how we intend to use IT, but needs additional program examples outside of OW. For example, how or why does OSWER need impaired waters data/web services/ETL, etc?  What does R5 and stakeholders like GLNPO need from OW's IT infrastructure to advance program objectives.  These examples will serve as a tremendous help in advancing OW's TA and overall IT strategic planning process. Source: Vince Allen, Data Architect, Office of Water, December 18, 2009.
Tutorial Materials: EPA Enterprise Architecture Questions:
1. Does TA have a good outline for an information architecture to be useful here? Yes, very definitely.
2. Does the TA Conceptual Diagram contain enough information for a data architecture to be useful here? Not yet - needs to drill down to the data tables and data elements (I did some of this for OW several years ago when we harmonized the data elements across the 30 or so major systems to define the mission critical set of data elements).
3. Essentially, is it up to the EPA Ontology Standard? Not yet. The EPA Report on the Environment has a section on Water which has data tables and data elements (ontology). See http://epaontology.wik.is
Tutorial Materials: EPA Ontology Standard (Faceted Search) http://web-services.gov/lpBin22/lpext.dll/Folder/Infobase11/1?fn=main-j.htm&f=templates&2.0
Tutorial Materials: EPA Ontology Standard (Desktop) http://epaontology.wik.is/2008_Report_on_the_Environment_Ontology
Tutorial Materials: MyAirQuality Many people in the United States are exposed to levels of air pollutants, such as ozone and particulate matter (“PM”), that are high enough to cause harmful health effects, including respiratory and cardiovascular problems. As a result, poor air quality contributes to tens of thousands of premature deaths each year in the United States. MyAirQuality displays air quality observations, forecasts, and health messages for locations you select. The application displays air quality conditions using the same criteria and formats as Federal, tribal, state, and local governments use. Since this approach may not be familiar to some people, MyAirQuality briefly explains the system when it is first started. In two minutes, you will be ready to see your air quality conditions. See http://developer.apple.com/iphone/ Download App: http://www.appstorehq.com/myairquality-iphone-41911/app
Tutorial Materials: MyAirQuality Documentation:  http://www.softwareforasong.net/doc/index2.html
Tutorial Materials: Summary Google Search from a mobile device gives three options: Cached, Similar, and Mobile Formatted. These Desktop Apps work well on mobile devices. Now that Google and other search engines are reorienting rankings to favor inclusion of semantics and RDFa, this becomes a very strong argument for Linked Open Data for the government. This protects and enhances your investment in data by putting it in the cloud of Linked Open Data so it can be re-used. Announcing the Year of Semantic Web Training and Pilots for Data.gov/semantic. For example, EPA is also using Blackbook 2 and 3 (see next slide).
Tutorial Materials: Summary NSA’s Blackbook 2 and 3 - The Standard in Semantic Web Technology for Data Management:
Blackbook2 is a project architected by Intervise’s Chief Technology Officer (Scott Streit), which moved into open source on September 1, 2009. Blackbook2 is a standard for semantic web processing and is currently used, in production or pilots, at the Department of Defense, Dole Foods, and the Environmental Protection Agency, just to name a few. Without changing Blackbook2, merely adding new data, Blackbook2 processes anything from Shipping Visibility Data to classified, analytical processing.
A web application framework for data analysis, Blackbook2's architecture provides secure support for visualization, transformation, data source integration, asynchronous operations, and is vocabulary agnostic. Application programming interfaces (APIs) exist for plugin components that visualize, discover, transform, extract, enrich, or filter graph-based data such as social networks. Data sources of many types (RDBMs, Documents, RSS) can be mapped into Blackbook2 as RDF/OWL, either by Ingest or real-time mapping solutions such as D2RQ. The Blackbook2 architecture provides asynchronous operations via a message bus backend so that results are provided just-in-time to real-time users or to workflows that may run for hours. Blackbook2 is vocabulary agnostic: it can provide mapping to a common ontology for all data sources or it can accommodate disparate vocabularies with common vocabulary subsets (e.g., Dublin Core, VCard). Finally, Blackbook2 is accredited for multi-level security via role-based access and provides integrated logging via standard interfaces (JAAS and Log4J).
Blackbook 2, and now 3, is available free to Federal government employees and is currently running on the NSA CloudBase Cloud Computing Platform. Software, documentation, and collaboration are available at http://rabasrv.jhuapl.edu/wiki/index.php/Main_Page by contacting Buster Fields at [email_address].

Brand Niemann Tutorial12242009

  • 1.
    Put Your Desktopin the Cloud In Support of the Open Government Directive and Data.gov/semantic Proposal for Presentation / Interactive Tutorial 2010 Office of Environmental Information National Symposium, May 11 – 13, 2010 By Brand Niemann December 24, 2009
  • 2.
    Overview Proposal: SessionObjectives Key Audiences Session Format Key Questions to be Addressed Session Participants AV and Other Requirements Tutorial Materials: Background EPA Enterprise Architecture (Land and Water) EPA Ontology Standard (Faceted Search and Desktop Versions) MyAirQuality (iPhone App developed by NOAA)
  • 3.
    Proposal: Session ObjectivesAttendees will learn the three steps to Create an Open Government Webpage; Create an Open Government Dashboard; and Publish Three or More Data Sets. The three steps are to create a set of good URLs for your content, implement those in a tool that supports Web 2.0 standards, and use a tool that converts that Web 2.0 content to Web 3.0 standards. Examples will include: High-quality government data tables; Peer-reviewed scientific and statistical databases and models; the National Information Exchange Model; A “Government Data Facts for the Citizen Newspaper”; and more!
  • 4.
    Proposal: Key Audiencesand Session Format Key Audiences: All EPA Employees that already do or want to do the following: Telework, Collaboration, Information Sharing, Semantic Web, Cloud Computing, and Mobile Apps. Session Format: Single Presentation / Interactive Tutorial
  • 5.
    Proposal: Key Questionsto be Addressed and Session Participants Key Questions to be Addressed: How can my EPA Desktop look and function like my iPhone interface with the App Store and accomplish the Office of Environmental Information’s 7 FY2010 Priorities. Session Participants: Brand Niemann, Senior Enterprise Architect, OEI/OTOP/MISD/ITSPB, niemann.brand@epa.gov
  • 6.
    Proposal: AV andOther Requirements The session materials are ready now and will be updated by the April 10th deadline.
  • 7.
    Tutorial Materials: BackgroundImagine that your desktop worked like your mobile device (e.g. iPhone): All your files and functions were in the network cloud or downloaded to the device / desktop. Issued that challenge to the Washington Semantic Web Meetup earlier this week: Partly Cloudy With a Chance of Semantics See http://semweb.meetup.com/31/calendar/11944383/
  • 8.
    Tutorial Materials: BackgroundA free Wiki ( Deki Express ) that was a "fork" from MediaWiki that evolved to a platform ( web-services with a wiki interface ) that further evolved to a Cloud Computing Internet Operating System Desktop (see  migration to Amazon EC2  and  recent MindTouch announcement ) that helped our community produce many semantic applications (e.g.  previous page with matrix of apps ) each organized by topic, subtopic, data table, and data elements with well-defined URL's in support of Data.gov (see especially  http:// federaldata.wik.is ).
  • 9.
    Tutorial Materials: BackgroundA semantic publishing environment that supports use on Mobile Apps (e.g.  iPhone ) and Linked Open Data through MindTouch Extensions (e.g.  App Catalog ), conversion of the MySQL database to an RDF triple-store (e.g.  DBpedia ), and use with spreadsheet tools (e,g.  Cambridge Semantics ). Now that Google and other search engines are reorienting rankings to favor inclusion of semantics and RDFa, this becomes a very strong argument for Linked Open Data for the government. See  Creating Linked Data - Parts I -V .
  • 10.
    Tutorial Materials: BackgroundCreating Linked Data: Creating Linked Data - Part I: Analysing and Modelling http://www.jenitennison.com/blog/node/135 Creating Linked Data - Part II: Defining URIs http://www.jenitennison.com/blog/node/136 Creating Linked Data - Part III: Defining Concept Schemes http://www.jenitennison.com/blog/node/137 Creating Linked Data - Part IV: Developing RDF Schemas http://www.jenitennison.com/blog/node/138 Creating Linked Data - Part V: Finishing Touches http://www.jenitennison.com/blog/node/139
  • 11.
    Tutorial Materials: BackgroundEPA’s Office of Environmental Information FY2010 Priorities: 1. Improve Enterprise Data Management. The Semantic Web with Linked Open Data does this. 2. Enhance the Toxic Release Inventory. This is one of the apps now and pilots for data.gov/semantic. 3. Use of Web 2.0 for Collaboration. This is Web 2.0 for collaboration! 4. Refocus Enterprise Content Management. Organize with URIs and RDF metadata. 5. Improve Enterprise Desktop Support. Put the desktop in the cloud! 6. Support the Mobile Workforce. Mobile apps do that. 7. Strengthen WAN Infrastructure. Improves cloud computing bandwidth. See http:// epa.gov/oei /
  • 12.
    Tutorial Materials: BackgroundRequest for comments on the Data.gov Concept of Operations: I think what is missing are: Standard file formats and data standards for the data posted at data.gov that will ease interoperability. Without these, it will be challenging to have mashups that mean anything.  I realize that in the geospatial realm, that may have been addressed, but in other, non-geospatial areas, documents and flat data, this could be a real problem.  If we are ever going to compare across agencies, we will need common terms or numbers for land entities (facilities/properties/sites/etc.), parent companies, etc. Source: Lisa Jenkins, Architect, Web Infrastructure Liaison, System Manager, GIS Contact, Internal Consultant, U.S. EPA/OSWER/OPM/IMDQS, December 18, 2009.
  • 13.
    Tutorial Materials: BackgroundI can show you how to solve this problem for your excellent OSWER Segment content using the following methodology: 1. Construct a good system of URL's for your content (e.g. EPA Enterprise Architecture, Segments, OSWER, Topics, Subtopics, Data Tables, and Data Elements). 2. Implement that in a tool that supports Web 2.0 Standards (e.g. PHP, REST, DREAM). 3. Implement 1 and 2 in a tool that supports conversion of Web 2.0 content formats to Web 3.0 content formats (RDF. OWL, RIF).
  • 14.
    Tutorial Materials: BackgroundData.gov needs better definition of and organization of data sets. Source: Jose Manuel Alonzo, W3C e-Gov Special Interest Group, December 22, 2009. Reply: This is the way I look at this - well-constructed data tables consist of a combination of data elements that subject matter experts / statisticians agree make sense together (not apples and oranges as we say) and databases consist of multiple data tables that make sense together, even better have an ontology that relates them and all their data elements. This is what I have been recommending for Data.gov for some time now. See http://epaontology.wik.is/2008_Report_on_the_Environment_Ontology
  • 15.
    Tutorial Materials: BackgroundData.gov Evolution to the Semantic Web Discussion, December 10, 2009: Participants: Professor James Hendler, Tetherless World Chair, Rensselaer Polytechnic Institute (co-inventor of the Semantic Web). See http://data- gov.tw.rpi.edu/wiki/The_Data-gov_Wiki Marion Royal, GSA, Data.gov Program Manager. George Thomas, Senior Enterprise Architect, HHS. Brand Niemann, Senior Enterprise Architect, US EPA (EPA’s acting CIO, Linda Travers, co-chairs the Data.gov with Sonny Bhagowalia, DoI, CIO). See http://semanticommunity.net
  • 16.
    Tutorial Materials: BackgroundData.gov Evolution to the Semantic Web Discussion: My action items list: 1. Marion Royal will establish a listserv for further discussions by a small group initially. 2. Start with data.gov/semantic (instead of semantic.data.gov - a separate web site) with links to partners doing Semantic Web applications with government data (e.g. Jim Hendler, the W3C egov Interest group, our Semantic Community.Net, etc.). 3. Get government data stewards and subject matter experts to work with the partners to create more applications (e.g. Jim Hendler's graduate students need help from government people who know their data and what they want to do with it). 4. Work with CIO's like Chris Kemp (NASA Ames CIO) that are helping scientists put large datasets in the Nebula Cloud this coming year as part of the "Year of Cloud Computing Pilots" (Peter Mell's prediction) to make these part of data.gov/semantic. 5. Evolve NIEM to the Web and especially the Semantic Web (Donna Roy said she would welcome this help). 6. Invite Tim Berners-Lee to look at data.gov/semantic in the future and hopefully he will engage his followers (e.g. 15, 000 on Twitter) to support it as well.
  • 17.
    Tutorial Materials: BackgroundOpen Government Directive: December 8, 2009, by OMB Director Peter Orszag: Specific actions to implement the principles of transparency, participation, and collaboration. Establishes deadlines for action. 1. Publish Government Information Online. 2. Improve the Quality of Government Information. 3. Create and Institutionalize a Culture of Open Government. 4. Create an Enabling Policy Framework for Open Government. Within 60 days, each agency shall create an Open Government Webpage at http://www.agency.gov/open Within 60 days, the Federal CIO and CTO shall create an Open Government Dashboard at http://www.whitehouse.gov/open
  • 18.
    Tutorial Materials: BackgroundOpen Government Directive: As part of “Put Your Desktop in the Cloud to Support the Open Government Directive and Data.gov/semantic”, I believe that each government employee should: Create an Open Government Webpage; Create an Open Government Dashboard; and Publish Three or More Data Sets.
  • 19.
    Tutorial Materials: BackgroundOpen Government Directive: My Open Government Webpages are: http://epaenterprisearchitecture.wik.is http://semanticommunity.net My Open Government Dashboard (actually Performance Appraisal and Recognition System – PARS) is: http://semanticommunity.wik.is/@api/deki/files/1680/=BrandNiemannPARS2009.doc My Publish Three or More Data Sets is: CIA Fact Book (266 world entities) http://web-services.gov (see Example of Field Searching and Data Architecture started in 1999) and http://federaldata.wik.is (in process). EPA Report on the Environment Indicators (over 200 data sets): http://epaontology.wik.is/2008_Report_on_the_Environment_Ontology Census Bureau Annual Statistical Abstract (about 1500 data sets): http://federaldata.wik.is/Statistical_Abstract_of_the_United_States%3a_2009
  • 20.
    Tutorial Materials: BackgroundCloud Computing at EPA - Salesforce has worked on and supported a number of EPA projects over the last 4 years: There are currently over 125 users across multiple EPA programs leveraging our Cloud Computing Software as a Service and Platform as a Service solutions. EPA is using our Cloud solutions to support a number of partnership programs with thousands of partner organizations, primarily focused on driving environmental improvements/results. Our Cloud Computing model has worked very well for these programs, considering many of them have users spread out across multiple geographic locations.  For example, the Green Power Partnership program uses real-time reports and Dashboards in Salesforce to track purchased kilowatt hours of Green Power across the partners in the program.  The program also uses our application to track correspondences, outreach, partner intake from the web, and more.  Furthermore, the EPA’s Salesforce applications are available to the users via their Blackberry devices, providing real-time mobile access.  Some of the programs that we are supporting have replaced Oracle/Upshot with Salesforce.com for improved functionality and cost savings. Examples of EPA Offices/Programs Using SFDC include: Pesticide Partnership Program State & Local Clean Energy Program Climate Leaders Program Combined Heat & Power Partnership Green Power Partnership Water Efficiency Program National Vehicle and Fuel Emissions Program  (ramped down last year) Source: Mark Cerniglia, Sr. Account Executive, Saleforce.com, December 10, 2009.
  • 21.
    Tutorial Materials: EPAEnterprise Architecture Segments: Information (1&2) & Data (3&4) Architecture 1. Topics Land (OSWER) Water (Water) 2. Subtopics Example (in process) Example (in process) 3. Tables Example (in process) Example (in process) 4. Data Elements Example (in process) Example (in process)
  • 22.
    Tutorial Materials: EPAEnterprise Architecture OSWER has table of contents with topics and subtopics. OSWER has inventory of data elements and some examples of their interrelationships (ontology-like schematic diagrams). EPA Report on the Environment has a section on Land which has tables and data elements (ontology).
  • 23.
    Tutorial Materials: EPAEnterprise Architecture EAWG members, I encourage and welcome comments against the draft Target Architecture for OW's Water Quality Management Segment. Your collective knowledge within your respective Segments and EA experience will greatly assist in advancing OW's draft TA. I strongly encourage Program Segment leads to think of "your" needs from OW as you read the document. There are numerous OW program examples in terms of how we intend to use IT, but needs additional program examples outside of OW. For example, how or why does OSWER need impaired waters data/web services/ETL, etc? What does R5 and stakeholders like GLNPO need from OW's IT infrastructure to advance program objectives. These examples will serve as a tremendous help in advancing OW's TA and overall IT strategic planning process. Source: Vince Allen, Data Architect, Office of Water, December 18, 2009.
  • 24.
    Tutorial Materials: EPAEnterprise Architecture Questions: 1. Does TA have a good outline for an information architecture to be useful here? Yes, very definitely. 2. Does the TA Conceptual Diagram contain enough information for a data architecture to be useful here? Not yet - needs to drill-down to the data tables and data elements (I did some of this for OW several years ago when we harmonized the data elements across the 30 or so major systems to define the mission critical set of data elements). 3. Essentially, is it up to the EPA Ontology Standard? Not yet. The EPA Report on the Environment has a section on Water which has data tables and data elements (ontology). See http://epaontology.wik.is
  • 25.
    Tutorial Materials: EPA Ontology Standard (Faceted Search) http://web-services.gov/lpBin22/lpext.dll/Folder/Infobase11/1?fn=main-j.htm&f=templates&2.0
  • 26.
    Tutorial Materials: EPA Ontology Standard (Desktop) http://epaontology.wik.is/2008_Report_on_the_Environment_Ontology
  • 27.
    Tutorial Materials: MyAirQuality Many people in the United States are exposed to levels of air pollutants, such as ozone and particulate matter (“PM”), that are high enough to cause harmful health effects, including respiratory and cardiovascular problems. As a result, poor air quality contributes to tens of thousands of premature deaths each year in the United States. MyAirQuality displays air quality observations, forecasts, and health messages for locations you select. The application displays air quality conditions using the same criteria and formats that Federal, tribal, state, and local governments use. Since this approach may not be familiar to some people, MyAirQuality briefly explains the system when it is first started. In two minutes, you will be ready to see your air quality conditions. See http://developer.apple.com/iphone/ Download App: http://www.appstorehq.com/myairquality-iphone-41911/app
  • 28.
    Tutorial Materials: MyAirQuality Documentation: http://www.softwareforasong.net/doc/index2.html
  • 29.
    Tutorial Materials: Summary Google Search from a mobile device gives three options: Cached, Similar, and Mobile Formatted. These desktop apps work well on mobile devices. Now that Google and other search engines are reorienting rankings to favor the inclusion of semantics and RDFa, there is a very strong argument for Linked Open Data for the government. Putting your data in the cloud of Linked Open Data protects and enhances your investment in it because the data can be re-used. Announcing the Year of Semantic Web Training and Pilots for Data.gov/semantic. For example, EPA is also using Blackbook 2 and 3 (see next slide).
  • 30.
    Tutorial Materials: Summary NSA’s Blackbook 2 and 3 - The Standard in Semantic Web Technology for Data Management: Blackbook2 is a project architected by Intervise’s Chief Technology Officer (Scott Streit), which moved into open source on September 1, 2009. Blackbook2 is a standard for semantic web processing and is currently used, in production or pilots, at the Department of Defense, Dole Foods, and the Environmental Protection Agency, just to name a few. Without any change to Blackbook2 itself, merely by adding new data, it handles anything from shipping visibility data to classified analytical processing. A web application framework for data analysis, Blackbook2 has an architecture that provides secure support for visualization, transformation, data source integration, and asynchronous operations, and it is vocabulary agnostic. Application programming interfaces (APIs) exist for plugin components that visualize, discover, transform, extract, enrich, or filter graph-based data such as social networks. Data sources of many types (RDBMS, documents, RSS) can be mapped into Blackbook2 as RDF/OWL, either by ingest or by real-time mapping solutions such as D2RQ. The Blackbook2 architecture provides asynchronous operations via a message bus backend so that results are provided just-in-time to real-time users or to workflows that may run for hours. Blackbook2 is vocabulary agnostic: it can provide mapping to a common ontology for all data sources or it can accommodate disparate vocabularies with common vocabulary subsets (e.g., Dublin Core, VCard). Finally, Blackbook2 is accredited for multi-level security via role-based access and provides integrated logging via standard interfaces (JAAS and Log4J). Blackbook 2, and now 3, is available free to Federal government employees and is currently running on the NSA CloudBase Cloud Computing Platform.
Software, documentation, and collaboration are available at http://rabasrv.jhuapl.edu/wiki/index.php/Main_Page by contacting Buster Fields at [email_address].
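To make the idea of mapping a relational source to RDF with a common vocabulary subset more concrete, here is a minimal sketch in plain Python. It is not Blackbook2's API and not D2RQ's mapping language. It only illustrates the general technique of turning one table row into N-Triples statements that use Dublin Core Terms predicates. The `BASE` URI, the column names, the example row, and the helper names are all hypothetical assumptions.

```python
from typing import Dict, List, Optional
from urllib.parse import quote

# Dublin Core Terms is a real vocabulary namespace.
DCTERMS = "http://purl.org/dc/terms/"
# Hypothetical URI prefix for record subjects (an assumption for illustration).
BASE = "http://example.org/record/"


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(
    row: Dict[str, Optional[str]], key: str, mapping: Dict[str, str]
) -> List[str]:
    """Map one table row to N-Triples lines; columns with no value are skipped."""
    # The key column becomes the subject URI, percent-encoded to stay valid.
    subject = f"<{BASE}{quote(str(row[key]), safe='')}>"
    triples = []
    for column, predicate in mapping.items():
        value = row.get(column)
        if value is None:
            continue
        triples.append(f'{subject} <{predicate}> "{escape_literal(str(value))}" .')
    return triples


# Hypothetical column-to-predicate mapping using the shared vocabulary subset.
COLUMN_TO_PREDICATE = {
    "title": DCTERMS + "title",
    "summary": DCTERMS + "description",
}

# Hypothetical row, as it might come back from a relational query.
example_row = {"id": "site 7", "title": 'Ozone "8-hour" readings', "summary": None}

for line in row_to_ntriples(example_row, "id", COLUMN_TO_PREDICATE):
    print(line)
```

Because only the column-to-predicate mapping changes per source, two sources with different column names can still emit triples that share the same Dublin Core predicates, which is the sense in which disparate vocabularies can be accommodated through a common subset.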