2008-02-11: EPA DataFed Presentation


Published in: Technology
    1. 1. The Federated Data System DataFed
        • Non-intrusive data integration infrastructure
        • Based on standards-based web services
        • Processing tools created from reusable components
    2. 2. Local, Regional, Global Pollution
        • Before 1950s: Local (Smoke, Fly ash)
        • 1970s-1990s: Regional, LRTP (Acid Rain, Haze)
        • Post-2000s: Global, HTAP (Ozone, PM, Global Change)
        • The LRTP/HTAP flow of air pollutants is likely to increase as overseas economies grow.
        • Pollutant influx leads to significant exceedances of the O3 and PM NAAQS in some regions.
        • Even after domestic controls, some US areas will be non-compliant because of LRTP.
    3. 3. Coordinating Earth Observing Systems
        • Vantage Points: Terrestrial; Airborne; Near-Space; LEO/MEO Commercial Satellites and Manned Spacecraft; Far-Space; L1/HEO/GEO TDRSS & Commercial Satellites
        • Capabilities: Deployable; Permanent; Forecasts & Predictions; Aircraft/Balloon Event Tracking and Campaigns; User Community
    4. 4. Diagram labels:
        • Air Quality/Public Health NTO: Integrated Observed-Modeled Air Quality Fields
        • State & Local; Canadian Provinces; NOAA NWS; HHS CDC-EPHTN; State Public Health Departments
        • US EPA AQS: SLAMS/NAMS surface PM2.5 data
        • CMAQ forecast data; CMAQ assessment data; EPA OAR & ORD (Algorithms/QA)
        • TERRA MODIS (~10:30 local overpass); AQUA MODIS (~1:30 local overpass); Aerosol Optical Depth (MOD04_L2)
        • NASA GSFC Science Team (Algorithms/QA); NASA GSFC DAAC
        • Aerosol Optical Depth (GASP); GOES-12 CONUS every 30 minutes; NOAA NESDIS; NOAA NESDIS/ORA & CREST Institutes
        • UMBC CREST Institute; REALM continuous vertical resolution data
        • Satellite/model/surface data fusion; spatial surface predictions; studies and impacts to human health
        • US EPA OAQPS/ORD/OEI; RSI Gateway
        • Note: The Regional East Atmospheric Lidar Mesonet (REALM) is a university-led federated network headed by UMBC and is identified as an NTO in the implementation plan.
    5. 5. P. Dickerson, EPA
    6. 6. National Ambient Air Monitoring Strategy, Office of Air Quality Planning and Standards, Research Triangle Park, NC, December 2005: http://www.epa.gov/ttn/amtic/monstratdoc.html
        • R. Scheffe
        • Programs and networks shown: CENR/AQRS; GEOSS; NOAA CMDL; NOAA NESDIS; EMEP; GAW; NCORE (L1, L2, L3)
        • Sites shown: Barrow; Mauna Loa; Trinidad Head; A. Samoa; S. Pole
        • Links:
            • http://www.igospartners.org/
            • http://earthobservations.org/
            • http://www.al.noaa.gov/AQRS/reports/monitoring.html
            • http://www.empa.ch/gaw/gawsis/
            • http://www.nesdis.noaa.gov/
            • http://www.emep.int/
            • http://www.fz-juelich.de/icg/icg-ii/iagos/
            • http://www.fz-juelich.de/icg/icg-ii/mozaic/home
            • http://www.cmdl.noaa.gov/
    7. 7. The Scheffe Challenge: Organizations - Programs – Data: A Mess
        • Info system challenges: What’s the overall dependency? Information flow; forces and controls on data flow; cooperation, competition, co-opetition
        • Organizations: EPA; NOAA; NASA; NPS; USDA; DOE; CDC; Private Sector; States/Tribes; RPOs/Interstate; Academia; NARSTO; NAS, CAAAC; CASAC, OMB; Enviros
        • Programs: NAAQS setting; SIPs, nat. rules, designations; AQ forecasting; Risk/exposure assessments; PM research; PHASE; Accountability/indicators; Eco-informatics; GEOSS
        • Data sources: Supersites; IMPROVE, NCore; PM monit., PAMS; CASTNET; Lidar systems; NADP; Satellite data; Intensive studies; PM centers; Other networks: SEARCH, IADN..; CMAQ; GEOS-CHEM; Emissions; Meteorology; Health/mort. records
    8. 8. Relationship Between Organizations - Programs – Data (Version 0.1)
        • Diagram labels: Public; Goals, $$; Info needs, $$; Data need, $$; Judge, Decide, Act; Decision, Action; Analyze, Report; Actionable Knowledge; Measure, Organize; Organized Data
        • Flow of Information
            • Data systems organize the measurements and models and provide them to programs.
            • Programs analyze the data and provide actionable knowledge to organizations.
            • Organizations evaluate multiple information sources, make decisions and act.
        • Flow of Control
            • The public and special interest groups set up organizations and provide them with funding.
            • Organizations develop programs and define their scope, governance and funding.
            • Programs satisfy their information needs by monitoring or by using others’ data.
            • Data sources acquire the data for their parent programs and also expose them for reuse.
    9. 9. System of Systems: Global Earth Observing System of Systems - GEOSS
        • Characteristics of a System of Systems (SoS)
            • Autonomous constituents managed/operated independently
            • Independent evolution of each constituent
            • SoS displays emergent behavior
        • Must recognize, manage, exploit the characteristics:
            • No stakeholder has complete SoS insight
            • Central control is limited; distributed control is essential
            • Users must be involved throughout the life of a SoS
    10. 11. GEOSS Architecture and Interoperability
    11. 12. Screencast: Information Landscape
    12. 13. Screencast: Info System Screencast
    13. 14. Screencast: DataFed Technologies
    14. 15. Screencast: DataFed Tools
    15. 17. KMZ: Google Earth-DataFed Mashup: GA Smoke, Global Chem
    16. 18. The Transformational Effect of Networking
        • Information has become the main driver of progress
        • Time and place are no longer barriers to participation and interaction
        • The Web has become a medium of participation: the ‘Web 2.0’ phenomenon
        • “Networking has led to an unprecedented surge of productivity” (Time Magazine, Person of the Year 2006, YOU)
        • These are opportunities to enable Earth Science through more networking
        • But much resistance to networking remains to be overcome
    17. 19. Networking Multiplies Value Creation
        • Enclosed value-creating process, a ‘stovepipe’ (diagram labels: Application, Data)
        • Stovepipe, 1 user: 1 Data x 1 Program = 1, Value = 1
    18. 20. Networking Multiplies Value Creation
        • Stovepipe, 1 user: 1 Data x 1 Program = 1, Value = 1
        • 5 uses of data: 1 Data x 5 Programs = 5, Value = 5
    19. 21. Networking Multiplies Value Creation
        • Stovepipe, 1 user: 1 Data x 1 Program = 1, Value = 1
        • 5 uses of data: 1 Data x 5 Programs = 5, Value = 5
        • Open network: 5 Data x 5 Programs = 25, Value = 25
        • Merging data may create new, unexpected opportunities
        • Not all data are equally valuable to all programs
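The value arithmetic on the three slides above can be sketched as counting dataset-program pairings. This is only an illustration: the function name `count_links` and the optional `is_useful` filter (reflecting the note that not all data are equally valuable to all programs) are assumptions, not part of the presentation.

```python
from itertools import product


def count_links(datasets, programs, is_useful=lambda dataset, program: True):
    """Count dataset-program pairings, i.e. the slide's 'Data x Program' value.

    is_useful is a hypothetical filter for pairings that add no value.
    """
    return sum(1 for d, p in product(datasets, programs) if is_useful(d, p))


datasets = ["data1", "data2", "data3", "data4", "data5"]
programs = ["app1", "app2", "app3", "app4", "app5"]

stovepipe = count_links(datasets[:1], programs[:1])   # 1 Data x 1 Program
shared = count_links(datasets[:1], programs)          # 1 Data x 5 Programs
open_network = count_links(datasets, programs)        # 5 Data x 5 Programs

# Hypothetical case: app5 can only make use of data1.
filtered = count_links(
    datasets, programs, lambda d, p: p != "app5" or d == "data1"
)
```

With the filter in place the count drops below 25, which is one way to read the caveat on the last slide.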
    20. 22. The Future
        • AQ Science, Management
            • Pollutant Characterization (Obs-Model-Emission Integration)
            • Agile monitoring and assessment
        • GEOSS, Collaboration, Informatics
            • The future is bright, too bright?
            • So many new things, so little time
        • DataFed
            • Continue promoting standards-based networking
            • Enable IS users to create new, actionable knowledge faster
            • Move data flow maintenance from R&D to operations
    21. 23. Integrated observation-modeling complex – R. Scheffe
        • Land AQ monitors; total column depth (through satellites); AQ model results; vertical profiles
        • Integrated Observation-Modeling
        • Optimized PM2.5, O3 characterizations
        • Health; Air management; Ecosystems
    22. 24. Pollutant Characterization, Understanding
        • Characterization: creating the best available pollutant pattern as distributed in space-time-parameter
        • Characterization is achievable by reanalysis with the ‘best available’ model and assimilated observations
        • Understanding is gained from the model processes and by applying previous/tacit knowledge
        • Goal: Pollutant Characterization and Understanding
        • Diagram labels:
            • Models; Observations; Emissions
            • Reanalysis: forward model with assimilated observations
            • Characterization
            • Data Interpretation: use of previous & tacit knowledge to explain data
            • GOAL: Knowledge Creation (characterization of pattern; understanding of processes)
    23. 25. NAAMS: National Ambient Air Monitoring Strategy and NCore … coordinated multi-pollutant real-time monitoring network
    24. 26. National Air Quality Information Integration
        • Public Information; Health/Exp. Assessment; Emissions Planning; AQ Trends and Accountability; Science Support; NAAQS
        • AQ Data Pool; National Air Quality Info Network
        • Re-examination of data access and processing systems
        • Pooling of data/info resources for re-use in multiple applications, a la GEOSS
    25. 27. Sensing Revolution Web 2.0
    26. 28. Summary
        • There is a slow ‘aligning of stars’ for integrating heterogeneous data
        • System of Systems architecture is suitable for integrating data
            • Standard data access is a key interoperability protocol
            • Heterogeneous data can be non-intrusively standardized by mediators
            • Service-based software architecture delivers tailored products to diverse uses
        • Federated data and shared web-based tools are in use
            • DataFed already includes over 100 datasets (emissions, ground, satellite)
            • The system has been applied to EPA policy, regulatory and science development
        • However,
            • DataFed is just one of many mediator nodes, and these need to be connected
            • Much more data would need to be federated
            • HTAP model-data comparison would be an attractive use case
    27. 29. DataFed Applications (2002-2007)
        • Science
            • Mystery (Nitrate?) Events
            • Data Integration (PM-Bext; NO2 Sat-Surf)
            • AQ Event Detection Algorithms
        • AQ Management
            • Exceptional Event Analysis (EPA – N. Frank)
            • Network Assessment (EPA – R. Scheffe)
            • Fire-Smoke, Global Emissions (EPA – T. Keating)
            • FASTNET, CATT Tools, S/R Analysis (RPO – R. Poirot)
        • IS Networking Infrastructure
            • NASA/ESIP Web Services, SAO (NASA – L. Friedl, K. Moe)
            • GEOSS Interoperability Demos (Wash. U)
            • HTAP Network, Integration (EPA – Keating)
    28. 30. FASTNET Report: 0409FebMystHaze (RPO – R. Poirot)
        • Mystery Winter Haze: Natural? Nitrate/Sulfate? Stagnation?
        • Contributed by the FASTNET Community, Sep. 2004
        • Correspondence to R. Husar, R. Poirot
        • Coordination support by:
            • Inter-RPO WG Fast Aerosol Sensing Tools for Natural Event Tracking, FASTNET
            • NSF Collaboration Support for Aerosol Event Analysis
            • NASA REASON Coop
            • EPA-OAQPS
        • Figure labels: AIRNOW PM25 - February; Sulfate-driven Jul-Aug peak; Feb-Mar peak, of unknown origin
    29. 31. Data Fusion: AIRNOW PM25 - ASOS Bext
        • 2004 July 20 14:00
        • Panels: July 21, 2004; July 22, 2004; July 23, 2004, each labeled AIRNOW PM25 and ASOS RHBext
    30. 32. PM Event Detection from Time Series
        • Contributed by the FASTNET Community, Sep. 2004
        • Correspondence to R. Husar, R. Poirot
        • Coordination support by:
            • Inter-RPO WG Fast Aerosol Sensing Tools for Natural Event Tracking, FASTNET
            • NSF Collaboration Support for Aerosol Event Analysis
            • NASA REASON Coop
            • EPA-OAQPS
        • Event: Deviation > x*percentile
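The slide's rule, "Event: Deviation > x*percentile", does not say how the deviation or the percentile is computed. The sketch below is one possible reading, under stated assumptions: the deviation is taken from the series median, the percentile is the nearest-rank percentile of the absolute deviations, and the names `detect_events`, `x` and `p` are hypothetical. It is not the FASTNET algorithm itself.

```python
import math
from statistics import median


def nearest_rank_percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def detect_events(series, x=2.0, p=75):
    """Return indices where (value - median) exceeds x * percentile.

    Assumption: the percentile is taken over absolute deviations from the
    median, so it acts as a robust measure of ordinary variability.
    """
    if not series:
        return []
    baseline = median(series)
    deviations = [value - baseline for value in series]
    threshold = x * nearest_rank_percentile([abs(d) for d in deviations], p)
    return [i for i, d in enumerate(deviations) if d > threshold]


# Made-up hourly PM2.5 values with one spike.
pm25 = [10, 11, 9, 10, 12, 10, 45, 11, 10, 9]
events = detect_events(pm25, x=2.0, p=75)
```

For this made-up series only the spike at index 6 is flagged. If the series is nearly constant the percentile can be zero, in which case any positive deviation counts as an event; a real implementation would probably add a minimum threshold.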
    31. 33. Speciated PM Network Assessment (EPA – R. Scheffe)
        • Speciated Data Flow and Processing (diagram labels): IMPROVE; EPA SPEC; CIRA/VIEWS Database; CIRA Tools and Processes; CAPITA/DataFed Database; DataFed Tools and Processes; Analysis Tools and Processes; Network Assessment PPT; EPA NCore Process; Evaluation, Feedback
    32. 34. Distributed Fire Data Sources (S. Falke, EPA, NASA)
        • Numerous state, regional, and national fire-related databases and online access applications exist.
        • The challenge is to bring them together, on the fly, without requiring substantial changes to the underlying systems.
        • Data sources that are not “Web-ready” also need to be accessed.
        • Sources shown: BlueSkyRAINS; GeoMAC; WFAS; USGS; NOAA; UMaryland
    33. 35. Combined Aerosol Trajectory Tool, CATT (RPO – R. Poirot)
        • AEROSOL row: Aerosol Data Collection; IMP., EPA; Aerosol Sensors; Integration; VIEWS; Integrated AerData
        • TRANSPORT row: Weather Data; Assimilate; NWS; Gridded Meteor.; Trajectory; ARL; Traject. Data
        • CATT: TrajData Cube; Aggreg. Traject.; AerData Cube; Aggreg. Aerosol; CATT-In (CAPITA); Next Process
        • Tools: Trajectory Browser; Kitty: Simple CATT; CATT Transport Analyzer
    34. 36. HTAP Data Network (EPA – T. Keating)
        • TF HTAP Workshop, Forschungszentrum Juelich, Oct 17-19, 2007, Juelich, Germany
        • Application Examples for NOx Analysis
        • Collaborators: Rudolf Husar, Washington U. St. Louis; Stefan Falke, Northrop, Wash U.; Greg Leptoukh, NASA, Goddard; Martin Schultz, FZJ, Juelich
    35. 37. GEOSS Interoperability Demos (Washington Univ.): Beijing, Barcelona, Denver
    36. 39. Origin of Fine Dust Events over the US
        • Sulfate is local, no major spikes
        • Gobi dust transport in spring
        • Sahara dust import in summer
        • Fine dust spikes over the entire US are mainly from intercontinental transport
    37. 40. Air Quality Management System: Components and Functions
        • Human agents: Data Manager, Organizer; Technical Analysts, Program Manager; Policy Analysts, Decision Maker; Public
        • Value-adding processes:
            • Organizing; Quality control; Formatting; Documenting; Displaying
            • Analyzing; Interpreting; Evaluating; Separating; Synthesizing
            • Deciding; Evaluate options; Matching goals; Compromising; Choosing
        • Decision Support System (DSS)
        • The primary purpose of data systems is to mediate between data providers and programs/projects
        • Programs perform analysis for organizations; the DSS is within programs
        • The big decisions of societal importance are made by organizations
        • (This needs more wisdom from the practitioners)
    38. 41. Flow of Data and Usage Control
        • Flows: Data; Control; Requesting Information; Providing Information
        • Diagram labels: Sensors; Acquisition processing; Obs. & Models; Domain Processing; Data Sharing; Std. Interface; Gen. Processing; Std. Interface; Info System; Negotiating Space; Reporting; Reports; Decision Support System; User Programs (NAAQS, SIPs, Forecast, GEOSS, …); User Agencies
    39. 42. DataFed Tools - Subset
        • Consoles: Data from diverse sources are displayed to create a rich context for exploration and analysis
        • CATT: Combined Aerosol Trajectory Tool for browsing back-trajectories for specified chemical conditions
        • Viewer: General-purpose spatio-temporal data browser and view editor applicable to all DataFed datasets
    40. 43. Summary
        • Global Monitoring - Modeling Revolution – ‘May you live in interesting times’
            • We are in the midst of an observational revolution (satellites, monitoring networks).
            • The global distribution and transport of some pollutants can be monitored daily
            • Global models are maturing into effective analytical and predictive tools
        • Results to Date:
            • Compelling evidence for global-scale transport of PM and Ozone
            • Qualitative evidence of ‘extra-jurisdictional’ impact on US air quality
            • Potential for quantification of natural and non-US impact
        • Such a massive job will require:
            • International, interagency, interdisciplinary collaboration
            • Open flow of data/knowledge
            • Scientific ‘value-adding chains’
    41. 44. FASTNET and DataFed
        • FASTNET (Fast Aerosol Sensing Tools for Natural Event Tracking) is an open communal information sharing facility to study aerosol events, including detection, tracking and impact on PM and haze.
        • The main asset of FASTNET is the community of data analysts, modelers and managers participating in the production of actionable knowledge from data and models.
        • The community is supported by a non-intrusive data integration infrastructure based on Internet standards (web services) and a set of web tools evolving under the federated data system, DataFed.
        • DataFed is supported by its community and is under the umbrella of the interagency Earth Science Information Partners, ESIP (NASA, NOAA and EPA).
    42. 45. Emerging Air Quality Data Flow Network
        • OGC WCS Data Access Protocol
        • GEOSS provides SOA for coupling of autonomous nodes
        • Facilitates publishing, finding and accessing data
    43. 46. Application of OGC WCS Data Access Protocol
        • Regardless of the data location, data type and format,
            • the parameter-space-time query is the same
            • the return is in a user-selectable format from the offerings
        • Query fields: Parameter (Coverage); Bounding Box (BBOX); Time Range (TIME); Out Format (FORMAT)
        • Grid: Coverage=THEEDDS.T&BBOX=-126,24,-65,52,0,0&TIME=2002-07-07/2002-07-07&FORMAT=NetCDF
        • Image: Coverage=SEAW.Refl&BBOX=-126,24,-65,52,0,0&TIME=2002-07-07/2002-07-07&FORMAT=GeoTIFF
        • Station Data: Coverage=SURF.Bext&BBOX=-126,24,-65,52,0,0&TIME=2002-07-07/2002-07-07&FORMAT=NetCDF-table
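The point of the slide above is that one parameter-space-time query shape serves grid, image and station data alike. A minimal sketch of building such query strings follows. The endpoint `http://example.org/wcs` and the function name `build_wcs_query` are placeholders, not DataFed's actual service. Only the four parameters shown on the slide are included; a complete WCS GetCoverage request would also need service, version and request parameters.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "http://example.org/wcs"


def build_wcs_query(coverage, bbox, time_range, out_format, base_url=BASE_URL):
    """Build a parameter-space-time query string in the slide's layout."""
    params = {
        "Coverage": coverage,                    # parameter
        "BBOX": ",".join(str(v) for v in bbox),  # bounding box
        "TIME": "/".join(time_range),            # start/end of time range
        "FORMAT": out_format,                    # requested output format
    }
    # Keep commas and slashes readable rather than percent-encoding them.
    return base_url + "?" + urlencode(params, safe=",/")


bbox = (-126, 24, -65, 52, 0, 0)
day = ("2002-07-07", "2002-07-07")

# Same query shape for three different data types; only names and format change.
grid_url = build_wcs_query("THEEDDS.T", bbox, day, "NetCDF")
image_url = build_wcs_query("SEAW.Refl", bbox, day, "GeoTIFF")
station_url = build_wcs_query("SURF.Bext", bbox, day, "NetCDF-table")
```

Each call differs only in the coverage name and the output format, which is the uniformity the slide emphasizes.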
    44. 47. Web 1.0 -> Web 2.0 Transformation
        • The Web is being transformed: it is becoming more participatory
        • Its content is increasingly generated and distributed by individuals
        • See the explosive growth of wikis, picture-sharing, blogs, Facebook
        • This architectural, technological and cultural change is Web 2.0
        • Web 2.0 is good for the Atmospheric Science community since it allows
            • Better harvesting of current knowledge
            • Collaborative creation of new knowledge
    45. 48. Distribution of Responsibility
        • Distributed Responsibility in DataFed
        • The data lies with the data providers
        • The wrappers and mediators with the DataFed community
        • Application programs with the end user
        • Data discovery with data & service registries
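The division of labor on the slide above (providers keep their data, wrappers and mediators translate it, applications query one interface) can be sketched as a small wrapper/mediator pattern. Everything in this sketch is hypothetical: the class names, the two provider record layouts and the sample values are invented for illustration and do not describe DataFed's actual code.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple

BBox = Tuple[float, float, float, float]  # west, south, east, north


@dataclass(frozen=True)
class Observation:
    """Common record that every wrapper translates native data into."""
    parameter: str
    lat: float
    lon: float
    date: str  # ISO date, e.g. "2002-07-07"
    value: float


class Wrapper:
    """Adapts one provider's native records without changing the provider."""

    def __init__(self, records: Iterable[dict],
                 translate: Callable[[dict], Observation]):
        self._records = list(records)
        self._translate = translate

    def query(self, parameter: str, bbox: BBox, date: str) -> List[Observation]:
        west, south, east, north = bbox
        translated = (self._translate(raw) for raw in self._records)
        return [o for o in translated
                if o.parameter == parameter and o.date == date
                and west <= o.lon <= east and south <= o.lat <= north]


class Mediator:
    """Sends one uniform query to every registered wrapper and merges results."""

    def __init__(self, wrappers: Dict[str, Wrapper]):
        self._wrappers = wrappers

    def query(self, parameter: str, bbox: BBox, date: str) -> List[Observation]:
        merged: List[Observation] = []
        for name in sorted(self._wrappers):
            merged.extend(self._wrappers[name].query(parameter, bbox, date))
        return merged


# Two made-up providers with different native layouts.
network_a = [{"param": "PM25", "latitude": 38.6, "longitude": -90.2,
              "day": "2002-07-07", "conc": 21.5}]
network_b = [{"species": "pm2.5", "y": 40.7, "x": -74.0,
              "t": "20020707", "val": 30.1}]


def from_a(r: dict) -> Observation:
    return Observation(r["param"], r["latitude"], r["longitude"], r["day"], r["conc"])


def from_b(r: dict) -> Observation:
    t = r["t"]
    name = "PM25" if r["species"] == "pm2.5" else r["species"].upper()
    return Observation(name, r["y"], r["x"], f"{t[:4]}-{t[4:6]}-{t[6:]}", r["val"])


mediator = Mediator({"a": Wrapper(network_a, from_a),
                     "b": Wrapper(network_b, from_b)})
hits = mediator.query("PM25", (-126, 24, -65, 52), "2002-07-07")
```

The provider records are never modified; all translation lives in the wrapper functions, which is the sense in which the integration is non-intrusive.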
    46. 49. The Information Interoperability Stack
    47. 50. Imagine… More Shared Obs & Models… At Your Fingertips or on Google Earth
        • 2007: Global Data & Models
        • 2007++: More Global Data & Models
    48. 51. Regional Haze Rule: Natural Aerosol
        • Looking ahead to reach natural conditions
        • … in 60+ years!!!
    49. 52. Asian Dust Cloud over N. America
        • On April 27, 1998 the dust cloud arrived in North America.
        • Regional average PM10 concentrations increased to 65 µg/m³
        • In Washington State, PM10 concentrations exceeded 100 µg/m³
        • Figure labels: Asian Dust; 100 µg/m³; Hourly PM10
    50. 53. Aircraft Detection of Siberian Forest Smoke near Seattle, WA (Jaffe et al., 2003)