2008-05-27: Spring AGU, DataFed Best Practices

Published in: Technology, Education
  • This talk is about the federated data system DataFed, in particular our experiences over the past four years with data homogenization and networking.

    1. The Federated Data System DataFed: Experiences in Data Homogenization and Networking
       • R.B. Husar, K. Hoijarvi, S. R. Falke, E. M. Robinson, Washington University, St. Louis
       • G. Leptoukh, NASA GSFC
       Spring AGU, May 29, 2008, Ft. Lauderdale
    2. DataFed in a Nutshell:
       • A federation of autonomous, distributed data providers
       • Performs non-intrusive wrapping of data into web services
       • Provides service-based analysis services and tools
       General Experience with DataFed:
       • It is an agile virtual data system that can deliver info products to diverse users
       • Third-party mediation can homogenize distributed data on the fly
       • Since 2005, DataFed has been used by EPA and in research
       DataFed Motivated by GEOSS:
       • DataFed development is guided by the meme of GEOSS
    3. Five practices for agile, seamless data federation:
       • Space-Time Query for standardized access to all data (WCS)
       • Data Wrappers for turning heterogeneous data into web services
       • Data Mediators for transforming data into ‘Views’
       • Mashups for connecting autonomous applications
       • DataSpaces for shared metadata by the users, for the users
    4. Parameter-Space-Time Query Using OGC WCS Data Access Protocol
       • Regardless of the data location, data type and format, the parameter-space-time query is the same
       • The return is in a user-selectable format from the offerings
       Grid: Coverage=THEEDDS.T&BBOX=-126,24,-65,52,0,0&TIME=2002-07-07/2002-07-07&FORMAT=NetCDF
       Image: Coverage=SEAW.Refl&BBOX=-126,24,-65,52,0,0&TIME=2002-07-07/2002-07-07&FORMAT=GeoTIFF
       Station Data: Coverage=SURF.Bext&BBOX=-126,24,-65,52,0,0&TIME=2002-07-07/2002-07-07&FORMAT=NetCDF-table
       Query fields: Parameter (Coverage), Bounding Box (BBOX), Time Range (TIME), Out Format (FORMAT)
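The point of slide 4, that one query shape serves grid, image and station data alike, can be illustrated with a minimal Python sketch. It assembles only the four fields shown on the slide (Coverage, BBOX, TIME, FORMAT). The function name `pst_query` is hypothetical, and a complete WCS GetCoverage request normally carries further parameters (such as the service, version and request type) that the slide does not show and that are omitted here.

```python
from urllib.parse import urlencode


def pst_query(coverage, bbox, time_range, out_format):
    """Build a parameter-space-time query string (hypothetical helper).

    Only the four fields shown on the slide are included.
    """
    params = {
        "Coverage": coverage,                     # parameter, e.g. "SURF.Bext"
        "BBOX": ",".join(str(v) for v in bbox),   # bounding box values
        "TIME": "/".join(time_range),             # start/end of the time range
        "FORMAT": out_format,                     # requested output format
    }
    # Keep "," and "/" unescaped so the result matches the slide's notation.
    return urlencode(params, safe=",/")


# Values copied from the slide.
BBOX = (-126, 24, -65, 52, 0, 0)
DAY = ("2002-07-07", "2002-07-07")

grid_query = pst_query("THEEDDS.T", BBOX, DAY, "NetCDF")
image_query = pst_query("SEAW.Refl", BBOX, DAY, "GeoTIFF")
station_query = pst_query("SURF.Bext", BBOX, DAY, "NetCDF-table")
```

Only the coverage name and the output format change between the three calls; the space and time arguments are identical, which is the homogeneity the slide describes.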
    5. Third Party Data Wrappers
       • DataFed wrappers are non-intrusive, third party
       • Heterogeneous input data >>> Homogeneous (WCS) Query
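The wrapper idea on slide 5 can be sketched in a few lines of Python. This is not DataFed's implementation; every class, field and column name below is hypothetical. Two providers deliver data in different native shapes, each gets a small third-party wrapper that leaves the provider's data untouched, and a single space-time `query` function then works against either one.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Obs:
    """Common record shape that every wrapper emits (hypothetical)."""
    lon: float
    lat: float
    day: str      # ISO date string, e.g. "2002-07-07"
    value: float


class CsvWrapper:
    """Wraps a provider that delivers CSV text with columns lon,lat,date,val."""

    def __init__(self, text):
        self._text = text

    def records(self):
        for row in csv.DictReader(io.StringIO(self._text)):
            yield Obs(float(row["lon"]), float(row["lat"]),
                      row["date"], float(row["val"]))


class DictWrapper:
    """Wraps a provider that delivers dicts with keys x, y, t, v."""

    def __init__(self, rows):
        self._rows = rows

    def records(self):
        for r in self._rows:
            yield Obs(r["x"], r["y"], r["t"], r["v"])


def query(wrapper, bbox, time_range):
    """Same space-time filter for any wrapped source."""
    west, south, east, north = bbox
    start, end = time_range
    # ISO date strings compare correctly as plain strings.
    return [o for o in wrapper.records()
            if west <= o.lon <= east
            and south <= o.lat <= north
            and start <= o.day <= end]


csv_source = CsvWrapper(
    "lon,lat,date,val\n"
    "-90.2,38.6,2002-07-07,12.5\n"
    "10.0,50.0,2002-07-07,3.0\n"
)
dict_source = DictWrapper([{"x": -80.1, "y": 26.1, "t": "2002-07-07", "v": 7.0}])

box = (-126, 24, -65, 52)
day = ("2002-07-07", "2002-07-07")
csv_hits = query(csv_source, box, day)
dict_hits = query(dict_source, box, day)
```

The caller never sees the CSV columns or the dict keys; only the wrappers know the native formats, which is what makes the approach non-intrusive for the provider.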
    6. Mediated User-Data Interface
       • Mediator turns data into Views
       • Mediated Integration is a flexible design pattern for System of Systems
       • Client-Server design is demanding: User carries the burden of integration
       [Diagram labels: Query, Data, Views]
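The contrast on slide 6 between client-server and mediated integration can be illustrated with a small, purely hypothetical Python sketch. In the client-server case the user would call each source and merge the results by hand; here a `Mediator` object (an invented name, not a DataFed class) takes one query, fans it out to all registered sources, and hands back ready-made views.

```python
from statistics import mean


class Mediator:
    """Sits between the user and the data sources (hypothetical sketch)."""

    def __init__(self, sources):
        # sources: dataset name -> callable(day) returning (station, value) pairs
        self._sources = sources

    def table_view(self, day):
        """One merged, sorted table across all sources."""
        return sorted(
            (name, station, value)
            for name, fetch in self._sources.items()
            for station, value in fetch(day)
        )

    def summary_view(self, day):
        """Mean value per dataset, skipping datasets with no data."""
        summary = {}
        for name, fetch in self._sources.items():
            values = [value for _, value in fetch(day)]
            if values:
                summary[name] = mean(values)
        return summary


# Stand-in sources with made-up values; dataset names reuse the slide-4 prefixes.
mediator = Mediator({
    "SURF": lambda day: [("A", 2.0), ("B", 4.0)],
    "SEAW": lambda day: [("C", 1.0)],
})

table = mediator.table_view("2002-07-07")
summary = mediator.summary_view("2002-07-07")
```

The user issues one query and receives views; the work of contacting and combining the sources stays inside the mediator.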
    7. Mashups: Loose Coupling of Autonomous Applications
       DataFed – Wiki – GoogleEarth
       [Diagram labels: SOAP, RDF, Mashup, Workflow]
    8. DataSpaces for Datasets
       [Diagram: GEOSS Core, Service Offerors and Users. Nodes: GEOSS Comp. Registry, GEOSS Clearinghouse, Standards/SIF Registry, Community AQ Portal, Community AQ Catalog, Catalog User, Service Offeror, Service Workflow, Data Analyst, Policy Analyst, Decision Maker. Link labels: registers, publishes, provides, references, harvests, searches, invokes, extracts, composes, visualizes, reports to, informs, find, links to. Caption: Community DataSpaces: Shared Metadata by the Users, for the Users. Adopted from Percivall, Feb 2008, by R. Husar, March 2008.]
    9. [Repeat of the slide 8 diagram with added labels: views, report]
    10. [Repeat of the slide 8 diagram with added labels: views, report]
    11. Wiki ‘DataSpaces’: Creating and Sharing Metadata
       Community Catalog: Find Dataset, Describe Dataset, Discuss Dataset
       ESIP Communal Wiki
       • Semantic Wiki: Structured (RDF) and Unstructured Content
       • Open, Standard Metadata - RDF
       • Ready for Export/Harvesting by Registries, Catalogs
    12. Sharing Best Practices: GEO Best Practice Wiki
    13. Developments and Challenges:
       Favorable Engineering Developments:
       • A Core network for Air Quality data sharing is emerging
       • Standards are available for sharing previously unstructured data
       • Third-party mediation can homogenize the distributed data
       • Agile SOA-based systems can deliver info products to diverse users
       • Since 2005, one such IS, DataFed, has been used by EPA and in research
       However:
       • Service interfaces are still uneven; networks are still fragile
       • The utility of social networking in science is not understood
       • Users cannot provide feedback to upstream providers
       • Many cultural, legal and other barriers hamper progress
    14. ESIP Coordination Application
