SC13 BoF: RDA and HPC

A 5-minute presentation given during the SC13 Birds of a Feather session on the relationship between the Research Data Alliance and High Performance Computing.

    Presentation Transcript

    • Research Data Alliance (RDA) for HPC. SC13 Birds of a Feather session, November 20, 2013, 17:30-19:00 MST, Colorado Convention Center, Denver, Colorado. Contribution of John W. Cobb, Oak Ridge National Lab., DataONE Project.
    • Why am I here? From what perspectives do I speak?
        •  Discipline scientist
        •  HPC application evangelist
        •  Cyberinfrastructure leverage for experimental facilities
        •  Cyberinfrastructure/HPC center operations
        •  Cyberinfrastructure efforts for data-intensive science
      Without data there is no science.
    • HPC centers and archives have different service objectives
        •  HPC centers: cycles not used are lost
        •  Archives: data management involves a long-term commitment of resources
    • Comparing HPC centers and data archives
      Simulations:
        •  Generate data at will
        •  Can programmatically control data quality
        •  Can be reproduced more easily
        •  ==> Can be copious
        •  Weaker tradition of metadata and data quality
      Experiment/Observation:
        •  Collect data from physical events
        •  Data quality may be limited by collection methods
        •  May be difficult, expensive, or impossible to reproduce
        •  ==> May be more limited
        •  Long-term focus on metadata and data quality
    • Consequently, different challenges
        •  HPC centers excel at:
            –  Volume and velocity
            –  Analysis at scale
        •  Archives excel at:
            –  Variety
            –  Metadata capture
            –  Data quality
    • Convergence of data and HPC: some DataONE experience
    • eBird pilot project: exploration and visualization
      Diverse bird observations and environmental data from 300,00 locations in the US integrated and analyzed using High Performance Computing resources.
      Spatio-Temporal Exploratory Model identifies factors affecting patterns of migration.
        •  Examine patterns of migration
        •  Infer how climate change may affect bird migration
      [Figure labels: Land Cover; Meteorology; MODIS – Remote sensing data. Model results: Occurrence of Indigo Bunting (2008), panels for Jan, Apr, Jun, Sep, Dec.]
    • Exploration, Visualization, and Analysis
        •  Workflows for hypothesis development, testing, and exploration
        •  Interactive maps and plots for multidimensional data exploration and analysis
      [Figure labels: Benchmark Observations; Terrestrial Biosphere Model Output; Model Structure Information; Provenance Framework.]
    • DataONE experience
        •  Cyberinfrastructure (CI) created: interoperable data service functional interfaces
        •  4 reference interface implementations completed
        •  8 client-side “investigator toolkit” tools released, 4 more in development
        •  16 collaborating Member Node repositories (internationally)
        •  > 100,000 data objects published
        •  Conducted 81 workshops on data management
        •  Published 65 data management “best practices”
        •  Completed several baseline and follow-up surveys on the state of data management with scientists, libraries, librarians, …
    • DataONE experience (cont.): About half the effort has been on education, training, and outreach about data management practices.
    • “Data = Human” - Genevieve Bell, SC13 Keynote