HDF5 iRODS
 


Numerous scientific teams use the HDF5 format to store very large datasets. Efficient use of this data in a distributed environment depends on client applications being able to read any subset of the data without transferring the entire file to the local machine. The goal of the HDF5-iRODS Project was to develop an HDF5-iRODS module for the iRODS datagrid server that supported this capability, and to apply the technology to an NCSA/SDSC Strategic Applications Program (SAP) project, FLASH.

A joint team from The HDF Group (representing NCSA) and the SDSC SRB group collaborated to accomplish the project goal. The team implemented five HDF5 micro-services on the iRODS server and developed an iRODS FLASH slice client application. The client implementation also includes a JNI interface that allows HDFView, a standard tool for browsing HDF5 files, to access HDF5 files stored remotely in iRODS. Finally, three new collection client/server calls were added to the iRODS APIs, making it easier for users to query the content of an iRODS collection.

HDF5 iRODS Presentation Transcript

  • 1. HDF5-iRODS. Peter Cao (The HDF Group), Mike Wan (San Diego Supercomputer Center). HDF and HDF-EOS Workshop XII, October 16, 2008. Slide footer: October 15-17, 2008, HDF and HDF-EOS Workshop XII, Denver, CO.
  • 2. Imagine: 100 frames x 1 GB = 100 GB. [Diagram labels: HPC, HPSS, DB, 1 GB]
  • 3. Outline: HDF5-iRODS module; applications; demo (if time permits).
  • 4. What is iRODS? The name stands for integrated Rule-Oriented Data System. It was developed by the Storage Resource Broker (SRB) team at the San Diego Supercomputer Center (SDSC). It is a data grid software system that enables a customizable architecture for sharing data distributed across heterogeneous resources.
  • 5. What is iRODS? [Diagram labels: Distributed Storage, Database System, Rule System] For more information and downloads, visit www.irods.org.
  • 6. Motivation. iRODS: distributed data system; indexing and searching; access control, etc. HDF5: large and diverse data; high-performance I/O; subsetting, etc. Together: a high-performance distributed data system.
  • 7. Whole File Access. Client: "I need to see the eye of Hurricane Bob!" Request to server: "Get the file." The entire HDF5 file is transferred: large transfer, slow!
  • 8. HDF5 Object or Subset Level Access. Client: "I need to see the eye of Hurricane Bob!" Request to server: "Get me the eye of Hurricane Bob." Only the requested part of the HDF5 file is transferred: small transfer, fast!
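
The difference between slides 7 and 8 can be sketched with a rectangular (hyperslab-style) selection. This is only an illustration using the Python standard library: the in-memory array stands in for an HDF5 dataset on the server, and `make_dataset`, `read_subset`, and the 512 x 512 shape are hypothetical names and values chosen for the example, not part of the HDF5 or iRODS APIs.

```python
from array import array


def make_dataset(rows, cols):
    """Build a row-major 2D dataset of doubles as a flat array (stand-in for an HDF5 dataset)."""
    return array("d", (float(r * cols + c) for r in range(rows) for c in range(cols)))


def read_subset(data, shape, start, count):
    """Return the rectangular block that begins at `start` and spans `count` cells per dimension."""
    rows, cols = shape
    r0, c0 = start
    nr, nc = count
    if r0 < 0 or c0 < 0 or r0 + nr > rows or c0 + nc > cols:
        raise ValueError("selection falls outside the dataset")
    block = array("d")
    for r in range(r0, r0 + nr):
        offset = r * cols + c0          # first selected cell in this row
        block.extend(data[offset:offset + nc])
    return block


shape = (512, 512)                      # hypothetical dataset shape
dataset = make_dataset(*shape)

# "Get me the eye of Hurricane Bob": a 16 x 16 window instead of the whole dataset.
eye = read_subset(dataset, shape, start=(200, 200), count=(16, 16))

whole_bytes = len(dataset) * dataset.itemsize
subset_bytes = len(eye) * eye.itemsize
print(f"whole dataset: {whole_bytes} bytes, subset: {subset_bytes} bytes")
```

Only the selected block would need to cross the network, which is the point of object or subset level access.
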
  • 9. HDF5-iRODS Module. [Diagram labels: Distributed Storage, Database System, Rule System, HDF5 iRODS Module, Micro-services]
  • 10. HDF5-iRODS Data Flow. [Diagram] Client side: HDF5 object or subset (File, Group, Dataset, Subset of Dataset, Attribute), packed into or unpacked from an iRODS message. Server side: iRODS message (pack/unpack), HDF5 object or subset (File, Group, Dataset, Subset of Dataset, Attribute), HDF5 library, HDF5 file.
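
The pack/unpack step on slide 10 can be illustrated with a small binary message. The layout below (one opcode byte, one rank byte, then start and count values as unsigned 64-bit integers in network byte order) is an assumption made up for this sketch; it is not the actual iRODS message format, and `READ_DATASET`, `pack_request`, and `unpack_request` are hypothetical names.

```python
import struct

READ_DATASET = 3  # hypothetical opcode, not an iRODS constant


def pack_request(opcode, start, count):
    """Client side: serialize a subset request into bytes."""
    if len(start) != len(count):
        raise ValueError("start and count must have the same rank")
    rank = len(start)
    # "!" selects network byte order with no padding between fields.
    return struct.pack(f"!BB{2 * rank}Q", opcode, rank, *start, *count)


def unpack_request(message):
    """Server side: recover the opcode, start, and count from the bytes."""
    opcode, rank = struct.unpack_from("!BB", message, 0)
    values = struct.unpack_from(f"!{2 * rank}Q", message, 2)
    return opcode, values[:rank], values[rank:]


message = pack_request(READ_DATASET, start=(200, 200), count=(16, 16))
opcode, start, count = unpack_request(message)
print(len(message), opcode, start, count)
```

The same idea applies in the other direction: the server would pack the selected data and the client would unpack it into an HDF5 object.
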
  • 11. New iRODS Micro-services. Five iRODS micro-services: msiH5File_open; msiH5File_close; msiH5Dataset_read (reads an entire dataset or a subset of a dataset); msiH5Dataset_read_attribute; msiH5Group_read_attribute. [Diagram labels: Rule Engine, msiH5Dataset_read, H5Dataset.read(), File]
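
As a rough mental model of slide 11, a rule engine can be thought of as looking up a micro-service by name and calling a handler. The toy class below is a sketch under that assumption only: the three micro-service names come from the slide, but `ToyH5Server`, its argument lists, and its in-memory "files" are hypothetical and do not reflect the real iRODS micro-service signatures.

```python
class ToyH5Server:
    """Toy dispatcher: maps micro-service names to handlers over in-memory 'files'."""

    def __init__(self, files):
        self._files = files        # path -> {dataset name -> list of values}
        self._open = {}            # handle -> path
        self._next_handle = 1
        self._services = {
            "msiH5File_open": self._file_open,
            "msiH5File_close": self._file_close,
            "msiH5Dataset_read": self._dataset_read,
        }

    def invoke(self, name, *args):
        """Look up a micro-service by name and run it, as a rule engine might."""
        if name not in self._services:
            raise ValueError(f"unknown micro-service: {name}")
        return self._services[name](*args)

    def _file_open(self, path):
        if path not in self._files:
            raise FileNotFoundError(path)
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = path
        return handle

    def _file_close(self, handle):
        del self._open[handle]

    def _dataset_read(self, handle, dataset, start=None, count=None):
        """Return the whole dataset, or `count` values beginning at `start`."""
        values = self._files[self._open[handle]][dataset]
        if start is None:
            return list(values)
        return list(values[start:start + count])


server = ToyH5Server({"/zone/run.h5": {"density": list(range(100))}})
handle = server.invoke("msiH5File_open", "/zone/run.h5")
subset = server.invoke("msiH5Dataset_read", handle, "density", 10, 5)
server.invoke("msiH5File_close", handle)
print(subset)
```

The open, read, close sequence mirrors the order in which a client would be expected to use the micro-services listed on the slide.
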
  • 12. HDF5-Enabled iRODS Server. Requirements: HDF5 library; other external libraries (SZIP, ZLIB); iRODS version 1.1 or later from https://www.irods.org/index.php/Downloads/. Follow the README instructions at module/hdf5.
  • 13. Client Application Requirements: HDF5 object header files and client handlers; iRODS client library and header files; HDF5-iRODS JNI (for Java applications only); a $HOME/.irods/.irodsEnv file with entries such as: irodsHost 'kagiso.hdfgroup.uiuc.edu', irodsPort 1247, irodsUserName 'rods', … For more information and downloads, visit http://www.hdfgroup.org/projects/irods.
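
The three .irodsEnv entries shown on slide 13 follow a simple "key value" layout with optional single quotes. A minimal sketch of reading that layout is given below; it assumes one entry per line and "#" comments, and `parse_irods_env` is a hypothetical helper, not part of the iRODS client library. Entries elided on the slide ("…") are not filled in.

```python
import shlex

# The three entries visible on the slide; the remaining entries are elided there.
SAMPLE_ENV = """\
irodsHost 'kagiso.hdfgroup.uiuc.edu'
irodsPort 1247
irodsUserName 'rods'
"""


def parse_irods_env(text):
    """Parse 'key value' lines; quotes around a value are removed by shlex."""
    settings = {}
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)
        if not tokens:
            continue                      # blank or comment-only line
        if len(tokens) != 2:
            raise ValueError(f"expected 'key value', got: {line!r}")
        key, value = tokens
        settings[key] = value
    return settings


settings = parse_irods_env(SAMPLE_ENV)
port = int(settings["irodsPort"])
print(settings["irodsHost"], port, settings["irodsUserName"])
```
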
  • 14. Example: HDFView. [Diagram labels: Client Application, HDF5-Enabled iRODS Server]
  • 15. Example: HDFView
  • 16. Example: islice. FLASH is an adaptive-mesh simulation code for astrophysical hydrodynamics problems. islice is a command-line tool to visualize data produced by FLASH simulation runs. The data is huge (~100 GB); the interesting part is small. Adaptive mesh: 16*16*16*47531. For more information, visit flash.uchicago.edu.
  • 17. Example: islice. Command: "./islice -t flash.pal -m rpv1 -p 2 rundir_055_8km_hdf5_plt_cnt_0424". [Figure: a slice from a 3D simulation of the detonation of a white dwarf star; labels: Star, Ash Flow, Collision focus point, Breakout point, 2048*2048*8 (32 MB)]
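
The sizes quoted on slides 16 and 17 can be checked with a little arithmetic. The sketch below assumes that the trailing 8 in 2048*2048*8 is the number of bytes per value (double precision) and that "~100 GB" means 100 x 10^9 bytes; neither is stated on the slides.

```python
BYTES_PER_VALUE = 8                 # assumption: double-precision values

# Slide 17: one 2048 x 2048 slice.
slice_bytes = 2048 * 2048 * BYTES_PER_VALUE
slice_mib = slice_bytes / 2**20     # size in units of 2**20 bytes

# Slide 16: adaptive mesh of 16 x 16 x 16 cells in each of 47531 blocks.
mesh_cells = 16 * 16 * 16 * 47531
mesh_bytes_per_variable = mesh_cells * BYTES_PER_VALUE

# Fraction of a ~100 GB data set that one slice represents.
total_bytes = 100 * 10**9           # assumption: decimal gigabytes
fraction = slice_bytes / total_bytes

print(f"slice: {slice_bytes} bytes ({slice_mib} MiB)")
print(f"mesh cells: {mesh_cells}, bytes per variable: {mesh_bytes_per_variable}")
print(f"slice / total: {fraction:.6%}")
```

Under these assumptions the slice is a small fraction of a percent of the full data, which is why fetching only the slice is so much cheaper than transferring whole files.
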
  • 18. Thank You! This project is sponsored by CIP/NLADR, NSF PACI Project in Support of the Collaboration between the National Center for Supercomputing Applications (NCSA) and the San Diego Supercomputer Center (SDSC). The project is managed under the CyberInfrastructure Partnership (CIP), a joint effort led by NCSA and SDSC to help scientists and engineers take full advantage of the high-end CyberInfrastructure resources funded by the National Science Foundation (NSF).
  • 19. Questions/comments?