HDF-VFS

Enabling HDF5 files to function through the operating system as traditional filesystems offers a number of intriguing possibilities. The foremost is to allow non-HDF application software to interact transparently with HDF5 files and datasets, improving interoperability and performance across user formats and user communities. A second is to use the HDF5 filesystem as a project management tool for organizing disparate datasets into a coherent and comprehensive data compilation, making long-term archiving more tenable through systematic provenance management. A third is to insert a new type of community filter between application software and the operating system, allowing user communities to create standardized community data structures. Because of the robust API capabilities of HDF5, any derived filesystem can have features beyond traditional filesystem designs: dedicated virtual filenames that perform unique functions, such as multiple simultaneous filenames within the filesystem or automated concatenation of datasets into a single visible filename. Described are prototypes, proposed design strategies, and operational objectives.
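As a rough illustration of the idea, the sketch below treats groups as directories and datasets as files, and adds a virtual read that concatenates several datasets into one byte stream. It is not the HDF-VFS prototype and does not use the HDF5 API: a nested Python dict stands in for an HDF5 file, and the names `resolve`, `listdir`, `read_file`, and `read_concatenated` are hypothetical.

```python
from typing import Dict, List, Union

# Stand-in for an HDF5 file (assumption for illustration only):
# a dict plays the role of a group, a bytes value plays the role of a dataset.
Node = Union[Dict[str, "Node"], bytes]


def resolve(root: Node, path: str) -> Node:
    """Walk a POSIX-style path through the group hierarchy."""
    node = root
    for part in [p for p in path.split("/") if p]:
        if not isinstance(node, dict) or part not in node:
            raise FileNotFoundError(path)
        node = node[part]
    return node


def listdir(root: Node, path: str) -> List[str]:
    """List a group as if it were a directory (names sorted)."""
    node = resolve(root, path)
    if not isinstance(node, dict):
        raise NotADirectoryError(path)
    return sorted(node)


def read_file(root: Node, path: str) -> bytes:
    """Read a dataset as if it were a regular file."""
    node = resolve(root, path)
    if isinstance(node, dict):
        raise IsADirectoryError(path)
    return node


def read_concatenated(root: Node, paths: List[str]) -> bytes:
    """Present several datasets as one visible file by joining their bytes in order."""
    return b"".join(read_file(root, p) for p in paths)


example: Node = {
    "project": {
        "frames": {"frame_000.raw": b"AAAA", "frame_001.raw": b"BBBB"},
        "notes.txt": b"hello",
    }
}

print(listdir(example, "/project"))
print(read_concatenated(example, [
    "/project/frames/frame_000.raw",
    "/project/frames/frame_001.raw",
]))
```

A real implementation would sit behind an operating-system filesystem interface and read from an actual HDF5 file; the dict is used here only to keep the sketch self-contained.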

Published in: Technology

  1. HDF-VFS

  2. MOTIVATION
 •  advanced 3D spatial EM imagery (1998)
 •  pedestrian MPEG video imagery (2004)
 •  all project and animation intermediate files (2008)

  3. REQUIREMENTS
 •  HPC
 •  Interoperability
 •  Preservation
 •  Legacy transparency

  4. MECHANISMS
 •  HDF-VFS
 •  Performance
 •  Provenance
 •  Community extensions
 •  Automated project management
 •  Registries (CPID, communities, s/w)

  5. ESTABLISHED COMMUNITIES
 •  EOS
 •  NeXus
 •  Matlab
 •  IDL
 •  netCDF

  6. EVOLVING COMMUNITIES
 •  EM
 •  Bio-VIZ
 •  X-ray crystallography
 •  Optical microscopy
 •  Astrophysics
 •  Data storage
 •  Astronomy
 •  Genomics

  7. OUTREACH
 •  ACM
 •  NIST
 •  NSF PROPOSALS
 •  IWGDD

  8. TRIGGER
 •  Animation production
 •  EM databank/PDB/EBI/NSF

  9. The PLAN
 •  Phase one
     <HDF-VFS attribute>
 •  Phase two
     /groups/
     {datasets}
     -  Performance
     -  Provenance
     -  Community filters


  10. <hdf-vfs>
 •  Version ID
 •  Visible flag
 •  Initial UUID
 •  Current UUID
 •  Image UUID
 •  Created timestamp
 •  Modified timestamp
 •  Accessed timestamp
 •  VFS file/directory name
 •  Concatenation link
 •  Permissions (read, permanent read, write, non-extensible write)
 •  Group to dataset inheritance
 •  Dataset size
 •  Provenance
 •  Performance
 •  Filter
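One way to picture the fields listed on this slide is as a single record attached to each group or dataset. The sketch below is illustrative only: the slide gives field names but no types or semantics, so every type, default, and behaviour here is an assumption. In particular, the rules that the current UUID is regenerated on each write and that "non-extensible write" permits overwriting but not growing a dataset are guesses, and the class and method names are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Permission(Enum):
    # The four permission levels named on the slide.
    READ = "read"
    PERMANENT_READ = "permanent read"
    WRITE = "write"
    NON_EXTENSIBLE_WRITE = "non-extensible write"


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class HdfVfsAttribute:
    """Hypothetical in-memory form of the <hdf-vfs> attribute (types are assumed)."""
    vfs_name: str                                   # VFS file/directory name
    version_id: int = 1
    visible: bool = True
    initial_uuid: uuid.UUID = field(default_factory=uuid.uuid4)
    current_uuid: Optional[uuid.UUID] = None
    image_uuid: Optional[uuid.UUID] = None
    created: datetime = field(default_factory=_now)
    modified: datetime = field(default_factory=_now)
    accessed: datetime = field(default_factory=_now)
    concatenation_link: Optional[str] = None        # path of the next dataset, if any
    permission: Permission = Permission.READ
    inherit_from_group: bool = False                # group to dataset inheritance
    dataset_size: int = 0
    provenance: Optional[str] = None
    performance: Optional[str] = None
    filter_name: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: a new object starts with current UUID equal to initial UUID.
        if self.current_uuid is None:
            self.current_uuid = self.initial_uuid

    def can_write(self) -> bool:
        return self.permission in (Permission.WRITE, Permission.NON_EXTENSIBLE_WRITE)

    def record_write(self, new_size: int) -> None:
        """Update bookkeeping after a write; refuse writes the permission forbids."""
        if not self.can_write():
            raise PermissionError(f"{self.vfs_name} is read-only")
        # Assumed meaning of "non-extensible write": the dataset may not grow.
        if (self.permission is Permission.NON_EXTENSIBLE_WRITE
                and new_size > self.dataset_size):
            raise PermissionError(f"{self.vfs_name} cannot be extended")
        self.dataset_size = new_size
        self.current_uuid = uuid.uuid4()            # assumed: new UUID per write
        self.modified = _now()


attr = HdfVfsAttribute("frame_000.raw",
                       permission=Permission.NON_EXTENSIBLE_WRITE,
                       dataset_size=4)
attr.record_write(4)    # same size: allowed under the assumed rule
```

Keeping the initial UUID fixed while the current UUID changes would let a reader trace an object back to its origin, which is one plausible reading of why the slide lists both.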


  11. Sleeping Gorilla issues
 •  Design, development, maintenance, long-term support
 •  Location and control of Code (sourceforge, google, NCMI, HDFgroup)
 •  HDFGroup overlap
 •  HDF file inside an HDF file
 •  Imagery RDF design
 •  Provenance RDF design
 •  Performance RDF design
 •  Filter & community registries
 •  Microsoft

  12. Suggestions, comments
 Matthew Dougherty
 matthewd@bcm.edu

