The HDF Group

Parallel HDF5 Developments
Quincey Koziol

The HDF Group
koziol@hdfgroup.org

Copyright © 2010 The HDF Group. All Rights Reserved
Parallel I/O in HDF5
• Goal is to be invisible: get the same performance with HDF5 as with MPI I/O
• Project with LBNL/NERSC to improve HDF5 performance in parallel applications:
  • 6-12x performance improvements on various applications (so far)
Parallel I/O in HDF5
• Up to 12 GB/s to a shared file (out of 15 GB/s) on NERSC's Franklin system
Recent Improvements to Parallel HDF5
Recent Parallel I/O Improvements
• Reduce number of file truncation operations
• Distribute metadata I/O over all processes
• Detect same "shape" of selection in more cases, allowing the optimized I/O path to be taken more often
• Many other, smaller improvements to library algorithms for faster/better use of MPI
Reduced File Truncations
• HDF5 library was very conservative about truncating the file when H5Fflush was called
• However, file truncation is very expensive in parallel
• Library modified to defer truncation until the file is closed
Distributed Metadata Writes
• HDF5 caches metadata internally, to improve both read and write performance
• Historically, process 0 writes all dirtied metadata to the HDF5 file, while other processes wait
• Changed to distribute ranges of metadata within the file across all processes
• Results in ~10x improvement in I/O for Vorpal (see next slide)
Distributed Metadata Writes
• I/O trace before changes: note the long sequence of I/O from process 0
• I/O trace after changes: note the distribution of I/O across all processes, taking much less time
Improved Selection Matching
• When HDF5 performs I/O between regions in memory and the file, it compares the regions to see if the application's buffer can be used directly for I/O
• Historically, this algorithm couldn't detect that regions with the same shape, but embedded in arrays of different dimensionality, were the same
  • For example, a 10x10 region in a 2-D array should compare equal to the equivalent 1x10x10 region in a 3-D array
• Changed to detect same-shaped regions in arbitrary source and destination buffer array dimensions, allowing I/O from the application's buffer in more circumstances
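The comparison described above amounts to checking that two extents match once dimensions of size 1 are ignored. A minimal, self-contained sketch of that logic (`same_shape` is a hypothetical helper for illustration, not the library's actual internal routine):

```c
#include <stdlib.h>

/* Return 1 if the two extents describe the same shape, ignoring
 * dimensions of extent 1; return 0 otherwise. */
static int same_shape(int rank1, const unsigned long *dims1,
                      int rank2, const unsigned long *dims2)
{
    int i = 0, j = 0;

    while (i < rank1 && j < rank2) {
        while (i < rank1 && dims1[i] == 1) i++;   /* skip unit dims */
        while (j < rank2 && dims2[j] == 1) j++;
        if (i == rank1 || j == rank2)
            break;
        if (dims1[i] != dims2[j])
            return 0;                             /* shapes differ */
        i++; j++;
    }
    /* Any remaining dimensions must all be unit-sized */
    while (i < rank1 && dims1[i] == 1) i++;
    while (j < rank2 && dims2[j] == 1) j++;
    return (i == rank1) && (j == rank2);
}
```

With this rule, a 10x10 selection in a 2-D array and a 1x10x10 selection in a 3-D array compare equal, matching the example on the slide.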
Improved Selection Matching
• Change resulted in ~20x I/O performance improvement when reading a 1-D buffer from a 2-D file dataset
• From ~5-7 seconds (or worse) to ~0.25-0.5 seconds, on a variety of machine architectures (Linux: amani, hdfdap, jam; Solaris: linew)
Upcoming Improvements to Parallel HDF5
High-Level "HPC" API for HDF5
• HPC environments typically have unusual, possibly even unique, computing, network and storage configurations
• The HDF5 distribution should provide easy-to-use interfaces that ease scientists' and developers' use of these platforms:
  • Tune and adapt to the underlying parallel file system
  • New high-level API routines that wrap existing HDF5 functionality in a way that is easier for HPC application developers to use and helps them move applications from one HPC environment to another
• RFC: http://www.hdfgroup.uiuc.edu/RFC/HDF5/HPC-High-LevelAPI/H5HPC_RFC-2010-09-28.pdf
High-Level "HPC" API for HDF5 – API Overview
• File System Tuning:
  • Automatic file system tuning
  • Pass file system tuning info to HDF5 library
• Convenience Routines:
  • "Macro" routines
    • Encapsulate common parallel I/O operations
    • E.g. create a dataset and write a different hyperslab from each process, etc.
  • "Extended" routines
    • Provide special parallel I/O operations not available in main HDF5 API
    • Examples:
      • "Group" collective I/O operations
      • Collective raw data I/O on multiple datasets
      • Collective multiple object manipulation
      • Optimized collective object operations
Parallel HDF5 in the Future
HPC Funding in 2010 and Beyond
• DOE Exascale FOA w/ LBNL & PNNL – Proposal Funded
  • Exascale-focused enhancements to HDF5
• LLNL Support & Development Contract
  • Performance, support and medium-term focused development
• DOE Exascale FOA w/ ANL and ORNL – Proposal Funded
  • Research on alternate file formats for Exascale I/O
• LBNL Development Contract
  • Performance and short-term focus
Future Parallel I/O Improvements
• Library Enhancements Proposed:
  • Remove collective metadata modification restriction
  • Append-only mode, targeting restart files
  • Embarrassingly parallel mode, for decoupled applications
  • Overlapping compute & I/O, with asynchronous I/O
  • Auto-tuning to underlying parallel file system
  • Improve resiliency of changes to HDF5 files
  • Bring FastBit indexing of HDF5 files into mainstream use for queries during data analysis and visualization
  • Virtual file driver enhancements
• Improved Support:
  • Parallel I/O performance tracking, testing and tuning
Performance Hints for Using Parallel HDF5
Hints for Using Parallel HDF5
• Pass along MPI Info hints to file open: H5Pset_fapl_mpio
• Use MPI-POSIX file driver to access file: H5Pset_fapl_mpiposix
• Align objects in HDF5 file: H5Pset_alignment
• Use collective mode when performing I/O on datasets: H5Pset_dxpl_mpio before H5Dwrite/H5Dread
• Avoid datatype conversions: make memory and file datatypes the same
• Advanced: explicitly manage metadata flush operations with H5Fset_mdc_config