Update on HDF, including recent changes to the software, new releases, THG collaborations, and future plans. The session will include an overview of the HDF 4.2r2, HDF5 1.6.6, and 1.8.0 releases, as well as updates on completed and ongoing THG projects, including crash-proofing HDF5, efficient append to HDF5 datasets, and indexing in HDF5.
Status of HDF-EOS and access tools will be summarized. Updates on HDF-EOS, the TOOLKIT, the HDFView plug-in, and the HDF-EOS to GeoTIFF (HEG) conversion tool, including recent changes to the software, ongoing maintenance, upcoming releases, future plans, and issues, will be discussed.
This document discusses assigning Digital Object Identifiers (DOIs) to data products from NASA's Earth Observing System Data and Information System (EOSDIS). It reviews different identification schemes and recommends DOIs for their persistence and ability to provide unique, citable identifiers. The document outlines a pilot process to assign DOIs to specific EOSDIS data products, including embedding DOIs in metadata and registering them with the DataCite registration agency. Guidelines are provided for constructing the DOI suffix to make identifiers descriptive and recognizable to researchers.
The document provides an introduction to NetCDF4 and covers its key features and performance. It discusses NetCDF4's history as a joint project between Unidata and HDF Group to combine the strengths of netCDF and HDF5. NetCDF4 uses HDF5 as its storage layer and allows writing netCDF files with HDF5 features like compression, groups and parallel I/O. It provides an overview of NetCDF4's features and APIs, and shows performance benchmarks demonstrating the significant size reductions and minor performance impacts of using compression. The document concludes with suggestions for users regarding chunking for performance and using the classic model for backward compatibility.
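To make the chunking and compression suggestions concrete, here is a minimal sketch in Python with the netCDF4 package; the file name, dimension sizes, and chunk sizes are illustrative choices, not recommendations from the talk:

```python
from netCDF4 import Dataset
import numpy as np

# NETCDF4_CLASSIC keeps the classic data model for backward compatibility
# while still storing the data through HDF5.
with Dataset("example.nc", "w", format="NETCDF4_CLASSIC") as nc:
    nc.createDimension("time", None)   # unlimited dimension
    nc.createDimension("y", 1024)
    nc.createDimension("x", 1024)
    # zlib compression plus explicit chunk sizes for better I/O performance
    var = nc.createVariable("temperature", "f4", ("time", "y", "x"),
                            zlib=True, complevel=4,
                            chunksizes=(1, 256, 256))
    var[0, :, :] = np.random.rand(1024, 1024).astype("f4")
```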
The document discusses recent and upcoming improvements to parallel HDF5 for improved I/O performance on HPC systems. Recent improvements include reducing file truncations, distributing metadata writes across processes, and improved selection matching. Upcoming work includes a high-level HPC API, funding for Exascale-focused enhancements, and future improvements like asynchronous I/O and auto-tuning to parallel file systems. Performance tips are also provided like passing MPI hints and using collective I/O.
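As a rough sketch of the collective I/O tip, the fragment below uses h5py built against parallel HDF5 together with mpi4py and would be launched under mpiexec; MPI hints are normally attached to the file-access property list (the MPI_Info argument of H5Pset_fapl_mpio) at the C level. File and dataset names are illustrative:

```python
from mpi4py import MPI
import h5py
import numpy as np

comm = MPI.COMM_WORLD

# Open the file with the MPI-IO driver; each rank writes one row.
with h5py.File("parallel.h5", "w", driver="mpio", comm=comm) as f:
    dset = f.create_dataset("data", (comm.size, 1000), dtype="f8")
    # Collective I/O: every rank participates in the same write call
    with dset.collective:
        dset[comm.rank, :] = np.full(1000, comm.rank, dtype="f8")
```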
This tutorial is designed for new HDF5 users. We will cover basic HDF5 Data Model objects and their properties, give an overview of the HDF5 Libraries and APIs, and discuss the HDF5 programming model. Simple C and Fortran examples will be used to illustrate HDF5 concepts.
This Tutorial gives a brief introduction to HDF5 for people who have never used it. It covers the HDF5 Data Model including HDF5 objects and their properties. It also briefly describes the HDF5 Programming Model and prepares participants for further self-study of HDF5 and hands-on sessions.
A distributed video management cloud platform using Hadoop
This document describes a distributed video management cloud platform using Hadoop. The platform utilizes Hadoop's parallel processing and flexible storage capabilities to efficiently store and process large amounts of video data. It integrates J2EE, Flex, Red5 streaming media server, and Hadoop to provide a user-friendly interface for managing videos. The platform is evaluated and shown to satisfy the requirements of massive video data management through optimized MapReduce processing of video tasks like encoding, decoding, and background subtraction.
This document provides an overview of HDF5 (Hierarchical Data Format version 5) and introduces its core concepts. HDF5 is an open source file format and software library designed for storing and managing large amounts of numerical data. It supports a data model with objects such as datasets, groups, attributes, and datatypes. HDF5 files can be accessed through its software library and APIs from languages like C, Fortran, C++, Python and more. The document covers HDF5's data model, file format, programming interfaces, tools and example code.
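A minimal Python/h5py sketch of those data model objects (the names are illustrative):

```python
import h5py
import numpy as np

with h5py.File("demo.h5", "w") as f:
    grp = f.create_group("measurements")                             # group
    dset = grp.create_dataset("temperature", data=np.arange(10.0))   # dataset
    dset.attrs["units"] = "kelvin"                                   # attribute
    print(dset.dtype, dset.shape)                                    # datatype and dataspace
```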
The document discusses HDF command line tools that can be used to view, modify, and manipulate HDF5 files. It provides examples of using tools like h5dump to view file structure and dataset information, h5repack to optimize file layout and compression, h5diff to compare files and datasets, and h5copy to copy objects between files. The tutorial was presented at the 15th HDF and HDF-EOS workshop from April 17-19, 2012.
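The same tools can be scripted; a minimal sketch, assuming the HDF5 command-line tools are on the PATH and using illustrative file and object names:

```python
import subprocess

# Print the file structure (header only, no data values)
subprocess.run(["h5dump", "-H", "data.h5"], check=True)

# Rewrite the file, applying GZIP level-6 compression to chunked datasets
subprocess.run(["h5repack", "-f", "GZIP=6", "data.h5", "packed.h5"], check=True)

# Compare two files object by object
subprocess.run(["h5diff", "data.h5", "packed.h5"])

# Copy an object from one file into another
subprocess.run(["h5copy", "-i", "data.h5", "-o", "copy.h5",
                "-s", "/dset", "-d", "/dset"], check=True)
```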
The document introduces several free and inexpensive software alternatives for analyzing X-ray diffraction data, as commercial software packages can be very expensive. It recommends the Collaborative Computational Project #14 (CCP14) website as the best source for free software information. It then describes several free data conversion tools, including the PowDLL Converter and ConvX, which can convert between different file formats. Finally, it summarizes the capabilities of the free PowderX software for general data processing tasks like smoothing, background subtraction, and pattern indexing.
This tutorial is designed for anyone who needs to work with data stored in HDF and HDF5 files.
The first part of the tutorial will focus on the HDF5 utilities used to display the contents of HDF5 files, to extract and import data from and to HDF5 files, to compare two HDF5 files, and more. Participants will be guided through hands-on examples and will learn about different tool options. New changes and advanced features will be covered in a separate session (Updates on HDF tools) on Wednesday.
The second part of the tutorial includes a hands-on session to learn the HDF (4 & 5) Java browsing tool, HDFView. The tool and special plug-ins will be used to work with existing HDF, HDF-EOS, and netCDF-4 files, and to create a new HDF5 file. The tutorial will cover basic features of HDFView.
In this talk we will discuss caching and buffering strategies in HDF5. The information presented will help developers write more efficient applications and avoid performance bottlenecks.
It will cover features of the HDF5 library for achieving better I/O performance and efficient storage. The HDF5 features discussed will include datatypes and partial I/O.
This tutorial is for persons who are already familiar with HDF5 and wish to take advantage of some of its advanced features.
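As a small illustration of two of the features above, the h5py sketch below sizes the chunk cache at file-open time and reads only a hyperslab of a dataset; all sizes are illustrative:

```python
import h5py
import numpy as np

with h5py.File("big.h5", "w",
               rdcc_nbytes=64 * 1024 * 1024,   # 64 MiB chunk cache
               rdcc_nslots=100003) as f:       # hash-table slots (a prime)
    dset = f.create_dataset("data", (10000, 10000), dtype="f4",
                            chunks=(1000, 1000))
    dset[0:1000, 0:1000] = np.ones((1000, 1000), dtype="f4")

with h5py.File("big.h5", "r") as f:
    # Partial I/O: only the selected hyperslab is read from disk
    tile = f["data"][0:100, 0:100]
    print(tile.shape)
```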
This document describes IBM's Visualization Data Explorer (DX), a data analysis and visualization tool. DX allows users to work with data from multiple sources using a powerful and unified data model. It provides a visual programming environment and large library of modules for importing, analyzing, displaying, and exporting data. Examples of DX use cases are shown from fields like computational fluid dynamics, earth science, and NASA research.
The document summarizes a workshop between NASA, software developers, science communities, and data centers to discuss HDF and HDF-EOS tools. Key topics included interactions between these groups, technical details of EOSDIS and HDF-EOS, available and needed tools, resources for developers, and next steps to continue engagement through websites and future meetings.
The document summarizes a prototype for dataset-independent subsetting developed by UAH. The prototype allows users to spatially, temporally, and spectrally subset HDF-EOS format Earth science datasets via a web interface. It extracts only the requested data to reduce delivery time and resource usage. However, its use is currently limited as HDF-EOS has not been widely adopted and many legacy datasets are not in its format.
This document discusses incorporating ISO metadata standards into HDF files using the HDF Product Designer tool. It describes how the HDF Product Designer allows users to import pre-built ISO metadata components from a separate project into their HDF file designs. This allows essential contextual data or metadata to be stored in HDF5 files according to ISO 19115 standards.
This document discusses a pilot project to incorporate ISO 19115-2 metadata attributes at the granule level for the NASA SWOT mission. The metadata will be stored in HDF5 groups and generated in two ways - via XML serialization or XML style sheet conversions. The project aims to capture essential metadata attributes from the SWOT information architecture and ISO metadata model, and generate an HDF5 structure specification and example metadata snippets.
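A hedged sketch of the XML-serialization route: an ISO 19115-2 snippet stored as a string inside an HDF5 group. The group and dataset names here are hypothetical, not the SWOT specification:

```python
import h5py

iso_xml = """<gmi:MI_Metadata xmlns:gmi="http://www.isotc211.org/2005/gmi">
  <!-- granule-level metadata attributes would go here -->
</gmi:MI_Metadata>"""

with h5py.File("granule.h5", "w") as f:
    meta = f.create_group("METADATA/ISO_19115")
    # h5py stores a Python str as a variable-length UTF-8 string
    meta.create_dataset("metadata_xml", data=iso_xml)
```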
This document discusses NEON's use of HDF5 file format for its ecological data. The goals are to implement a fast and efficient file format, develop a standardized data delivery structure, and provide metadata. It describes the HDF5 file structure, metadata inclusion, and an example workflow for processing eddy covariance data into HDF5 files. Future work includes integrating R code for HDF5 file generation and embedding ecological metadata.
This document discusses using HDF4 file content maps to enable cloud computing capabilities for HDF4 files. HDF4 files contain scientific data but their large size and legacy format pose challenges. The document proposes creating XML maps that describe HDF4 file structure and contents, including chunk locations and sizes. These maps could then be indexed and searched to locate relevant data chunks. Only those chunks would need to be extracted to the cloud, avoiding unnecessary data transfers. This would allow HDF4 files to be queried and analyzed using cloud-based tools while reducing storage costs.
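The core idea can be illustrated in a few lines of Python: given a chunk's offset and length taken from a content map, an HTTP range request fetches just those bytes instead of the whole file. The map fields and URL below are hypothetical:

```python
import requests

chunk = {"offset": 40960, "length": 65536}   # as recorded in the XML map
url = "https://example-bucket.s3.amazonaws.com/granule.hdf"

resp = requests.get(url, headers={
    "Range": f"bytes={chunk['offset']}-{chunk['offset'] + chunk['length'] - 1}"
})
resp.raise_for_status()
raw_chunk = resp.content   # raw chunk bytes, usable without the HDF4 library
```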
This document discusses two HDF5-based file formats for storing Earth observation data:
1. The Sorted Pulse Data (SPD) format stores laser scanning data including pulses and point data with attributes. It was created in 2008 and updated to version 4 to improve flexibility.
2. The KEA image file format implements the GDAL raster data model in HDF5, allowing large raster datasets and attribute tables to be stored together with compression. It was created in 2012 to address limitations of other formats.
Both formats take advantage of HDF5 features like compression but also discuss some limitations and lessons learned for effectively designing scientific data formats.
ICESat-2 is a NASA satellite mission scheduled to launch in December 2017 that will use photon counting laser altimetry to measure ice sheet and sea ice elevations. It will carry an advanced laser altimeter that splits each laser pulse into 6 beams in a cross-track pattern to provide dense sampling. This will allow for improved elevation estimates over rough terrain. The document discusses ICESat-2's science objectives, measurement concept, data products, processing workflow, and approach to managing metadata across different levels of data products.
The HDF Group provides updates on new features in HDF including faster compression, single writer/multiple reader file access, virtual datasets, and dynamically loaded filters. They also discuss tools like HDFView, nagg for data aggregation, and a new HDF5 ODBC driver. The work is supported by NASA.
HDF Cloud Services aims to bring HDF5 to the cloud by defining a REST API for HDF5 and implementing related services. The HDF REST API allows HDF5 data to be accessed via HTTP requests and responses. H5serv is an open source reference implementation of the HDF REST API. The HDF Scalable Data Service (HSDS) is being developed to support large HDF5 repositories in a scalable, cost effective manner using object storage like AWS S3.
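A minimal sketch of the request/response style, reading a slice of a dataset over the REST API with Python's requests; the endpoint and dataset UUID are placeholders:

```python
import requests

endpoint = "http://hsds.example.org:5101"
dataset_id = "d-00000000-0000-0000-0000-000000000000"

# Read the first ten elements of the dataset; JSON in, JSON out
resp = requests.get(f"{endpoint}/datasets/{dataset_id}/value",
                    params={"select": "[0:10]"},
                    headers={"Accept": "application/json"})
resp.raise_for_status()
print(resp.json()["value"])
```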
Aashish Chaudhary gave a presentation on Kitware's work with scientific computing and visualization using HDF. HDF is a widely used data format at Kitware for domains like climate modeling, geospatial visualization, and information visualization. Kitware is looking to improve HDF support for cloud and web environments to enable streaming analytics and web-based data analysis. The company also aims to further open source collaboration and scientific computing.
The document describes an HDF-EOS DataBlade that allows accessing HDF-EOS grid, swath, and point data types using SQL queries within an object-relational database. The DataBlade includes new data types, routines for data access and subsetting, tables to store HDF-EOS metadata and data, and indexing methods. This allows users to search across HDF-EOS granules, browse metadata, and retrieve subsets of geospatial data through simple SQL queries for analysis and correlation studies.
This document discusses using MATLAB for working with big data and scientific data formats. It provides an overview of MATLAB's capabilities for scientific data, including interfaces for HDF5 and NetCDF formats. It also describes how MATLAB can be used to access, analyze, and visualize big data from sources like Hadoop, databases, and RESTful web services. As a demonstration, it shows how MATLAB can access HDF5 data stored on an HDF Server through RESTful web requests and analyze the data using in-memory data types and functions.
The document discusses tools for visualizing and manipulating HDF and HDF-EOS files. It categorizes tools as utilities, which perform specific data functions from the command line, or applications, which allow for significant data processing. Tools are further broken down into ESDIS project tools like DIAL and EOSView, other government tools from agencies like NASA and JPL, and commercial tools from companies. A survey of tool developers found most are working on HDF-EOS support but are unsure of release timelines, and more work is needed in areas like Java interfaces, metadata handling, and mapping/overlay capabilities.
Current status of HDF-EOS and access tools will be summarized. Updates on HDF-EOS, the HDFView plug-in, and the HDF-EOS to GeoTIFF (HEG) conversion tool, including recent changes to the software, ongoing maintenance, upcoming releases, future plans, and issues, will be discussed.
A preponderance of data from NASA's Earth Observing System (EOS) is archived in the HDF Version 4 (HDF4) format. The long-term preservation of these data is critical for climate and other scientific studies going many decades into the future. HDF4 is very effective for working with the large and complex collection of EOS data products. Unfortunately, because of the complex internal byte layout of HDF4 files, future readability of HDF4 data depends on preserving a complex software library that can interpret that layout. Having a way to access HDF4 data independent of a library could improve its viability as an archive format, and consequently give confidence that HDF4 data will be readily accessible forever, even if the HDF4 library is gone.
To address the need to simplify long-term access to EOS data stored in HDF4, a collaborative project between The HDF Group and NASA Earth Science Data Centers is implementing an approach to accessing data in HDF4 files based on the use of independent maps that describe the data in HDF4 files and tools that can use these maps to recover data from those files. With this approach, relatively simple programs will be able to extract the data from an HDF4 file, bypassing the need for the HDF4 library.
A demonstration project has shown that this approach is feasible. This involved an assessment of NASA's HDF4 data holdings, and development of a prototype XML-based layout mapping language and tools to read layout maps and read HDF4 files using layout maps. Future plans call for a second phase of the project, in which the mapping tools and XML schema are made production quality, the mapping schemas are integrated with existing XML metadata files in several data centers, and outreach activities are carried out to encourage and facilitate acceptance of the technology.
This document provides information about HDF (Hierarchical Data Format) tools and resources for working with Earth observation data. It summarizes HDF's focus on helping users at different stages of working with data, from initial product design to long-term archiving. It also describes specific HDF tools for viewing, comparing, converting between formats and adding metadata to scientific data files.
NetCDF and HDF5 are data formats and software libraries used for scientific data. NetCDF began in 1989 and allows for array-oriented data with dimensions, variables, and attributes. NetCDF-4 introduced new features while maintaining backward compatibility. It uses HDF5 for data storage and can read HDF4/HDF5 files. NetCDF provides APIs for C, Fortran, Java, and is widely used for earth science and climate data. It supports conventions, parallel I/O, and reading many data formats.
This document introduces h5web, a web-based viewer for HDF5 files developed by Loïc Huder at the European Synchrotron Radiation Facility. HDF5 has become the standard format for data acquired at ESRF. H5web allows users to browse HDF5 file hierarchies, inspect metadata and attributes, and display n-dimensional datasets with interactive visualizations. Built with React and available as open-source, h5web demonstrates browsing and visualizing capabilities through a demo. Future work includes incorporating NeXus support and deploying the visualization components in other applications.
This is a slide from HDF AND HDF-EOS WORKSHOP V, February 26-28, 2002.
Source: http://hdfeos.org/workshops/ws05/presentations/Ullman/11c-Discussion_notes.ppt
This document summarizes a presentation about the current status and future directions of the Hierarchical Data Format (HDF) software. It provides updates on recent HDF5 releases, development efforts including new compression methods and ways to access HDF5 data, and outreach resources. It concludes by inviting the audience to share wishes for future HDF development.
Accessibility and usability of NPP/NPOESS data in HDF5 can be enhanced by providing tools that simplify and standardize how data is accessed and presented. In this project, The HDF Group is creating such tools in the form of software to read and write certain key data types and data aggregates used in NPP/NPOESS data products, and extending HDFView to extract, present and export these data effectively. In particular, the work will focus on NPP/NPOESS use of HDF5 region references and quality flags. The HDF Group will also provide high quality user support for the project.
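A minimal h5py sketch of the region-reference feature mentioned above; the dataset names are illustrative, not the NPP/NPOESS product layout:

```python
import h5py
import numpy as np

with h5py.File("granule.h5", "w") as f:
    data = f.create_dataset("radiance", data=np.arange(100).reshape(10, 10))
    # A region reference selects a rectangular subset of the dataset
    ref = data.regionref[2:5, 3:7]
    refs = f.create_dataset("quality_region", (1,), dtype=h5py.regionref_dtype)
    refs[0] = ref
    # Dereference: f[ref] yields the dataset, dataset[ref] yields the region
    stored = refs[0]
    print(f[stored][stored])
```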
This tutorial is designed for new HDF5 users. We will go over a brief history of HDF and HDF5 software, and will cover basic HDF5 Data Model objects and their properties; we will give an overview of the HDF5 Libraries and APIs, and discuss the HDF5 programming model. Simple C and Fortran examples, and Java tool HDFView will be used to illustrate HDF5 concepts.
This document discusses how to optimize HDF5 files for efficient access in cloud object stores. Key optimizations include using large dataset chunk sizes of 1-4 MiB, consolidating internal file metadata, and minimizing variable-length datatypes. The document recommends creating files with paged aggregation and storing file content information in the user block to enable fast discovery of file contents when stored in object stores.
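A hedged sketch of those recommendations with h5py, using paged file-space aggregation and multi-MiB chunks; the page and chunk sizes are illustrative:

```python
import h5py

with h5py.File("cloud_optimized.h5", "w",
               fs_strategy="page",             # paged file-space aggregation
               fs_page_size=4 * 1024 * 1024,   # 4 MiB file pages
               libver="latest") as f:
    # 1024 x 1024 float32 chunks = 4 MiB per chunk, in the suggested range
    f.create_dataset("data", (8192, 8192), dtype="f4",
                     chunks=(1024, 1024), compression="gzip")
```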
This document provides an overview of HSDS (Highly Scalable Data Service), which is a REST-based service that allows accessing HDF5 data stored in the cloud. It discusses how HSDS maps HDF5 objects like datasets and groups to individual cloud storage objects to optimize performance. The document also describes how HSDS was used to improve access performance for NASA ICESat-2 HDF5 data on AWS S3 by hyper-chunking datasets into larger chunks spanning multiple original HDF5 chunks. Benchmark results showed that accessing the data through HSDS provided over 2x faster performance than other methods like ROS3 or S3FS that directly access the cloud storage.
This document summarizes the current status and focus of the HDF Group. It discusses that the HDF Group is located in Champaign, IL and is a non-profit organization focused on developing and maintaining HDF software and data formats. It provides an overview of recent HDF5, HDF4 and HDFView releases and notes areas of focus for software quality improvements, increased transparency, strengthening the community, and modernizing HDF products. It invites support and participation in upcoming user group meetings.
This document provides an overview of HSDS (Highly Scalable Data Service), which allows HDF5 files to be stored and accessed from the cloud. Key points include:
- HSDS maps HDF5 objects like datasets and groups to individual cloud storage objects for scalability and parallelism.
- Features include streaming support, fancy indexing for complex queries, and caching for improved performance.
- HSDS can be deployed on Docker, Kubernetes, or AWS Lambda depending on needs.
- Case studies show HSDS is used by organizations like NREL and NSF to make petabytes of scientific data publicly accessible in the cloud.
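A minimal client-side sketch with h5pyd, the h5py-compatible Python package for HSDS; the domain path and endpoint are placeholders:

```python
import h5pyd

with h5pyd.File("/shared/example/data.h5", "r",
                endpoint="http://hsds.example.org") as f:
    dset = f["wind_speed"]
    print(dset.shape, dset.dtype)
    sample = dset[0:10, 0:10]   # only the selected chunks travel over the wire
```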
This document discusses creating cloud-optimized HDF5 files by rearranging internal structures for more efficient data access in cloud object stores. It describes cloud-native and cloud-optimized storage formats, with the latter involving storing the entire HDF5 file as a single object. The benefits of cloud-optimized HDF5 include fast scanning and using the HDF5 library. Key aspects covered include using optimal chunk sizes, compression, and minimizing variable-length datatypes.
This document discusses updates and performance improvements to the HDF5 OPeNDAP data handler. It provides a history of the handler since 2001 and describes recent updates including supporting DAP4, new data types, and NetCDF data models. A performance study showed that passing compressed HDF5 data through the handler without decompressing/recompressing led to speedups of around 17-30x by leveraging HDF5 direct I/O APIs. This allows outputting HDF5 files as NetCDF files much faster through the handler.
This document provides instructions for using the Hyrax software to serve scientific data files stored on Amazon S3 using the OPeNDAP data access protocol. It describes how to generate ancillary metadata files called DMR++ files using the get_dmrpp tool that provide information about the data file structure and locations. The document explains how to run get_dmrpp inside a Docker container to process data files on S3 and generate customized DMR++ files that the Hyrax server can use to serve the files to clients.
This document provides an overview and examples of accessing cloud data and services using the Earthdata Login (EDL), Pydap, and MATLAB. It discusses some common problems users encounter, such as being unable to access HDF5 data on AWS S3 using MATLAB or read data from OPeNDAP servers using Pydap. Solutions presented include using EDL to get temporary AWS tokens for S3 access in MATLAB and providing code examples on the HDFEOS website to help users access S3 data and OPeNDAP services. The document also notes some limitations, such as tokens being valid for only 1 hour, and workarounds like requesting new tokens or using the MATLAB HDF5 API instead of the netCDF API.
The HDF5 Roadmap and New Features document outlines upcoming changes and improvements to the HDF5 library. Key points include:
- HDF5 1.13.x releases will include new features like selection I/O, the Onion VFD for versioned files, improved VFD SWMR for single-writer multiple-reader access, and subfiling for parallel I/O.
- The Virtual Object Layer allows customizing HDF5 object storage and introduces terminal and pass-through connectors.
- The Onion VFD stores versions of HDF5 files in a separate onion file for versioned access.
- VFD SWMR improves on legacy SWMR by implementing single-writer multiple-reader capabilities.
This document discusses user analysis of the HDFEOS.org website and plans for future improvements. It finds that the majority of the site's 100 daily users are "quiet", not posting on forums or other interactive elements. The main user types are locators, who search for examples or data; mergers, who combine or mosaic datasets; and converters, who change file formats. The document outlines recent updates focused on these user types, like adding Python examples for subsetting and calculating latitude and longitude. It proposes future work on artificial intelligence/machine learning uses of HDF files and examples for processing HDF data in the cloud.
The document describes H5Coro, a new C++ library for reading HDF5 files from cloud storage. H5Coro was created to optimize HDF5 reading for cloud environments by minimizing I/O operations through caching and efficient HTTP requests. Performance tests showed H5Coro was 77-132x faster than the previous HDF5 library at reading HDF5 data from Amazon S3 for NASA's SlideRule project. H5Coro supports common HDF5 elements but does not support writing or some complex HDF5 data types and messages to focus on optimized read-only performance for time series data stored sequentially in memory.
This document summarizes MathWorks' work to modernize MATLAB's support for HDF5. Key points include:
1) MATLAB now supports HDF5 1.10.7 features like single-writer/multiple-reader access and virtual datasets through new and updated low-level functions.
2) Performance benchmarks show some improvements but also regressions compared to the previous HDF5 version, and work continues to optimize code and support future versions.
3) There are compatibility considerations for Linux filter plugins, but interim solutions are provided until MathWorks can ship a single HDF5 version.
HSDS provides HDF as a service through a REST API that can scale across nodes. New releases will enable serverless operation using AWS Lambda or direct client access without a server. This allows HDF data to be accessed remotely without managing servers. HSDS stores each HDF object separately, making it compatible with cloud object storage. Performance on AWS Lambda is slower than a dedicated server but has no management overhead. Direct client access has better performance but limits collaboration between clients.
HDF5 and Zarr are data formats that can be used to store and access scientific data. This presentation discusses approaches to translating between the two formats. It describes how HDF5 files were translated to the Zarr format by creating a separate Zarr store to hold HDF5 file chunks, and storing chunk location metadata. It also discusses an implementation that translates Zarr data to the HDF5 format by using a special chunking layout and storing chunk information in an HDF5 compound dataset. Limitations of the translations include lack of support for some HDF5 dataset properties in Zarr, and lack of support for some Zarr compression methods in the HDF5 implementation.
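A hedged sketch of the chunk-location metadata such a translation depends on, read through h5py's low-level chunk queries (file and dataset names are illustrative):

```python
import h5py

with h5py.File("source.h5", "r") as f:
    dset = f["data"]
    for i in range(dset.id.get_num_chunks()):
        info = dset.id.get_chunk_info(i)
        # chunk_offset: logical position; byte_offset/size: location in the file
        print(info.chunk_offset, info.byte_offset, info.size)
```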
The document discusses HDF for the cloud, including new features of the HDF Server and what's next. Key points:
- HDF Server uses a "sharded schema" that maps HDF5 objects to individual storage objects, allowing parallel access and updates without transferring entire files.
- Implementations include HSDS software that uses the sharded schema with an API and SDKs for different languages like h5pyd for Python.
- New features of HSDS 0.6 include support for POSIX, Azure, AWS Lambda, and role-based access control.
- Future work includes direct access to storage without a server intermediary for some use cases.
This document compares different methods for accessing HDF and netCDF files stored on Amazon S3, including Apache Drill, THREDDS Data Server (TDS), and HDF5 Virtual File Driver (VFD). A benchmark test of accessing a 24GB HDF5/netCDF-4 file on S3 from Amazon EC2 found that TDS performed the best, responding within 2 minutes, while Apache Drill failed after 7 minutes. The document concludes that TDS 5.0 is the clear winner based on performance and support for role-based access control and HDF4 files, but the best solution depends on use case and software.
This document discusses STARE-PODS, a proposal to NASA/ACCESS-19 to develop a scalable data store for earth science data using the SpatioTemporal Adaptive Resolution Encoding (STARE) indexing scheme. STARE allows diverse earth science data to be unified and indexed, enabling the data to be partitioned and stored in a Parallel Optimized Data Store (PODS) for efficient analysis. The HDF Virtual Object Layer and Virtual Data Set technologies can then provide interfaces to access the data in STARE-PODS in a familiar way. The goal is for STARE-PODS to organize diverse data for alignment and parallel/distributed storage and processing to enable integrative analysis at scale.
This document provides an overview and update on HDF5 and its ecosystem. Key points include:
- HDF5 1.12.0 was recently released with new features like the Virtual Object Layer and external references.
- The HDF5 library now supports accessing data in the cloud using connectors like S3 VFD and REST VOL without needing to modify applications.
- Projects like HDFql and H5CPP provide additional interfaces for querying and working with HDF5 files from languages like SQL, C++, and Python.
- The HDF5 community is moving development to GitHub and improving documentation resources on the HDF wiki site.
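As a small illustration of the in-place cloud access mentioned in the list above, the ROS3 virtual file driver can open a file directly from S3 when HDF5 and h5py are built with ROS3 support; the URL is a placeholder:

```python
import h5py

# Anonymous read of a public object; credentials can be supplied via
# driver keywords on builds that support them.
with h5py.File("https://example-bucket.s3.amazonaws.com/data.h5",
               "r", driver="ros3") as f:
    print(list(f.keys()))
```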
This document summarizes new features in HDF5 1.12.0, including support for storing references to objects and attributes across files, new storage backends using a virtual object layer (VOL), and virtual file drivers (VFDs) for Amazon S3 and HDFS. It outlines the HDF5 roadmap for 2019-2022, which includes continued support for HDF5 1.8 and 1.10, and new features in future 1.12.x releases like querying, indexing, and provenance tracking.
The document discusses leveraging cloud resources like Amazon Web Services to improve software testing for the HDF group. Currently HDF software is tested on various in-house systems, but moving more testing to the cloud could provide better coverage of operating systems and distributions at a lower cost. AWS spot instances are being used to run HDF5 build and regression tests across different Linux distributions in around 30 minutes for approximately $0.02 per hour.
3. HDF Utilities
ristosds: Convert Raster Image Sets to Scientific Data Sets
vshow: Display the contents and structure of Vgroups and Vdatas (see the sketch after this table)
hdfcomp: Compress Raster Image Sets
hdfpack: Free unused space in file; join linked blocks
hdftor8: Output 8-bit Raster Image Set as raw 8-bit image
paltohdf: Ingest raw palette as HDF palette
vcompat: Update Vset 1.0 files to Vset 2.0 and higher
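At the library level, the traversal that vshow performs can be reproduced with a handful of HDF calls. A minimal C sketch, assuming an HDF4 installation; the file name "example.hdf" is a placeholder and error checking is omitted:

    #include <stdio.h>
    #include "hdf.h"   /* HDF4 H-level and Vgroup (V) interfaces */

    int main(void)
    {
        /* "example.hdf" is a placeholder file name */
        int32 file_id = Hopen("example.hdf", DFACC_READ, 0);
        int32 vg_ref = -1;
        char  name[VGNAMELENMAX + 1];

        Vstart(file_id);                      /* initialize the Vgroup interface */
        /* Walk every Vgroup in the file, printing its name and member count */
        while ((vg_ref = Vgetid(file_id, vg_ref)) != -1) {
            int32 vg_id = Vattach(file_id, vg_ref, "r");
            Vgetname(vg_id, name);
            printf("Vgroup %ld: %s (%ld members)\n",
                   (long)vg_ref, name, (long)Vntagrefs(vg_id));
            Vdetach(vg_id);
        }
        Vend(file_id);                        /* shut down the Vgroup interface */
        Hclose(file_id);
        return 0;
    }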
4. HDF Utilities
hdf24to8: Convert 24-bit Raster Image Sets to 8-bit Raster Image Sets
hdfed: Low-level file browser with limited editing capabilities
hdfrseq: Play an animation sequence through NCSA/BYU Telnet
jpeg2hdf: Ingest raw JPEG-compressed image as a compressed Raster Image Set
r8tohdf: Ingest raw 8-bit image as 8-bit Raster Image Set (see the sketch after this table)
vmake: Create Vset structures from ASCII text
hdf2jpeg: Output Raster Image Set as raw JPEG image
hdfls: List contents of an HDF file (tags and reference numbers)
hdftopal: Output HDF palette as raw palette
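The raster utilities correspond closely to HDF's DFR8 interface. As a rough illustration of what r8tohdf and hdftor8 do, a minimal C sketch of the same round trip; the file name, dimensions, and the empty image buffer are chosen purely for illustration:

    #include <stdio.h>
    #include "hdf.h"   /* HDF4 DFR8 (8-bit raster) interface */

    #define XDIM 100
    #define YDIM 100

    int main(void)
    {
        static uint8 image[YDIM][XDIM];   /* raw 8-bit image data (placeholder) */
        int32 width, height;
        intn  has_palette;

        /* Write the buffer as an 8-bit Raster Image Set (what r8tohdf does) */
        DFR8addimage("raster.hdf", image, XDIM, YDIM, COMP_NONE);

        /* Read it back out as raw 8-bit data (what hdftor8 does) */
        DFR8getdims("raster.hdf", &width, &height, &has_palette);
        DFR8getimage("raster.hdf", (uint8 *)image, width, height, NULL);
        printf("read %ldx%ld image\n", (long)width, (long)height);
        return 0;
    }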
5. NCSA Tools
The NCSA Java-based HDF Viewer (JHV): a Java-based implementation of an HDF viewer.
The NCSA Collaborative Java-based HDF Viewer: a version of JHV that allows several users to simultaneously browse the contents of an HDF file across a local area network or the Internet.
HDF WWW Scientific Data Browser (sdb-CGI)
Image/X Image: standalone image/animation display and processing. Platforms: SGI Iris/Indigo, Sun SPARC, DECstation, IBM RS/6000, Cray, Macintosh.
X Data Slice: standalone 3-D data set display. Platforms: SGI Iris/Indigo, Sun SPARC, DECstation, DEC Alpha, IBM RS/6000, Cray.
Datascope: standalone 2-D data set display and processing.
Reformat/Xreformat: conversion of data into HDF. Platforms: SGI Iris/Indigo, Sun SPARC, DECstation, DEC Alpha, IBM RS/6000.
URL: http://www.ncsa.uiuc.edu/indices/software/
6. EOSDIS Project Tools
Data and Information Access Link (DIAL) is a package of WWW-based software tools that provides access to data. It offers many features, such as browsing, plotting, subsampling, and subsetting. Currently it works with HDF files in a UNIX environment; the package will support HDF-EOS files in the future.
EOSView is an HDF browser developed by the ECS project to display HDF data. The tool is under development and will eventually become an HDF-EOS browser.
7. Other Free Tools
Product Name: DDI
Developer: LLNL
Platforms: Unix
Description: The Data and Dimensions Interface addresses a significant problem in the visualization of large data sets: extracting only the relevant data and providing it to a chosen graphics engine in the required form without undue effort. DDI reads and writes a number of publicly available file formats, and sends data to public-domain and commercial visualization systems.
URL: http://www-pcmdi.llnl.gov/williams/ddi/ddi.html
8. Other Free Tools
Product Name: DODS
Developer: USN
Description: The Distributed Oceanographic Data System, developed specifically for oceanographic data, provides flexible access to a wide variety of data and facilitates their analysis with existing software.
URL: http://dods.gso.uri.edu/DODS/home/home.html
9. Other Free Tools
Product Name: Envision
Developer: NCSA
Description: This is an interactive system for the management and visualization of large scientific data sets. It runs under X/Motif, manages data stored in HDF or netCDF files, and does visualization using IDL, NCSA Collage, and NCSA XDataSlice.
URL: http://www.atmos.uiuc.edu/envision/envision.html
11. Other Free Tools
Product Name: GRASS
Developer: U.S. Army
Platforms: UNIX
Description: The Geographical Resources Analysis Support System (GRASS) is an integrated set of programs designed to provide digitizing, image processing, map production, and geographic information system capabilities to its users.
URL: http://www.cecer.army.mil/grass/GRASS.main.html
12. Other Free Tools
Product Name: HDF Browser
Developer: Fortner Research
Platforms: Win 3.1/95/NT and Mac
Description: Offers point-and-click access to data stored in the HDF format. Opening an HDF file with the HDF Browser lets you see all hierarchical components of any HDF file and then view the data stored in each component. The HDF Browser contains editors to view data stored as tables, 2-D arrays, multidimensional arrays, annotations, text attributes, raster images, and color palettes.
URL: http://www.fortner.com/docs/product_hdf_b.html
13. Other Free Tools
Product Name: HDFLook
Platforms: Solaris, Alpha VMS, HP-UX, IRIX 5.3, and AIX
Description: HDFLook is a friendly Motif HDF viewer, useful for quality control of Scientific Datasets. It allows easy access to physical values and ancillary data, and includes 2-D graphics (radial, histogram). The latest version supports image print capabilities.
Contact: Louis.Gonzalez@univ-lille1.fr
14. Other Free Tools
Product Name: hdfv
Platforms: UNIX
Description: The hdfv tool is an HDF read-only interface via Tcl. It comprises "tclhdf" and "hdfv": "tclhdf" is a simple extension of tclsh with HDF's Vgroup/Vdata queries, and "hdfv" is an HDF viewer with a Tk-based GUI. Currently it only supports the Vgroup/Vdata model. The tool can also be downloaded from the HDF Contributed Software directory.
15. Other Free Tools
Product Name: LinkWinds
Developer: JPL
Platforms: UNIX
Description: LinkWinds is a visual data analysis and exploration system designed to rapidly and interactively investigate large multivariate and multidisciplinary data sets to detect trends, correlations, and anomalies.
URL: http://linkwinds.jpl.nasa.gov/lwhome.html
16. Other Free Tools
Product Name: ImageMagick
Description: ImageMagick, version 3.7.3, is a package for interactive manipulation of images for the X Window System. It is written in C and interfaces to the X library, and therefore does not require any proprietary toolkit in order to compile.
URL: http://www.wtech.ruhr-uni-bochum.de/doc/ImageMagick/ImageMagick.html
17. Other Free Tools
Product Name: Ingrid
Description: This tool is designed to manipulate large datasets and model input/output. It reads and writes netCDF files, writes HDF files, and generates plots, including line, contour, vector, and scatter plots, as well as histograms.
URL: http://exigente.ldgo.columbia.edu:81/
18. Other Free Tools
Product Name: netCDF tools
Description: Most of the netCDF tools can be used with HDF, since HDF's netCDF implementation can be used in place of the regular netCDF library (except for creating new netCDF files). A non-exhaustive list of netCDF tools can be found at Unidata.
URL: http://www4.etl.noaa.gov/dms.html
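Because HDF's SD interface follows the netCDF data model, reading a scientific data set from an HDF file looks much like netCDF access. A minimal C read sketch; the file name, the dataset index, and the assumption that the first SDS is at least a 10x10 float32 array are for illustration only:

    #include <stdio.h>
    #include "mfhdf.h"   /* HDF4 SD interface, modeled on netCDF */

    int main(void)
    {
        int32 start[2] = {0, 0};
        int32 edge[2]  = {10, 10};   /* assumes the first SDS is at least 10x10 */
        int32 dims[32], rank, dtype, nattrs;
        char  name[64];
        float buf[10][10];           /* assumes float32 data */

        /* "file.hdf" is a placeholder name; index 0 selects the first SDS */
        int32 sd_id  = SDstart("file.hdf", DFACC_READ);
        int32 sds_id = SDselect(sd_id, 0);

        SDgetinfo(sds_id, name, &rank, dims, &dtype, &nattrs);
        printf("dataset %s: rank %ld\n", name, (long)rank);

        /* Read a 10x10 corner of the dataset */
        SDreaddata(sds_id, start, NULL, edge, (VOIDP)buf);

        SDendaccess(sds_id);
        SDend(sd_id);
        return 0;
    }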
20. Other Free Tools
Product Name: Radiance
Description: This application is a suite of programs for the analysis and visualization of lighting in design.
URL: http://radsite.lbl.gov/radiance/HOME.html
Translator scheme from CAD to Radiance: http://radsite.lbl.gov/radiance/man_html/Notes/translators.html
21. Other Free Tools
Product Name: REINAS
Developer: University of California, Santa Cruz
Description: The Real-time Environmental Information Network and Analysis System (REINAS) is a system built to support real-time data acquisition, management, and visualization of environmental data.
URL: http://csl.cse.ucsc.edu/reinas/
22. Other Free Tools
Product Name: SciAn
Platforms: Silicon Graphics workstations and IBM RS/6000 workstations with the GL option
Description: This is a scientific visualization and animation package. It brings together the power of 3-dimensional scientific visualization and movie making with the ease of use and familiarity of object-oriented drawing packages.
URL: http://www.scri.fsu.edu/~lyons/scian/
24. Other Free Tools
Product Name: VCS
Developer: Program for Climate Model Diagnosis and Intercomparison (PCMDI) at the Lawrence Livermore National Laboratory (LLNL)
Description: The Visualization and Computation System (VCS), version 2.7, when released, will support the HDF format for both read and write operations. VCS greatly facilitates the selection, manipulation, and display of scientific data. By specifying the desired data set, the graphics method, and the display template, the VCS user gains virtually complete control over the appearance of the data display and associated text. Although VCS is designed expressly to meet the needs of climate scientists, the breadth of its capabilities makes it a useful tool for other scientific applications.
URL: http://www-pcmdi.llnl.gov/software/vcs/index.html
25. Commercial Tools for HDF
Product Name: AVS
Developer: Advanced Visual Systems
Platforms: DEC, HP, IBM, SGI, and Sun
Description: Includes a suite of data visualization and analysis techniques, incorporating traditional visualization tools such as 2-D plots, graphs, and image processing, as well as advanced tools such as 3-D interactive rendering and volume visualization.
URL: http://www.avs.com
26. Commercial Tools for HDF
Product Name: Data Explorer
Developer: IBM
Platforms: Major UNIX platforms
Description: General-purpose software package for data visualization and analysis. It employs a data-flow-driven client-server execution model and provides a graphical program editor that allows the user to create a visualization using a point-and-click interface.
URL: http://www.almaden.ibm.com/dx/
27. Commercial Tools for HDF
Product Name: EASI/PACE
Developer: PCI
Platforms: All major UNIX platforms, Mac, Win 3.1/95/NT
Description: Image classification, geometric correction, orthorectification, enhancement, filtering, vector edit with image backdrop, terrain analysis and visualization, radar image processing, DEM extraction, atmospheric correction, and hyperspectral data analysis.
URL: http://www.pci.on.ca/prod.html
28. Commercial Tools for HDF
Product Name: ER Mapper
Developer: ER Mapping
Platforms: All major UNIX platforms, Win 3.1/95/NT
Description: Integrated mapping software featuring image processing, map production, 3-D presentations, and GIS integration.
URL: http://www.ermapper.com
29. Commercial Tools for HDF
Product Name: GDB
Developer: PCI
Platforms: Major UNIX platforms
Description: The Generic Data Base library (GDB) is used to access image and auxiliary information from data files. This allows different file types to be used interchangeably where it makes sense for the file type.
URL: http://www.pci.on.ca/cgi-bin/pcihlp/gdb
30. Commercial Tools for HDF
Product Name: IDL
Developer: Research Systems Inc.
Platforms: Major UNIX platforms, Win 3.1/95/NT, Mac
Description: Allows you to display a satellite image (e.g., AVHRR) on a map projection. The input is a 2-D array of satellite image values and corresponding 2-D arrays of latitude and longitude for each pixel.
URL: http://www.rsinc.com
31. Commercial Tools for HDF
Product Name: IRIS Explorer
Developer: NAG
Platforms: Major UNIX platforms
Description: Data visualization system that allows users to build complex applications.
URL: http://www.nag.co.uk/Welcome_IEC.html
32. Commercial Tools for HDF
Product Name: Noesys
Developer: Fortner Research
Platforms: Win 3.1/95/NT, Mac
Description: Provides native support for data stored in the HDF file format and allows users to view and edit individual components of these complex files.
URL: http://www.fortner.com
33. Commercial Tools for HDF
Product Name: Plot
Developer: Fortner Research
Platforms: Major UNIX platforms, Win 3.1/95/NT, Mac
Description: Reads a variety of data formats directly and can handle large datasets. Creates scientific plots, generating line plots, color scatter plots, parametric plots, and double-Y plots.
URL: http://www.fortner.com/docs/product_plot.html
34. Commercial Tools for HDF
Product Name: T3D
Developer: Fortner Research
Platforms: Win 3.1/95/NT, Mac
Description: Provides a volumetric visualization tool for Macintosh or Windows. It can read a variety of file types and provides volumetric rendering with slices, isosurfaces, and animation.
URL: http://www.fortner.com/docs/product_T3D.html
35. Commercial Tools for HDF
Product Name: Transform
Developer: Fortner Research
Platforms: Major UNIX platforms, Win 3.1/95/NT, Mac
Description: Directly reads a variety of matrix and image file formats, including HDF, TIFF, PICT, FITS, and ASCII. Generates pseudocolor images, color surface plots, contour plots, and vector plots.
URL: http://www.fortner.com/docs/product_transform.html
36. Commercial Tools for HDF
Product Name: PV-Wave
Developer: Visual Numerics
Platforms: Major UNIX platforms, OpenVMS, Win 95/NT
Description: Allows the user to view, analyze, and compare data for many business, science, and engineering applications.
URL: http://www.boulder.vni.com/products/wave
37. Summary
• NASA does not endorse any particular tool discussed in this presentation.
• Little additional tool development has occurred in the last two years for browsing and displaying HDF-formatted data.
• Very few tools are available to support HDF-EOS-formatted data.
38. Conclusion
New HDF-EOS tools and utilities are needed to support the terabytes of EOS AM-1 data from multiple instruments and disciplines.
Types of tools:
• HDF-EOS data ingest capability for the existing image processing and analysis packages (see the sketch after this list)
• New image processing packages with additional functionality for analyzing and integrating Earth Science data
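As a rough sketch of what such an ingest path involves, the HDF-EOS swath interface can extract a geolocated field from an HDF-EOS file. A minimal C sketch, assuming the HDF-EOS library is installed; the file, swath, and field names and the 100x100 float32 block are illustrative assumptions, not names from any real product:

    #include <stdio.h>
    #include "HdfEosDef.h"   /* HDF-EOS swath (SW) interface */

    int main(void)
    {
        /* All names below are placeholders for illustration */
        int32 start[2] = {0, 0};
        int32 edge[2]  = {100, 100};
        static float32 data[100][100];

        int32 fid = SWopen("granule.hdf", DFACC_READ);
        int32 swath_id = SWattach(fid, "SwathName");

        /* Read a 100x100 block of a float32 field named "Temperature" */
        SWreadfield(swath_id, "Temperature", start, NULL, edge, (VOIDP)data);
        printf("first value: %f\n", data[0][0]);

        SWdetach(swath_id);
        SWclose(fid);
        return 0;
    }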
39. Types of tools
• Browsers
• Data dumpers
• Translation tools
Tools and utilities developed to work with HDF-EOS data, as well as HDF, will attract the broader community of HDF users.