What we got from the Predicting Red Hat Business Value competition — Umaporn Kerdsaeng
These slides share what I learned from the Kaggle competition. There are 3 topics: 1) overview of the competition, 2) introduction to decision trees, and 3) the R package XGBoost.
Jonathan Lefman presents his work on superresolution chemical microscopy
This document discusses several microscopy techniques including structured illumination fluorescence microscopy, time-of-flight secondary ion mass spectrometry, coherent anti-Stokes Raman scattering microscopy, photoactivated localization microscopy, stimulated emission depletion microscopy, and 4Pi microscopy. It focuses on describing improvements made to structured illumination fluorescence microscopy including parallel GPU processing to accelerate image analysis and a new automated imaging framework. Time-of-flight secondary ion mass spectrometry imaging is discussed with applications to iterative clustering and classification analysis.
The document discusses various disk scheduling algorithms used by operating systems to optimize disk access time and efficiency. It describes common algorithms like First Come First Serve (FCFS), Shortest Seek Time First (SSTF), SCAN, C-SCAN, LOOK, and C-LOOK. For each algorithm, it provides an example to calculate the total seek length for a sample request queue. It then compares the performance of the different algorithms based on total and average seek lengths. In conclusion, it notes that SCAN and C-SCAN work best under heavy disk loads while SSTF and LOOK are commonly used default algorithms.
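The seek-length calculations the document walks through can be sketched in a few lines; the request queue and initial head position below are a classic textbook example, not values taken from the document.

```python
# Total seek length under two disk-scheduling policies.

def fcfs_seek_length(queue, head):
    """First Come First Serve: service requests in arrival order."""
    total = 0
    for cylinder in queue:
        total += abs(cylinder - head)
        head = cylinder
    return total

def sstf_seek_length(queue, head):
    """Shortest Seek Time First: always service the closest pending request."""
    pending = list(queue)
    total = 0
    while pending:
        nearest = min(pending, key=lambda c: abs(c - head))
        total += abs(nearest - head)
        head = nearest
        pending.remove(nearest)
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]   # sample request queue
head = 53                                      # initial head position
print(fcfs_seek_length(queue, head))  # 640
print(sstf_seek_length(queue, head))  # 236
```

The gap between the two totals (640 vs. 236 cylinders) is exactly the kind of comparison the document's per-algorithm examples are set up to show.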
Dr. Kashif Rasul from Zalando Research in Berlin gave this presentation on "Multi-GPU for Deep Learning" at the Computer Science, Machine Learning & Statistics meetup in the Zalando adtech lab office in Hamburg on 6 September 2017.
This document describes implementing a modified particle filter algorithm for localization in an FPGA. The modified algorithm improves speed and accuracy. It was tested through simulations of global localization, localization and tracking, and kidnapping scenarios. The hardware implementation was 34x faster than software and successfully localized the robot in all experiments, demonstrating the FPGA is capable of running the particle filter in real-time.
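For orientation, here is a minimal 1D particle-filter step (predict, weight, resample) in plain software — the generic algorithm, not the modified FPGA-specific variant the document describes. All numeric values are illustrative.

```python
# Generic particle-filter iteration for 1D localization.
import math
import random

def pf_step(particles, control, measurement, sensor_sigma=0.5):
    # Predict: apply the motion command with additive Gaussian noise.
    moved = [p + control + random.gauss(0, 0.1) for p in particles]
    # Weight: Gaussian likelihood of the position measurement.
    weights = [math.exp(-((m - measurement) ** 2) / (2 * sensor_sigma ** 2))
               for m in moved]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Resample: low-variance (systematic) resampling.
    n = len(moved)
    r = random.uniform(0, 1.0 / n)
    c, i, resampled = weights[0], 0, []
    for m in range(n):
        u = r + m / n
        while u > c and i < n - 1:
            i += 1
            c += weights[i]
        resampled.append(moved[i])
    return resampled

random.seed(0)
particles = [random.uniform(0, 10) for _ in range(500)]  # global localization
for _ in range(5):
    particles = pf_step(particles, control=0.0, measurement=3.0)
estimate = sum(particles) / len(particles)
print(round(estimate, 1))  # converges near the measured position, 3.0
```

The resampling loop is the part FPGA implementations typically restructure, since its sequential cumulative sum is the obstacle to parallel hardware.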
NUMA-optimized Parallel Breadth-first Search on Multicore Single-node System — Yuichiro Yasui
The document proposes a NUMA-optimized parallel breadth-first search (BFS) algorithm for multicore systems. It discusses how the hybrid BFS algorithm combines top-down and bottom-up approaches but can result in unnecessary edge traversals. The proposal distributes the graph columns to each NUMA node's local memory and binds threads and data to improve locality. It uses a library called ULIBC to intelligently manage CPU affinity and NUMA considerations. Numerical results show the NUMA-optimized hybrid BFS achieves up to 2.2x speedup over the original algorithm.
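The hybrid (direction-optimizing) BFS that the proposal builds on can be sketched sequentially; the NUMA placement and ULIBC affinity management are omitted here, and the switching threshold and graph are illustrative.

```python
# Hybrid BFS: top-down for small frontiers, bottom-up for large ones.
from collections import defaultdict

def hybrid_bfs(adj, source, n, alpha=4):
    parent = {source: source}
    frontier = {source}
    while frontier:
        frontier_edges = sum(len(adj[v]) for v in frontier)
        if frontier_edges * alpha > n:
            # Bottom-up: each unvisited vertex scans for a visited neighbour,
            # avoiding the redundant edge checks of a large top-down frontier.
            next_frontier = set()
            for v in range(n):
                if v not in parent:
                    for u in adj[v]:
                        if u in frontier:
                            parent[v] = u
                            next_frontier.add(v)
                            break
        else:
            # Top-down: expand every frontier vertex's adjacency list.
            next_frontier = set()
            for u in frontier:
                for v in adj[u]:
                    if v not in parent:
                        parent[v] = u
                        next_frontier.add(v)
        frontier = next_frontier
    return parent

adj = defaultdict(list)
for a, b in [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]:
    adj[a].append(b)
    adj[b].append(a)
print(sorted(hybrid_bfs(adj, 0, 5)))  # [0, 1, 2, 3, 4]
```

The NUMA-optimized version partitions `adj` column-wise across NUMA nodes and pins each scanning thread to the node holding its partition, which is where the reported 2.2x speedup comes from.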
Graph500 and Green Graph500 benchmarks on SGI UV2000 @ SGI UG SC14 — Yuichiro Yasui
The document discusses Graph500 and Green Graph500 benchmarks for evaluating graph processing performance on the SGI UV2000 system. It provides an overview of the benchmarks and describes testing various graph workloads, including social networks and road networks, on different hardware from smartphones to supercomputers. The authors aim to optimize breadth-first search (BFS) graph algorithms on the NUMA-based SGI UV2000 without using MPI through NUMA-aware techniques.
NUMA-aware Scalable Graph Traversal on SGI UV Systems — Yuichiro Yasui
The document discusses NUMA-aware scalable graph traversal on SGI UV systems. It proposes an efficient NUMA-aware breadth-first search (BFS) algorithm for large-scale graph processing by pruning remote edge traversals. Numerical results on SGI UV 300 systems with 32 sockets show the algorithm achieves 219 GTEPS (billion traversed edges per second), setting a new single-node performance record on the Graph500 benchmark.
1) The document describes a real-time GPU implementation of visual smoke simulation using the incompressible Navier-Stokes equations.
2) Key steps in the simulation algorithm include adding forces, advecting velocity and scalar fields, solving for pressure, projecting the velocity field, and applying boundary conditions.
3) Volume rendering is achieved by slicing the 3D grid from the viewer's perspective and compositing the slices using the "under" operator, implementing shadows using half-angle slicing.
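The advection step in the list above can be sketched in miniature; this is 1D semi-Lagrangian advection of a scalar field, a simplified stand-in for the 3D GPU version the document describes, with toy field and velocity values.

```python
# Semi-Lagrangian advection: trace each sample back along the velocity
# field and interpolate, the "advect" step of a stable-fluids solver.

def advect(field, velocity, dt, dx=1.0):
    n = len(field)
    out = [0.0] * n
    for i in range(n):
        # Back-trace the departure point of the particle ending at cell i.
        x = i - dt * velocity[i] / dx
        x = max(0.0, min(n - 1.0, x))          # clamp at the boundary
        i0 = int(x)
        i1 = min(i0 + 1, n - 1)
        t = x - i0
        out[i] = (1 - t) * field[i0] + t * field[i1]  # linear interpolation
    return out

density = [0.0, 0.0, 1.0, 0.0, 0.0]
velocity = [1.0] * 5                            # uniform flow to the right
print(advect(density, velocity, dt=1.0))        # density moves one cell right
```

Because each output cell is computed independently, this step maps directly onto one GPU thread per cell, which is what makes the real-time implementation feasible.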
Fast & Energy-Efficient Breadth-First Search on a Single NUMA System — Yuichiro Yasui
This document summarizes a research paper that proposes a degree-aware breadth-first search (BFS) algorithm to improve the performance and energy efficiency of graph processing on non-uniform memory access (NUMA) systems. The paper introduces related work on BFS optimization. It then analyzes bottlenecks in previous NUMA-optimized BFS algorithms and proposes a degree-aware BFS approach. Experimental results show the proposal achieves faster performance on the Graph500 benchmark and improved energy efficiency on the Green Graph500 benchmark compared to prior work.
Neighbourhood Preserving Quantisation for LSH SIGIR Poster — Sean Moran
This document proposes a neighbourhood preserving quantisation (NPQ) method for locality sensitive hashing (LSH) that assigns multiple bits per hyperplane using multiple thresholds, rather than the standard single bit. The NPQ method optimizes an F1 score using pairwise constraints from training data to determine threshold values. Evaluation on image retrieval tasks shows NPQ consistently outperforms single and double bit baselines across different projection methods, achieving higher precision-recall curves, especially at higher bit rates. Future work includes exploring variable bits per hyperplane and full retrieval evaluations.
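The multi-threshold idea can be illustrated in a few lines: two bits per hyperplane via three ordered thresholds, instead of one bit from the sign of the projection. The threshold values and the Gray-style bit assignment below are illustrative; NPQ learns its thresholds by optimising an F1 score over pairwise constraints.

```python
# 2-bit quantisation of a real-valued LSH projection using 3 thresholds.

def quantise_2bit(projection, thresholds):
    """Map a projection value to 2 bits via three ordered thresholds."""
    t1, t2, t3 = sorted(thresholds)
    if projection <= t1:
        return (0, 0)
    elif projection <= t2:
        return (0, 1)
    elif projection <= t3:
        return (1, 1)
    else:
        return (1, 0)   # adjacent regions differ by a single bit

thresholds = (-1.0, 0.0, 1.0)   # illustrative; NPQ would learn these
for p in (-2.5, -0.3, 0.4, 3.1):
    print(p, quantise_2bit(p, thresholds))
```

Points falling in neighbouring regions keep a Hamming distance of 1, which is the "neighbourhood preserving" property the quantiser is after.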
This document discusses advancements in tile-based compute rendering. It describes current proven tiled rendering techniques used in games. It then discusses opportunities for improvement like using parallel reduction to calculate depth bounds more efficiently than atomics, improved light culling techniques like modified Half-Z, and clustered rendering which divides the screen into tiles and slices to reduce lighting workloads. The document concludes clustered shading has potential savings on culling and offers benefits over traditional 2D tiling.
GDC16: Improving geometry culling for Deus Ex: Mankind Divided by Nicolas Trudel — Umbra Software
In this presentation Nicolas Trudel, a graphics programmer from Eidos-Montréal, describes how they integrated Umbra into their custom-made Dawn Engine to improve geometry culling for their latest game. The slides also include a short general description of Umbra from Sampo Lappalainen, and a future roadmap from the company's CEO, Otso Mäkinen.
LHCb Computing Workshop 2018: PV finding with CNNs — Henry Schreiner
The document discusses using a convolutional neural network (CNN) to quickly find primary vertices (PVs) in high-energy physics events recorded by the LHCb experiment. A prototype tracking algorithm is used to generate a 1D kernel density estimate (KDE) histogram from hit triplets. This histogram is then used to train a CNN to predict the locations of PVs. Initial results show the CNN approach can find PVs with 70-75% efficiency and a false positive rate of 0.08-0.13, outperforming current algorithms. Further work aims to improve resolution, find secondary vertices, and integrate the approach into iterative tracking.
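The KDE histogram that feeds the CNN can be sketched simply: smear each track's beam-axis position with a Gaussian kernel and sum the contributions per bin. The bin range, kernel width, and track positions below are invented for illustration.

```python
# 1D kernel-density-estimate histogram along the beam (z) axis.
import math

def kde_histogram(z_values, z_min=-10.0, z_max=10.0, n_bins=100, sigma=0.5):
    width = (z_max - z_min) / n_bins
    centres = [z_min + (i + 0.5) * width for i in range(n_bins)]
    hist = [0.0] * n_bins
    for z in z_values:
        for i, c in enumerate(centres):
            hist[i] += math.exp(-((c - z) ** 2) / (2 * sigma ** 2))
    return centres, hist

# Track z-positions clustered around two hypothetical vertices, z = -2 and z = 3.
tracks = [-2.1, -1.9, -2.0, 3.1, 2.9, 3.0, 3.05]
centres, hist = kde_histogram(tracks)
peak = centres[hist.index(max(hist))]
print(round(peak, 1))  # highest peak sits near z = 3, the larger cluster
```

In the real pipeline the peaks of this histogram are what the CNN learns to label as primary-vertex locations.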
This document outlines a proposed VLSI architecture for deformable motion estimation using the Adaptive Rood Pattern Search (ARPS) technique. It begins with an introduction to motion estimation and the ARPS method. It then presents the objectives to design an efficient architecture using ARPS and enhance it for mesh-based motion estimation. Simulation results showing the performance of ARPS are provided, followed by descriptions of the proposed architecture and Xilinx simulation results. Future work plans to optimize the architecture and develop an adaptive mesh-based motion estimation.
1) The document discusses using the Scalasca performance analysis tool to profile an application called LSMS running on the ORNL supercomputer.
2) Issues were encountered when compiling LSMS with CUDA and OpenMP directives; these were resolved by adding OpenMP header files.
3) Selective instrumentation was used to filter high-cost regions and reduce Scalasca's tracing memory requirements.
DynamicFusion is a method for reconstructing and tracking non-rigid scenes in real-time by extending KinectFusion. It uses a volumetric truncated signed distance function (TSDF) to integrate depth maps from multiple viewpoints into a global reconstruction. Live depth frames are aligned to a dense surface prediction generated by raycasting the TSDF. This closes the loop between mapping and localization for tracking dynamic, non-rigid scenes.
BEFLIX is an embedded domain-specific language for generating computer-animated films. Ken Knowlton created it in 1963, while working at Bell Laboratories, for the IBM 7090 mainframe computer with a Stromberg-Carlson SC4020 microfilm recorder for output, and used it to make a number of artistic, educational, and engineering films.
This document describes a VLSI architecture for block matching motion estimation using the Adaptive Rood Pattern Search (ARPS) algorithm. It aims to enhance the performance of video encoders. The proposed architecture uses ARP and URP modules to perform an initial search and refined search. It consists of address generation, comparison, and sum of absolute difference blocks. Simulation results show the architecture achieves similar PSNR to full search with significantly fewer search points, indicating better computational efficiency.
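The sum-of-absolute-differences block at the heart of that architecture is simple to state in software; the frames, block position, and candidate motion vectors below are toy values, and a real ARPS search would generate the candidates from its rood pattern rather than a fixed list.

```python
# SAD matching cost for block-matching motion estimation.

def sad(cur, ref, bx, by, dx, dy, n=2):
    """SAD between the n-by-n block of `cur` at (bx, by) and the block of
    `ref` displaced by the candidate motion vector (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total

ref = [[0, 0, 0, 0],
       [0, 9, 8, 0],
       [0, 7, 6, 0],
       [0, 0, 0, 0]]
# Current frame: the same 2x2 block shifted one pixel to the right.
cur = [[0, 0, 0, 0],
       [0, 0, 9, 8],
       [0, 0, 7, 6],
       [0, 0, 0, 0]]
candidates = [(0, 0), (-1, 0), (0, -1)]
best = min(candidates, key=lambda v: sad(cur, ref, 2, 1, *v))
print(best)  # (-1, 0): the content moved one pixel right between frames
```

The hardware version pipelines exactly this accumulation in its sum-of-absolute-difference blocks, evaluating many candidate vectors in parallel.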
The document summarizes a student project to develop an unmanned aerial vehicle (UAV) for terrain mapping. The goals were to build a functional UAV that could collect lidar data and video to create digital elevation models. Software included MATLAB and C++ for autonomous flight control. Hardware included a Pixhawk flight controller, GPS, lidar sensor, and other components. Initial simulated flight tests were promising. Future work would include live testing and controller optimization. Students learned lessons about scheduling, equipment choices, and communication.
Paper introduction: "DynamicFusion: Reconstruction and Tracking of Non-rigid Scenes in Real... — Ken Sakurada
An introduction to the CVPR 2015 Best Paper Award winner
"DynamicFusion: Reconstruction and Tracking of Non-rigid Scenes in Real-Time"
Richard A. Newcombe, Dieter Fox, Steven M. Seitz
If you notice any issues with the content, please contact the email address listed in the slides.
This document describes an experiment to verify the laws of conservation of momentum and energy using a track, two trolleys, and two light barriers. Velocities of the trolleys are measured before and after collision using the light barriers. Measured values are transferred to tables to evaluate conservation of momentum, total momentum, energy, total energy, and energy loss. Formulas shown can be used to compare results to theory for elastic and inelastic collisions.
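The quantities being compared in those tables rest on standard relations, which can be written out as follows (textbook formulas for two trolleys of masses $m_1, m_2$ and velocities $v_1, v_2$ before and $v_1', v_2'$ after; they are not copied from the document):

```latex
% Momentum and kinetic energy before and after the collision
p_{\text{before}} = m_1 v_1 + m_2 v_2, \qquad
p_{\text{after}}  = m_1 v_1' + m_2 v_2'

E_{\text{before}} = \tfrac{1}{2} m_1 v_1^2 + \tfrac{1}{2} m_2 v_2^2, \qquad
E_{\text{after}}  = \tfrac{1}{2} m_1 v_1'^2 + \tfrac{1}{2} m_2 v_2'^2

% Elastic collision with trolley 2 initially at rest (v_2 = 0)
v_1' = \frac{m_1 - m_2}{m_1 + m_2}\, v_1, \qquad
v_2' = \frac{2 m_1}{m_1 + m_2}\, v_1

% Perfectly inelastic collision (trolleys stick together)
v' = \frac{m_1 v_1 + m_2 v_2}{m_1 + m_2}
```

Momentum is conserved in both cases; kinetic energy is conserved only in the elastic case, and the shortfall in the inelastic case is the measured energy loss.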
This document summarizes surveying work conducted at the Mudabi site in 2009 and outlines plans for future work. It discusses using a total station, data collector, and software like AutoCAD and ArcGIS to create site maps, excavation grids, topographies, and cross sections. Future possibilities mentioned include photo modeling with PhotoModeler, point cloud modeling with active laser scanning, and creating 3D models with Photosynth. The document provides technical specifications about units, coordinates, and accuracy and offers tips for working in the desert conditions.
The Intersection of Game Engines & GPUs: Current & Future (Graphics Hardware ... — Johan Andersson
The document discusses current and future uses of graphics processing units (GPUs) in game engines. It covers topics like shader programming, parallel rendering, texture techniques, raytracing, and general purpose GPU (GPGPU) computing. The author envisions future improvements like more robust shader subroutines, enhanced texture sampling capabilities, hardware-accelerated sparse textures, and limited case raytracing integrated into game engines.
Hashing has witnessed an increase in popularity over the past few years due to the promise of compact encoding and fast query time. In order to be effective, hashing methods must maximally preserve the similarity between the data points in the underlying binary representation. The current best performing hashing techniques have utilised supervision. In this paper we propose a two-step iterative scheme, Graph Regularised Hashing (GRH), for incrementally adjusting the positioning of the hashing hypersurfaces to better conform to the supervisory signal: in the first step the binary bits are regularised using a data similarity graph so that similar data points receive similar bits. In the second step the regularised hashcodes form targets for a set of binary classifiers which shift the position of each hypersurface so as to separate opposite bits with maximum margin. GRH exhibits superior retrieval accuracy to competing hashing methods.
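The first step of GRH can be sketched in miniature: regularise each point's hashcode using its neighbours in the similarity graph so that similar points receive similar bits. The toy graph and codes below are invented, and this sketch takes a hard majority vote per bit where the paper interpolates between a point's own bits and its neighbours' average.

```python
# Graph regularisation of binary hashcodes (hard-vote sketch).

def regularise(codes, neighbours):
    new_codes = {}
    for point, bits in codes.items():
        # Each bit is set by majority vote over the point and its neighbours.
        votes = [codes[n] for n in neighbours[point]] + [bits]
        new_codes[point] = tuple(
            1 if sum(code[i] for code in votes) * 2 > len(votes) else 0
            for i in range(len(bits))
        )
    return new_codes

codes = {"a": (1, 0), "b": (1, 0), "c": (0, 1)}
neighbours = {"a": ["b"], "b": ["a"], "c": ["a", "b"]}
print(regularise(codes, neighbours))  # c is pulled to its neighbours' (1, 0)
```

In the full scheme these regularised codes then become classification targets for the second step, which re-fits one max-margin hypersurface per bit.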
Coq is a proof assistant based on type theory that can be used to formally verify programs and proofs. It supports program extraction to OCaml and can be used to prove properties of programs written in languages like OCaml, Java, C, and Assembly. Coq has been used to verify high assurance systems like the seL4 microkernel and TLS and JavaCard implementations. Formal verification in Coq is based on the Curry-Howard correspondence where types correspond to propositions and programs correspond to proofs. Tactics and rewriting rules are used to interactively prove goals in Coq.
The HDF Group provides software for managing large, complex data and services to support users of this technology. It derives most of its revenue from projects related to earth science, including supporting HDF-EOS, JPSS, and other earth science projects. It maintains various tools for working with HDF files and conducts maintenance, support, and development activities to support new versions and capabilities of HDF libraries and software.
This document describes IBM's Visualization Data Explorer (DX), a data analysis and visualization tool. DX allows users to work with data from multiple sources using a powerful and unified data model. It provides a visual programming environment and large library of modules for importing, analyzing, displaying, and exporting data. Examples of DX use cases are shown from fields like computational fluid dynamics, earth science, and NASA research.
This document discusses the development of HDF Explorer software for visualizing oceanographic model output data. It describes how the developers switched from using proprietary binary formats to the HDF data format for portability and access to graphical tools. It also discusses the HDF Explorer software itself, which was created to provide a graphical interface for visualizing the specific features of their ocean model, including grids, density fields, and velocity fields. The document outlines future plans to utilize HDF-EOS and add additional visualization capabilities like 3D and contours to the HDF Explorer software.
The document describes HDF Server (h5serv), which exposes HDF5 files via a RESTful API. H5serv allows full read/write access and supports HDF5 features like compression and hyperslab selection. It uses the Tornado framework to implement a stateless, cacheable API accessed through HTTP requests in JSON. This provides a web interface for HDF5 data while maintaining HDF5 functionality. Future plans include client libraries, authentication/authorization, and improving performance for large repositories.
The document discusses the HDF-EOS data format and software toolkit. It summarizes recent additions to the toolkit, including new capabilities for handling swath data and accessing metadata. Near-future development plans include additional testing and preparing the software and test data for the EOS-AM1 launch. Potential enhancements discussed include supporting new data types and sensor geometries, incorporating HDF5, and adding more sophisticated subsetting capabilities. Maintenance and user support is provided by ECS through 2022.
The document discusses MrSID, a wavelet-based image compression technology that allows for instant viewing and manipulation of massive raster images locally and over networks while maintaining high image quality. MrSID offers advantages over other formats like seamless mosaicking, multiresolution viewing, and selective decompression that allows viewing parts of large images quickly. Integrating MrSID's technology and file format into NASA's EOSDIS system could allow terabytes of remote sensing data to be accessed more efficiently by scientists, commercial users, and software applications.
The document summarizes a prototype for dataset-independent subsetting developed by UAH. The prototype allows users to spatially, temporally, and spectrally subset HDF-EOS format Earth science datasets via a web interface. It extracts only the requested data to reduce delivery time and resource usage. However, its use is currently limited as HDF-EOS has not been widely adopted and many legacy datasets are not in its format.
This document discusses incorporating ISO metadata standards into HDF files using the HDF Product Designer tool. It describes how the HDF Product Designer allows users to import pre-built ISO metadata components from a separate project into their HDF file designs. This allows essential contextual data or metadata to be stored in HDF5 files according to ISO 19115 standards.
The document discusses indexing methods for HDF5 files to enable efficient searching and access of data based on data values. It describes several existing implementations of HDF5 indexing, including PyTables, FastQuery/FastBit, Alacrity, and prototypes from The HDF Group. It also outlines current and future work to develop indexing and querying capabilities within HDF5 to allow complex multi-dimensional searches across metadata and datasets.
This document discusses the development of PyHexad, a Python-based add-in for Excel that allows users to access and analyze HDF5 data directly in Excel. PyHexad 0.1 allows users to display HDF5 file contents, read arrays and tables into Excel, and read HDF5 images. The developer is seeking feedback on usability, the Python dependency, and interest in helping advance the project to version 1.0. A prototype will be released in early August for testing and further input.
This document discusses expert systems and conventions for data modeling. It provides an overview of expert systems and how rule-based systems work using examples like traffic lights and Conway's Game of Life. It describes how an expert system could be used for the HDF Product Designer to enforce data modeling conventions. The status, challenges, and future work are outlined for integrating an expert system into the HDF Product Designer to provide convention knowledge and guide the data modeling process.
HDF is a file format for managing scientific data in heterogeneous environments. It provides data interoperability through I/O software, utilities, and search/access tools. HDF supports a variety of data types and structures, large datasets, metadata, portability across systems, fast I/O, and efficient storage. HDF-EOS extends HDF to define standard profiles for organizing Earth science remote sensing and in-situ data.
The document describes HDF-EOS5, an extension of HDF used by NASA for Earth science data. HDF-EOS5 is based on HDF5 and contains standardized structures for gridded, swath, point, and zonal average data. It provides a library for reading, writing, and manipulating these data structures and their associated metadata. The library contains functions prefixed with "HE5_" for accessing, defining, input/output, inquiry, and subsetting HDF-EOS5 data.
This document provides an overview of HDF-EOS, which is an extension to HDF that defines standard data structures for remote sensing and in-situ data with tightly coupled geolocation information. It describes the core components of HDF-EOS files, including Grid, Swath, and Point structures, and provides examples. It also outlines the development of an HDF5-based version to overcome limitations of the HDF4-based library and allow for larger files.
Deep Learning, Microsoft Cognitive Toolkit (CNTK) and Azure Machine Learning ...
Naoki (Neo) SATO
The document provides information about Microsoft's Cognitive Toolkit (CNTK), including benchmark performance comparisons with other deep learning frameworks and examples of using CNTK for common neural network architectures and natural language processing tasks. It shows that CNTK achieves state-of-the-art performance and scales nearly linearly with multiple GPUs. The document also provides code examples for defining common neural network components and training models with CNTK.
The document discusses simultaneous localization and mapping (SLAM) techniques for dense 3D reconstruction of indoor scenes. It presents the deformation graph (D-Graph) approach which can accurately model small-scale environments or estimate trajectories over large-scales. The system takes in RGB-D frame data, extracts point clouds and pose information, builds a truncated signed distance function (TSDF) and pose graph, then optimizes poses and outputs a 3D point cloud map. It compares to other dense visual SLAM algorithms on metrics like trajectory and surface model error.
Here are some useful GDB commands for debugging:
- break <function> - Set a breakpoint at a function
- break <file:line> - Set a breakpoint at a line in a file
- run - Start program execution
- next/n - Advance to the next line, stepping over function calls
- step/s - Step into function calls
- finish - Step out of current function
- print/p <variable> - Print value of a variable
- backtrace/bt - Print the call stack
- info breakpoints - List all breakpoints (abbreviated i b)
- delete <breakpoint#> - Delete a breakpoint
- layout src - Switch layout to source code view
- layout asm - Switch layout to assembly view
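Put together, a short session using these commands might look like the following (the program name, addresses, and line numbers here are invented for illustration):

```
$ gcc -g -o demo demo.c
$ gdb demo
(gdb) break main
Breakpoint 1 at 0x1151: file demo.c, line 6.
(gdb) run
Breakpoint 1, main () at demo.c:6
(gdb) next
(gdb) print total
$1 = 0
(gdb) backtrace
#0  main () at demo.c:7
(gdb) delete 1
(gdb) quit
```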
The document summarizes the use of the Sector and Sphere cloud computing software on the Open Cloud Testbed for the SC08 Bandwidth Challenge. Key points include:
- Sector is a distributed storage system and Sphere simplifies distributed data processing using a map-reduce model.
- The Open Cloud Testbed provided 101 nodes across 4 locations for running applications like TeraSort (sorting 1TB of data) and CreditStone (analyzing 3TB of credit card transactions).
- Sector/Sphere applications achieved transfer rates of up to 20Gbps for TeraSort and 7.2Gbps for CreditStone, utilizing the distributed resources for large-scale data processing.
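The map-then-reduce style of processing that Sphere applies to distributed data can be illustrated with a tiny single-machine Python sketch (an analogy only; none of this is the Sphere API):

```python
from collections import defaultdict

def map_phase(records, map_fn):
    # Apply the user-defined function to each record independently,
    # emitting (key, value) pairs; this is the part a system like
    # Sphere runs in parallel, close to where the data lives.
    for record in records:
        yield from map_fn(record)

def reduce_phase(pairs, reduce_fn):
    # Group intermediate values by key, then reduce each group.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce_fn(values) for key, values in groups.items()}

# Toy workload in the spirit of CreditStone: count records per card prefix.
transactions = ["4111 9.99", "4111 25.00", "5500 3.50"]
mapper = lambda line: [(line.split()[0], 1)]
counts = reduce_phase(map_phase(transactions, mapper), sum)
print(counts)  # {'4111': 2, '5500': 1}
```

In the real system the map phase runs on many Sector nodes at once, which is where the bandwidth numbers above come from.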
Your Game Needs Direct3D 11, So Get Started Now!
Johan Andersson
Direct3D 11 will have tessellation for smoother curves and finer details. The new compute shader will make postprocessing faster and easier. You'll need Direct3D 11 to have the best graphics, and this talk will show you how you can get started using current generation hardware.
S51281 - Accelerate Data Science in Python with RAPIDS_1679330128290001YmT7.pdf
DLow6
RAPIDS accelerates data science and machine learning workflows in Python by leveraging GPUs. It includes cuDF for GPU-accelerated pandas functionality, cuML for scikit-learn compatible machine learning algorithms, cuGraph for graph analytics, and integrations with Dask and Spark. RAPIDS has a large community of contributors and is used by many Fortune 100 companies to speed up workflows, reduce costs, and scale to large datasets.
The document summarizes the key features and capabilities of Direct3D 10, which was designed to maximize GPU performance by reducing CPU overhead and enabling more work to be done on the GPU. Some of the main features discussed include constant buffers, geometry shaders, texture arrays, and other capabilities that reduce draw calls and state changes. Direct3D 10 also provides a standardized, consistent API and enables new visual effects by exposing more of the GPU's programmability and functionality to developers.
The status of HDF-EOS and its access tools will be summarized. Updates on HDF-EOS, the Toolkit, the HDFView plug-in, and the HDF-EOS to GeoTIFF (HEG) conversion tool will be discussed, including recent changes to the software, ongoing maintenance, upcoming releases, future plans, and open issues.
The document discusses using the OGC Web Coverage Service (WCS) protocol to deliver air quality data from various sources through a system called DataFed. The WCS allows querying distributed air quality monitoring data in various formats. It provides a common data model and can deliver gridded data, images, and point data like that from monitoring stations. For air quality analysis, extending WCS to better support point data from stations would be useful.
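A GetCoverage request under this protocol is just a parameterized HTTP query. The sketch below builds one; the endpoint, coverage name, and values are invented, and the parameter names follow the general WCS 1.0 key-value-pair convention rather than any specific DataFed service:

```python
from urllib.parse import urlencode

def wcs_getcoverage_url(endpoint, coverage, bbox, time, fmt):
    # Build a WCS GetCoverage request in key-value-pair form.
    # Parameter names follow the general OGC WCS 1.0 KVP convention;
    # the endpoint and coverage name are illustrative assumptions.
    params = {
        "SERVICE": "WCS",
        "VERSION": "1.0.0",
        "REQUEST": "GetCoverage",
        "COVERAGE": coverage,
        "BBOX": ",".join(str(v) for v in bbox),
        "TIME": time,
        "FORMAT": fmt,
    }
    return f"{endpoint}?{urlencode(params)}"

url = wcs_getcoverage_url(
    "http://example.org/datafed/wcs", "pm25_daily",
    (-125, 24, -66, 50), "2006-01-11", "NetCDF")
print(url)
```

A point-data extension, as discussed above, would layer station-style constraints onto this same request pattern.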
Architecture Aware Algorithms and Software for Peta and Exascale
inside-BigData.com
Jack Dongarra from the University of Tennessee presented these slides at Ken Kennedy Institute of Information Technology on Feb 13, 2014.
Listen to the podcast review of this talk: http://insidehpc.com/2014/02/13/week-hpc-jack-dongarra-talks-algorithms-exascale/
RAPIDS – Open GPU-accelerated Data Science
Data Works MD
RAPIDS is an initiative driven by NVIDIA to accelerate the complete end-to-end data science ecosystem with GPUs. It consists of several open source projects that expose familiar interfaces, making it easy to accelerate the entire data science pipeline, from ETL and data wrangling to feature engineering, statistical modeling, machine learning, and graph analysis.
Corey J. Nolet
Corey has a passion for understanding the world through the analysis of data. He is a developer on the RAPIDS open source project focused on accelerating machine learning algorithms with GPUs.
Adam Thompson
Adam Thompson is a Senior Solutions Architect at NVIDIA. With a background in signal processing, he has spent his career participating in and leading programs focused on deep learning for RF classification, data compression, high-performance computing, and managing and designing applications targeting large collection frameworks. His research interests include deep learning, high-performance computing, systems engineering, cloud architecture/integration, and statistical signal processing. He holds a Masters degree in Electrical & Computer Engineering from Georgia Tech and a Bachelors from Clemson University.
With the open source Geo2Tag platform, developers can use JSON or XML to manage location references in apps for Nokia X and Nokia Asha phones. In this webinar, we’ll show how to use the Geo2Tag API and how to manage a local database of georeferences. We’ll begin the training by introducing the fundamentals of Location Based Services and the REST API of the Geo2Tag LBS Platform (www.geo2tag.org). We’ll focus on networking, JSON, and web services. Then we will demonstrate several applications developed on top of Geo2Tag and share the newest enhancements to the platform. We’ll end the training with a discussion of integrating Geo2Tag with third-party map widgets.
State of the Art Web Mapping with Open Source
OSCON Byrum
This document discusses the importance of open source tools and data for web mapping. It begins by providing background on TileMill and Mapbox, which provide open source tools for making maps. It then discusses key concepts in web mapping, like geospatial data formats, tile rendering, and minimal code examples. Modern approaches to web mapping involve preprocessing data, using tile renderers and caches, and gradually rendering more on the client side. Upcoming improvements may optimize tiled formats and storage. TileMill is demonstrated as an open source tool for making maps. The talk concludes by emphasizing other open mapping tools, like CartoDB and Stamen, that build on these concepts.
Klessydra t - designing vector coprocessors for multi-threaded edge-computing...
RISC-V International
The document describes a proposed Klessydra-T1 vector coprocessor architecture designed for multi-threaded edge computing cores. It achieves a 3x speedup over a baseline core through configurable SIMD and MIMD vector acceleration schemes. Benchmark results show cycle count reductions for workloads like convolution and matrix multiplication when using the coprocessor in various SISD, SIMD, and MIMD configurations. Resource utilization and maximum frequency are also analyzed.
2006-01-11 Data Flow & Interoperability in DataFed Service-based AQ Analysis ...
Rudolf Husar
The document discusses using open standards like OGC Web Coverage Service (WCS) to provide access to air quality data through web services. WCS allows querying subsets of coverages, which are datasets representing varying phenomena over space and time. It is applicable to grid, image and point data types. Efforts are ongoing to add point coverage support for monitoring station data and improve compatibility between WCS servers and clients.
Presentation about two emerging standards activities that I started and led in MPEG: point cloud compression, a new image and video format, and NBMP for media delivery in 5G networks. Presented at Philips R&D in Eindhoven, the Netherlands.
This document discusses how to optimize HDF5 files for efficient access in cloud object stores. Key optimizations include using large dataset chunk sizes of 1-4 MiB, consolidating internal file metadata, and minimizing variable-length datatypes. The document recommends creating files with paged aggregation and storing file content information in the user block to enable fast discovery of file contents when stored in object stores.
This document provides an overview of HSDS (Highly Scalable Data Service), which is a REST-based service that allows accessing HDF5 data stored in the cloud. It discusses how HSDS maps HDF5 objects like datasets and groups to individual cloud storage objects to optimize performance. The document also describes how HSDS was used to improve access performance for NASA ICESat-2 HDF5 data on AWS S3 by hyper-chunking datasets into larger chunks spanning multiple original HDF5 chunks. Benchmark results showed that accessing the data through HSDS provided over 2x faster performance than other methods like ROS3 or S3FS that directly access the cloud storage.
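The hyper-chunking idea, combining many small HDF5 chunks into fewer, larger reads, can be sketched as a byte-range coalescing step. This is an illustrative sketch of the concept, not HSDS source code:

```python
def coalesce_ranges(ranges, max_gap=0):
    # Merge sorted (offset, length) byte ranges whose gaps are at most
    # max_gap, so many small chunk reads become a few large object-store
    # GETs. Illustrative sketch of the hyper-chunking idea only.
    merged = []
    for offset, length in sorted(ranges):
        if merged and offset - (merged[-1][0] + merged[-1][1]) <= max_gap:
            last_off, last_len = merged[-1]
            merged[-1] = (last_off, max(last_off + last_len, offset + length) - last_off)
        else:
            merged.append((offset, length))
    return merged

# Four 1 MiB chunks stored back to back collapse into one 4 MiB request:
chunks = [(0, 1 << 20), (1 << 20, 1 << 20), (2 << 20, 1 << 20), (3 << 20, 1 << 20)]
print(coalesce_ranges(chunks))  # [(0, 4194304)]
```

Fewer, larger requests is exactly the trade-off that favors object stores like S3, where per-request latency dominates small reads.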
This document summarizes the current status and focus of The HDF Group, a non-profit organization based in Champaign, IL, that develops and maintains HDF software and data formats. It provides an overview of recent HDF5, HDF4, and HDFView releases, notes areas of focus for software quality improvements, increased transparency, strengthening the community, and modernizing HDF products, and invites support and participation in upcoming user group meetings.
This document provides an overview of HSDS (HDF Server and Data Service), which allows HDF5 files to be stored and accessed from the cloud. Key points include:
- HSDS maps HDF5 objects like datasets and groups to individual cloud storage objects for scalability and parallelism.
- Features include streaming support, fancy indexing for complex queries, and caching for improved performance.
- HSDS can be deployed on Docker, Kubernetes, or AWS Lambda depending on needs.
- Case studies show HSDS is used by organizations like NREL and NSF to make petabytes of scientific data publicly accessible in the cloud.
This document discusses creating cloud-optimized HDF5 files by rearranging internal structures for more efficient data access in cloud object stores. It describes cloud-native and cloud-optimized storage formats, with the latter involving storing the entire HDF5 file as a single object. The benefits of cloud-optimized HDF5 include fast scanning and using the HDF5 library. Key aspects covered include using optimal chunk sizes, compression, and minimizing variable-length datatypes.
This document discusses updates and performance improvements to the HDF5 OPeNDAP data handler. It provides a history of the handler since 2001 and describes recent updates including supporting DAP4, new data types, and NetCDF data models. A performance study showed that passing compressed HDF5 data through the handler without decompressing/recompressing led to speedups of around 17-30x by leveraging HDF5 direct I/O APIs. This allows outputting HDF5 files as NetCDF files much faster through the handler.
This document provides instructions for using the Hyrax software to serve scientific data files stored on Amazon S3 using the OPeNDAP data access protocol. It describes how to generate ancillary metadata files called DMR++ files using the get_dmrpp tool that provide information about the data file structure and locations. The document explains how to run get_dmrpp inside a Docker container to process data files on S3 and generate customized DMR++ files that the Hyrax server can use to serve the files to clients.
This document provides an overview and examples of accessing cloud data and services using the Earthdata Login (EDL), Pydap, and MATLAB. It discusses some common problems users encounter, such as being unable to access HDF5 data on AWS S3 using MATLAB or read data from OPeNDAP servers using Pydap. Solutions presented include using EDL to get temporary AWS tokens for S3 access in MATLAB and providing code examples on the HDFEOS website to help users access S3 data and OPeNDAP services. The document also notes some limitations, such as tokens being valid for only 1 hour, and workarounds like requesting new tokens or using the MATLAB HDF5 API instead of the netCDF API.
The HDF5 Roadmap and New Features document outlines upcoming changes and improvements to the HDF5 library. Key points include:
- HDF5 1.13.x releases will include new features like selection I/O, the Onion VFD for versioned files, improved VFD SWMR for single-writer multiple-reader access, and subfiling for parallel I/O.
- The Virtual Object Layer allows customizing HDF5 object storage and introduces terminal and pass-through connectors.
- The Onion VFD stores versions of HDF5 files in a separate onion file for versioned access.
- VFD SWMR improves on legacy SWMR by implementing single-writer/multiple-reader capabilities at the virtual file driver level.
This document discusses user analysis of the HDFEOS.org website and plans for future improvements. It finds that the majority of the site's 100 daily users are "quiet", not posting on forums or other interactive elements. The main user types are locators, who search for examples or data; mergers, who combine or mosaic datasets; and converters, who change file formats. The document outlines recent updates focused on these user types, like adding Python examples for subsetting and calculating latitude and longitude. It proposes future work on artificial intelligence/machine learning uses of HDF files and examples for processing HDF data in the cloud.
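One of the recurring "converter/locator" tasks mentioned, computing latitude and longitude for a regular grid from its corner coordinates, can be sketched like this (the function name and values are invented for illustration; real HDF-EOS grids also carry a projection to apply):

```python
def grid_centers(upleft, lowright, ncols, nrows):
    # Compute cell-center longitudes/latitudes for a regular grid,
    # given its upper-left and lower-right corner coordinates.
    # Invented helper in the spirit of the site's Python examples.
    lon0, lat0 = upleft
    lon1, lat1 = lowright
    dlon = (lon1 - lon0) / ncols
    dlat = (lat1 - lat0) / nrows
    lons = [lon0 + (i + 0.5) * dlon for i in range(ncols)]
    lats = [lat0 + (j + 0.5) * dlat for j in range(nrows)]
    return lons, lats

lons, lats = grid_centers((-180.0, 90.0), (180.0, -90.0), 4, 2)
print(lons)  # [-135.0, -45.0, 45.0, 135.0]
print(lats)  # [45.0, -45.0]
```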
This document summarizes a presentation about the current status and future directions of the Hierarchical Data Format (HDF) software. It provides updates on recent HDF5 releases, development efforts including new compression methods and ways to access HDF5 data, and outreach resources. It concludes by inviting the audience to share wishes for future HDF development.
The document describes H5Coro, a new C++ library for reading HDF5 files from cloud storage. H5Coro was created to optimize HDF5 reading for cloud environments by minimizing I/O operations through caching and efficient HTTP requests. Performance tests showed H5Coro was 77-132x faster than the previous HDF5 library at reading HDF5 data from Amazon S3 for NASA's SlideRule project. H5Coro supports common HDF5 elements but does not support writing or some complex HDF5 data types and messages to focus on optimized read-only performance for time series data stored sequentially in memory.
This document summarizes MathWorks' work to modernize MATLAB's support for HDF5. Key points include:
1) MATLAB now supports HDF5 1.10.7 features like single-writer/multiple-reader access and virtual datasets through new and updated low-level functions.
2) Performance benchmarks show some improvements but also regressions compared to the previous HDF5 version, and work continues to optimize code and support future versions.
3) There are compatibility considerations for Linux filter plugins, but interim solutions are provided until MathWorks can ship a single HDF5 version.
HSDS provides HDF as a service through a REST API that can scale across nodes. New releases will enable serverless operation using AWS Lambda or direct client access without a server. This allows HDF data to be accessed remotely without managing servers. HSDS stores each HDF object separately, making it compatible with cloud object storage. Performance on AWS Lambda is slower than a dedicated server but has no management overhead. Direct client access has better performance but limits collaboration between clients.
HDF5 and Zarr are data formats that can be used to store and access scientific data. This presentation discusses approaches to translating between the two formats. It describes how HDF5 files were translated to the Zarr format by creating a separate Zarr store to hold HDF5 file chunks, and storing chunk location metadata. It also discusses an implementation that translates Zarr data to the HDF5 format by using a special chunking layout and storing chunk information in an HDF5 compound dataset. Limitations of the translations include lack of support for some HDF5 dataset properties in Zarr, and lack of support for some Zarr compression methods in the HDF5 implementation.
The document discusses HDF for the cloud, including new features of the HDF Server and what's next. Key points:
- HDF Server uses a "sharded schema" that maps HDF5 objects to individual storage objects, allowing parallel access and updates without transferring entire files.
- Implementations include HSDS software that uses the sharded schema with an API and SDKs for different languages like h5pyd for Python.
- New features of HSDS 0.6 include support for POSIX, Azure, AWS Lambda, and role-based access control.
- Future work includes direct access to storage without a server intermediary for some use cases.
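The object-per-storage-key idea behind the sharded schema can be sketched with a made-up key layout (the real HSDS schema differs in its details):

```python
def storage_keys(bucket, domain, objects):
    # Map each HDF5 object (group, dataset, chunk) to its own storage
    # object key, so clients can read or update one object without
    # transferring the whole file. The key layout here is a made-up
    # illustration of the sharded-schema idea, not the exact HSDS schema.
    keys = {}
    for obj_id, kind in objects:
        keys[obj_id] = f"{bucket}/{domain}/{kind}/{obj_id}"
    return keys

objs = [("g-0001", "groups"), ("d-0042", "datasets"), ("c-0042_0_0", "chunks")]
keys = storage_keys("mybucket", "home/user/sample.h5", objs)
print(keys["d-0042"])  # mybucket/home/user/sample.h5/datasets/d-0042
```

Because every object has its own key, parallel readers and writers can touch disjoint keys without coordinating over a single file, which is what enables the parallel access described above.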
This document compares different methods for accessing HDF and netCDF files stored on Amazon S3, including Apache Drill, THREDDS Data Server (TDS), and HDF5 Virtual File Driver (VFD). A benchmark test of accessing a 24GB HDF5/netCDF-4 file on S3 from Amazon EC2 found that TDS performed the best, responding within 2 minutes, while Apache Drill failed after 7 minutes. The document concludes that TDS 5.0 is the clear winner based on performance and support for role-based access control and HDF4 files, but the best solution depends on use case and software.
This document discusses STARE-PODS, a proposal to NASA/ACCESS-19 to develop a scalable data store for earth science data using the SpatioTemporal Adaptive Resolution Encoding (STARE) indexing scheme. STARE allows diverse earth science data to be unified and indexed, enabling the data to be partitioned and stored in a Parallel Optimized Data Store (PODS) for efficient analysis. The HDF Virtual Object Layer and Virtual Data Set technologies can then provide interfaces to access the data in STARE-PODS in a familiar way. The goal is for STARE-PODS to organize diverse data for alignment and parallel/distributed storage and processing to enable integrative analysis at scale.
This document provides an overview and update on HDF5 and its ecosystem. Key points include:
- HDF5 1.12.0 was recently released with new features like the Virtual Object Layer and external references.
- The HDF5 library now supports accessing data in the cloud using connectors like S3 VFD and REST VOL without needing to modify applications.
- Projects like HDFql and H5CPP provide additional interfaces for querying and working with HDF5 files from languages like SQL, C++, and Python.
- The HDF5 community is moving development to GitHub and improving documentation resources on the HDF wiki site.
This document summarizes new features in HDF5 1.12.0, including support for storing references to objects and attributes across files, new storage backends using a virtual object layer (VOL), and virtual file drivers (VFDs) for Amazon S3 and HDFS. It outlines the HDF5 roadmap for 2019-2022, which includes continued support for HDF5 1.8 and 1.10, and new features in future 1.12.x releases like querying, indexing, and provenance tracking.
More from The HDF-EOS Tools and Information Center
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
IndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar, with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
Digital Marketing Trends in 2024 | Guide for Staying Ahead
Wask
https://www.wask.co/ebooks/digital-marketing-trends-in-2024
Feeling lost in the digital marketing whirlwind of 2024? Technology is changing, consumer habits are evolving, and staying ahead of the curve feels like a never-ending pursuit. This e-book is your compass. Dive into actionable insights to handle the complexities of modern marketing. From hyper-personalization to the power of user-generated content, learn how to build long-term relationships with your audience and unlock the secrets to success in the ever-shifting digital landscape.
Main news related to the CCS TSI 2023 (2023/1695)
Jakub Marek
An English 🇬🇧 translation of the presentation accompanying the talk I gave on the main changes brought by CCS TSI 2023 at the biggest Czech conference on communications and signalling systems on railways, held at the Clarion Hotel Olomouc from 7 to 9 November 2023 (konferenceszt.cz). It was attended by around 500 participants and 200 online followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Webinar: Designing a schema for a Data Warehouse
Federico Razzoli
Are you new to data warehouses (DWH)? Do you need to check whether your data warehouse follows the best practices for a good design? In both cases, this webinar is for you.
A data warehouse is a central relational database that contains all measurements about a business or an organisation. This data comes from a variety of heterogeneous data sources, which includes databases of any type that back the applications used by the company, data files exported by some applications, or APIs provided by internal or external services.
But designing a data warehouse correctly is a hard task, which requires first gathering information about the business processes that need to be analysed. These processes must then be translated into so-called star schemas: denormalised schemas in which each table represents either a dimension or facts.
We will discuss these topics:
- How to gather information about a business;
- Understanding dictionaries and how to identify business entities;
- Dimensions and facts;
- Setting a table granularity;
- Types of facts;
- Types of dimensions;
- Snowflakes and how to avoid them;
- Expanding existing dimensions and facts.
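To make the star-schema idea concrete, here is a minimal sketch using Python's built-in SQLite module. The table and column names (dim_date, dim_product, fact_sales) are illustrative inventions, not taken from the webinar: dimensions hold descriptive attributes, and the fact table holds one row per event at the chosen granularity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension tables: descriptive attributes, one row per business entity.
cur.execute("""CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER)""")
cur.execute("""CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY, name TEXT, category TEXT)""")

# Fact table: one row per sale (the chosen granularity), holding
# foreign keys to the dimensions plus numeric measures.
cur.execute("""CREATE TABLE fact_sales (
    date_id INTEGER, product_id INTEGER, quantity INTEGER, amount REAL)""")

cur.executemany("INSERT INTO dim_date VALUES (?,?,?,?)",
                [(1, "2024-06-01", "2024-06", 2024),
                 (2, "2024-06-02", "2024-06", 2024)])
cur.executemany("INSERT INTO dim_product VALUES (?,?,?)",
                [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
cur.executemany("INSERT INTO fact_sales VALUES (?,?,?,?)",
                [(1, 1, 3, 30.0), (1, 2, 1, 15.0), (2, 1, 2, 20.0)])

# A typical analytical query: measures aggregated over dimension attributes.
rows = cur.execute("""
    SELECT p.category, d.month, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_date d ON f.date_id = d.date_id
    JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.category, d.month""").fetchall()
print(rows)  # [('Hardware', '2024-06', 65.0)]
```

Note how every analytical query follows the same shape: join the fact table to the dimensions it references, then group by dimension attributes and aggregate the measures.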
How to Interpret Trends in the Kalyan Rajdhani Mix Chart, by Chart Kalyan
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
Ocean Lotus threat actors project (2024), by John Sitima
Ocean Lotus cyber threat actors represent a sophisticated, persistent, and politically motivated group that poses a significant risk to organizations and individuals in the Southeast Asian region. Their continuous evolution and adaptability underscore the need for robust cybersecurity measures and international cooperation to identify and mitigate the threats posed by such advanced persistent threat groups.
HCL Notes and Domino license cost reduction in the world of DLAU, by panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU and licensing under the CCB and CCX models have been a hot topic in the HCL community since last year. As a Notes or Domino customer, you may be struggling with unexpectedly high user counts and license fees. You may be wondering how this new kind of licensing works and what benefits it brings you. Above all, you certainly want to stay within your budget and save costs wherever possible. We understand that, and we want to help!
We explain how to resolve common configuration problems that can cause more users to be counted than necessary, and how to identify and remove superfluous or unused accounts in order to save money. There are also some practices that can lead to unnecessary expenses, for example using a person document instead of a mail-in database for shared mailboxes. We show you such cases and their solutions. And of course we explain the new license model.
Join this webinar, in which HCL Ambassador Marc Thomas and guest speaker Franz Walder introduce you to this new world. It gives you the tools and the know-how to keep track of everything. You will be able to reduce your costs through an optimized Domino configuration and keep them low in the future.
Topics covered:
- Reducing license costs by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how best to use it
- Tips for common problem areas, such as team mailboxes, functional/test users, etc.
- Real-world examples and best practices you can apply immediately
Threats to mobile devices are increasingly prevalent and growing in scope and complexity. Users of mobile devices want to take full advantage of their devices' features, but many of those features trade security for convenience and capability. This best-practices guide outlines steps users can take to better protect personal devices and information.
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
Van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications. He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
5th LF Energy Power Grid Model Meet-up Slides, by DanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e, located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science and technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
- Insightful presentations covering two practical applications of the Power Grid Model.
- An update on the latest advancements in Power Grid Model technology during the first and second quarters of 2024.
- An interactive brainstorming session to discuss and propose new feature requests.
- An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Driving Business Innovation: Latest Generative AI Advancements & Success Story, by Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
3. What is HDF-EOS?
- An HDF “profile”
- An extension to HDF
- A library built “on top” of HDF
- Three new data objects
- Three new programming interfaces
4. Why HDF-EOS?
Standard HDF lacks well-defined ways of handling some key needs of EOSDIS:
- Data structures for Earth remote sensing data and in-situ measurements with:
  – tightly coupled geolocation information
  – subsetting services based on geolocation
- ECS metadata model
5. HDF-EOS Platforms
HDF-EOS Version 2.3 is available for:
- Sun SPARC - Solaris
- SGI - IRIX
- DEC Alpha - Digital UNIX
- HP 9000 - HP-UX
- IBM RS/6000 - AIX
- PC - Windows 95/NT
6. HDF-EOS Interfaces
C and FORTRAN interfaces for:
- Grid Data (GD)
- Point Data (PT)
- Swath Data (SW)
11. Components of the Grid Interface
- Access
- Definition
- Basic I/O
- Inquiry
- Subset
- Tiling
12. Tips on Writing a Grid
Order of calls is significant:
– Setting a compression method affects all subsequently defined fields
– Setting a tiling scheme affects all subsequently defined fields
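The "order of calls" rule above can be pictured as each field snapshotting the settings in force at definition time. This is a conceptual illustration in plain Python, not the HDF-EOS C API; the class and method names are invented for the sketch.

```python
# Conceptual model: compression/tiling settings are captured by each
# field when it is defined, so fields defined earlier are unaffected
# by later set_* calls.
class GridWriter:
    def __init__(self):
        self.compression = None
        self.tiling = None
        self.fields = {}

    def set_compression(self, method):
        self.compression = method  # applies only to later definitions

    def set_tiling(self, dims):
        self.tiling = dims         # applies only to later definitions

    def define_field(self, name):
        # Snapshot the settings in force right now.
        self.fields[name] = (self.compression, self.tiling)

g = GridWriter()
g.define_field("Temperature")       # no compression, no tiling
g.set_compression("deflate")
g.define_field("Pressure")          # deflate, no tiling
g.set_tiling((100, 100))
g.define_field("Humidity")          # deflate, tiled
print(g.fields["Temperature"])      # (None, None)
```

The practical consequence: decide on compression and tiling before defining any field that should use them.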
13. Grid Subsetting Features
- By Geolocation
  – GDdefboxregion/GDextractboxregion
- By “Vertical” Field
  – GDdefvrtregion/GDextractvrtregion
- By Time (special case of vertical)
Tip: use Geolocation, then Vertical/Temporal
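The geolocation-then-vertical pattern can be sketched in plain Python (this is an illustration of the subsetting concept, not the HDF-EOS API; the grid values and helper names are invented):

```python
# A 2-level vertical stack over a 4x3 lat/lon grid.
lats = [10.0, 20.0, 30.0, 40.0]
lons = [100.0, 110.0, 120.0]
data = [[[lev * 100 + i * 10 + j for j in range(3)] for i in range(4)]
        for lev in range(2)]

def box_region(lats, lons, lat_range, lon_range):
    """Indices of grid points falling inside the lat/lon box."""
    rows = [i for i, la in enumerate(lats) if lat_range[0] <= la <= lat_range[1]]
    cols = [j for j, lo in enumerate(lons) if lon_range[0] <= lo <= lon_range[1]]
    return rows, cols

def extract(data, level, rows, cols):
    """Pull one vertical level of the boxed region."""
    return [[data[level][i][j] for j in cols] for i in rows]

# Geolocation first (cheap index arithmetic on the box)...
rows, cols = box_region(lats, lons, (15.0, 35.0), (105.0, 125.0))
# ...then refine by vertical level.
subset = extract(data, 1, rows, cols)
print(subset)  # [[111, 112], [121, 122]]
```

Subsetting by geolocation first shrinks the region before the vertical/temporal refinement, which mirrors the tip on the slide.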
18. Tips on Writing a Point
Every level in a Point data set must be linked into the hierarchy.
Before two levels can be linked, a link field must exist.
19. Point Subsetting Features
- By Time
  – PTdeftimeperiod/PTextractperiod
- By Geolocation
  – PTdefboxregion/PTextractregion
Tip: use one or the other, not both
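Point subsetting amounts to filtering irregular records either by a time period or by a lat/lon box. A conceptual sketch in plain Python (not the HDF-EOS API; the record fields and helper names are invented), applying one filter or the other as the slide advises:

```python
# Point data: irregular records, each with time, position, and a measure.
points = [
    {"time": 100.0, "lat": 12.0, "lon": 101.0, "temp": 25.1},
    {"time": 200.0, "lat": 25.0, "lon": 111.0, "temp": 22.4},
    {"time": 300.0, "lat": 38.0, "lon": 119.0, "temp": 18.9},
]

def by_time(points, start, stop):
    """Records whose timestamp falls inside [start, stop]."""
    return [p for p in points if start <= p["time"] <= stop]

def by_box(points, lat_range, lon_range):
    """Records whose position falls inside the lat/lon box."""
    return [p for p in points
            if lat_range[0] <= p["lat"] <= lat_range[1]
            and lon_range[0] <= p["lon"] <= lon_range[1]]

print([p["temp"] for p in by_time(points, 150.0, 350.0)])                # [22.4, 18.9]
print([p["temp"] for p in by_box(points, (20.0, 40.0), (110.0, 120.0))]) # [22.4, 18.9]
```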
21. Tips for HDF-EOS Coding
Most operations (read, write, subset) work on a single field at a time.
Region IDs and Period IDs are interchangeable and can be reused to further reduce a subset.
Partial writes (appending) on compressed fields are only supported through tiling.