This tutorial is designed for new HDF5 users. We will go over a brief history of the HDF and HDF5 software, cover basic HDF5 Data Model objects and their properties, give an overview of the HDF5 libraries and APIs, and discuss the HDF5 programming model. Simple C and Fortran examples and the Java tool HDFView will be used to illustrate HDF5 concepts.
This presentation is a step-by-step Fast DDS tutorial on Windows, showing how to build a publisher and subscriber from scratch. It is done using Visual Studio, and also from the command line using the CMake options.
My study notes on the 2010 Haystack paper, which talks about Facebook's photo storage system. The design shares some similarity with Google's GFS (as in the 2003 paper).
Containers are incredibly convenient to package applications and deploy them quickly across the data center.
This talk will introduce RunX, a new project under LF Edge that aims at bringing containers to the edge with extra benefits. At the core, RunX is an OCI-compatible container runtime to run software packaged as containers as Xen micro-VMs. RunX allows traditional containers to be executed with a minimal overhead as virtual machines, providing additional isolation and real-time support.
It also introduces new types of containers designed with edge and embedded deployments in mind. RunX enables RTOSes and bare-metal apps to be packaged as containers, delivered to the target using the powerful container infrastructure, and deployed at runtime as Xen micro-VMs. Physical resources, such as accelerators and FPGA blocks, can be dynamically assigned to them.
This presentation will go through the architecture of RunX and the new deployment scenarios it enables. It will provide an overview of the integration with Yocto Project via the meta-virtualization layer and describe how to build a complete system with Xen and RunX.
The presentation will come with a live demo on embedded hardware.
Ceph: Open Source Storage Software Optimizations on Intel® Architecture for C... (Odinot Stanislas)
After a short intro on distributed storage and a description of Ceph, Jian Zhang presents some interesting benchmarks in this deck: sequential tests, random tests and, above all, a comparison of results before and after optimization. The configuration parameters touched and the optimizations applied (large page numbers, Omap data on a separate disk, ...) yield at least a 2x performance gain.
These slides accompanied a presentation by Steve Breker of Artefactual Systems, delivered as part of AtoM Camp Cambridge, a three-day boot camp held at St John's College, Cambridge University, May 9-11, 2017. For more information, see:
https://wiki.accesstomemory.org/Community/Camps/SJC2017
These slides are intended for developers who are interested in modifying the default look and feel of AtoM - known as the Dominion theme - and developing a custom theme plugin. They include some theme examples, how to register a plugin in Symfony, and some ideas of the elements you can modify via theming, with examples.
For the full video of this presentation, please visit:
https://www.edge-ai-vision.com/2020/12/making-edge-ai-inference-programming-easier-and-flexible-a-presentation-from-texas-instruments/
For more information about edge AI and computer vision, please visit:
https://www.edge-ai-vision.com
Manisha Agrawal, Product Marketing Engineer at Texas Instruments, presents the “Making Edge AI Inference Programming Easier and Flexible” tutorial at the September 2020 Embedded Vision Summit.
Deploying an AI model at the edge doesn’t have to be challenging—but it often is. Embedded processing vendors have unique sets of software tools for deploying models. It takes time and investment to learn to use proprietary tools and to optimize the edge implementation to achieve your desired performance. While embedded vendors are providing proprietary tools for model deployment, the open source community is also advancing to standardize the model deployment process and make it hardware agnostic.
Texas Instruments has adopted open source software frameworks to make model deployment easier and more flexible. In this talk, you will learn about the struggles developers face when deploying models for inference on embedded processors and how TI addresses these critical software development challenges. You will also discover how TI enables faster time-to-market using a flexible open source development approach without the need to compromise performance, accuracy or power requirements.
Best Practices in PHP Application Deployment (Shahar Evron)
An overview of the challenges in managing the web application development lifecycle and how a correct deployment system can help. A few common deployment techniques are reviewed. In addition, some info on an upcoming Zend Server deployment feature.
Storage 101: Rook and Ceph - Open Infrastructure Denver 2019 (Sean Cohen)
Starting from the basics, we explore the advantages of using Rook as a Storage operator to serve Ceph storage, the leading Software-Defined Storage platform in the Open Source world. Ceph automates the internal storage management, while Rook automates the user-facing operations and effectively turns a storage technology into a service transparent to the user. The combination delivers an impressive improvement in UX and provides the ideal storage platform for Kubernetes.
A comprehensive examination of use cases and open problems will complement our review of the Rook architecture. We will deep-dive into what Rook does well, what it does not do (yet), and what trade-offs using a storage operator involves operationally. With live access to a running cluster, we will showcase Rook in action as we discuss its capabilities.
https://www.openstack.org/summit/denver-2019/summit-schedule/events/23515/storage-101-rook-and-ceph
Domino Tech School - Upgrading to Notes/Domino V10: Best Practices (Christoph Adler)
Are you looking to deploy Domino V10 but don’t know where to start? Upgrade servers or clients first? Should I upgrade the ODS? If you have questions like these, this session is for you. Get a complete understanding of the process to upgrade to Domino V10, and learn from best practices and tips from the field.
This tutorial targets HDF5 application developers and users who still use HDF5 1.6 releases, and anyone who is interested in the features of the HDF5 1.8.x libraries. We will discuss how applications written for versions 1.6.x and earlier can be seamlessly moved to the latest HDF5 releases. We will also talk about new features of the 1.8.x HDF5 library, such as the redesigned group object, links, creation order, and the various performance tuning knobs.
These slides demonstrate how to use visualization and analysis tools such as IDV and GrADS to access HDF data via OPeNDAP.
To see animation in some slides, please visit:
http://hdfeos.org/workshops/ws13/presentations/day1/jxl_opendap_tutorial.ppt
Accessibility and usability of NPP/NPOESS data in HDF5 can be enhanced by providing tools that simplify and standardize how data is accessed and presented. In this project, The HDF Group is creating such tools in the form of software to read and write certain key data types and data aggregates used in NPP/NPOESS data products, and extending HDFView to extract, present and export these data effectively. In particular, the work will focus on NPP/NPOESS use of HDF5 region references and quality flags. The HDF Group will also provide high quality user support for the project.
Data produced by the Ozone PEATE from the Ozone Mapping and Profiler Suite (OMPS) instruments are to be stored in HDF5, not HDF-EOS, but will still need some features similar to those in HDF-EOS. In particular, a mechanism for handling dimension names will be needed. This poster proposes a method to handle dimension names for arrays in HDF5 in a manner commensurate with HDF-EOS5.
An update on HDF, including a status report on the HDF Group, an overview of recent changes to the HDF4 and HDF5 libraries and tools, plans for future releases, HDF Group projects and collaborations, and future plans.
As the volume and complexity of data from myriad Earth Observing platforms, both remote sensing and in-situ, increases, so does the demand for access to both the data and the information products derived from them. The audience is no longer restricted to investigator teams with specialist science credentials. Non-specialist users, from scientists in other disciplines and the science-literate public to teachers, the general public, and decision makers, want access. What prevents them from accessing these resources? The very complexity of specialist-developed data formats, data set organizations, and specialist terminology. What can be done in response? We must shift the burden from the user to the data provider. To achieve this, our data infrastructures will likely need greater internal code and data-structure complexity in order to achieve (relatively) simpler end-user complexity. Evidence from numerous technical and consumer markets supports this scenario. We will cover the elements of modern data environments, what the new use cases are, and how we can respond to them.
In this talk, we will give an update on the HDF5 OPeNDAP project. We will describe the new features of the OPeNDAP HDF5 data handler. We will also introduce a new HDF5-friendly OPeNDAP client library and demonstrate how it can help users view and analyze remote HDF-EOS5 data served by the OPeNDAP HDF5 handler. A demo will be presented with a customized OPeNDAP visualization client (GrADS) that uses the library.
ENVI and IDL software support HDF and HDF-EOS. Capabilities and the HDF tools built on ENVI and IDL will be reviewed. The current development will be discussed and demonstrated.
HDF-EOS is a software library designed to support NASA Earth Observing System (EOS) science data. HDF is the Hierarchical Data Format developed by The HDF Group. Specific data structures in HDF-EOS5 which are containers for science data are: Grid, Point, Zonal Average and Swath. These data structures are constructed from standard HDF5 data objects, using EOS conventions, through the use of a software library. This presentation is intended to familiarize current HDF-EOS users with the structure of HDF-EOS5 files and the Grid, Swath, Point and Zonal Average structures used in these files.
This tutorial is designed for new HDF5 users. We will cover basic HDF5 Data Model objects and their properties, give an overview of the HDF5 Libraries and APIs, and discuss the HDF5 programming model. Simple C and Fortran examples will be used to illustrate HDF5 concepts.
This tutorial is designed for new HDF5 users. We will cover HDF5 abstractions such as datasets, groups, attributes, and datatypes. Simple C examples will cover the programming model and basic features of the API, and will give new users the knowledge they need to navigate through the rich collection of HDF5 interfaces. Participants will be guided through an interactive demonstration of the fundamentals of HDF5.
This tutorial is for new HDF5 users.
This 2009 tutorial slide will cover basic HDF5 Data Model objects and their properties. It will include an overview of the HDF5 Libraries and APIs, and describe the HDF5 programming model. Simple programming examples and the HDFView data browser will be used to illustrate HDF5 concepts and start developing your own HDF5 based applications.
This tutorial is for new HDF5 users.
This tutorial is designed for HDF5 users with some HDF5 experience.
It will cover advanced features of the HDF5 library for achieving better I/O performance and efficient storage. The following HDF5 features will be discussed: partial I/O, chunked storage layout, compression and other filters including new n-bit and scale+offset filters. Significant time will be devoted to the discussion of complex HDF5 datatypes such as strings, variable-length datatypes, array and compound datatypes.
This Tutorial gives a brief introduction to HDF5 for people who have never used it. It covers the HDF5 Data Model including HDF5 objects and their properties. It also briefly describes the HDF5 Programming Model and prepares participants for further self-study of HDF5 and hands-on sessions.
Numerous scientific teams use the HDF5 format to store very large datasets. Efficient use of this data in a distributed environment depends on client applications being able to read any subset of the data without transferring the entire file to the local machine. The goal of the HDF5-iRODS Project was to develop an HDF5-iRODS module for the iRODS datagrid server that supported this capability, and to apply the technology to an NCSA/SDSC Strategic Applications Program (SAP) project, FLASH.
A joint team from The HDF Group (representing NCSA) and the SDSC SRB group collaborated to accomplish the project goal. The team implemented five HDF5 microservices functions on the iRODS server, and developed an iRODS FLASH slice client application. The client implementation also includes a JNI interface that allows HDFView, a standard tool for browsing HDF5 files, to access HDF5 files stored remotely in iRODS. Finally, three new collection client/server calls were added to the iRODS APIs, making it easier for users to query the content of an iRODS collection.
This tutorial will introduce the three levels of the HDF-Java products: the HDF-Java wrapper (or Java Native Interfaces to the standard HDF libraries), the HDF-Java object package, and the HDFView. The Java wrapper provides standard Java APIs that allow applications to call the C HDF libraries from Java. The HDF-Java object package implements HDF data objects, e.g. Groups and Datasets, in an object-oriented form and makes it easy for applications to use the libraries. The HDFView is a visual tool for browsing and editing HDF4 and HDF5 files.
This Tutorial is designed for HDF5 users with some HDF5 experience. It will cover properties of the HDF5 objects that affect I/O performance and file sizes. The following HDF5 features will be discussed: partial I/O, chunking and compression, and complex HDF5 datatypes such as strings, variable-length arrays and compound datatypes.
We will also discuss references to objects and dataset regions and how they can be used for indexing. Participants will work with the Tutorial examples and exercises during the hands-on sessions.
This Tutorial is designed for new HDF5 users. We will cover basic HDF5 Data Model objects and their properties; we will give an overview of the HDF5 Libraries and APIs, and discuss the HDF5 programming model. Simple C and Fortran examples will be used to illustrate HDF5 concepts. Participants will work with the Tutorial examples and exercises during the hands-on sessions.
It will cover features of the HDF5 library for achieving better I/O performance and efficient storage. The following HDF5 features will be discussed: datatypes and partial I/O.
This tutorial is for persons who are already familiar with HDF5 and wish to take advantage of some of its advanced features.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024 (Albert Hoitingh)
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
The Art of the Pitch: WordPress Relationships and Sales (Laura Byrne)
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe (Paige Cruz)
Monitoring and observability aren’t traditionally found in software curricula, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company’s observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
UiPath Test Automation using UiPath Test Suite series, part 3 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
PHP Frameworks: I want to break free (IPC Berlin 2024) - Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Generative AI Deep Dive: Advancing from Proof of Concept to Production (Aggregage)
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview (Prayukth K V)
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities, spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on countries – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
GraphRAG is All You Need? LLM & Knowledge Graph (Guy Korland)
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
UiPath Test Automation using UiPath Test Suite series, part 4 (DianaGray10)
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimizing testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Introduction to HDF5
1. Introduction to HDF5
HDF & HDF-EOS Workshop XII
October 15, 2008
2. Topics Covered
- Introduce HDF5
- Describe HDF5 Data and Programming Models
- Walk Through Example Code
3. For More Information …
All workshop slides will be available from:
http://hdfeos.org/workshops/ws12/workshop_twelve.php
4. What is HDF5?
HDF = Hierarchical Data Format
• Data model, library and file format for managing data
• Tools for accessing data in the HDF5 format
5. Brief History of HDF
1987: At NCSA (University of Illinois), a task force formed to create an architecture-independent format and library: AEHOO (All Encompassing Hierarchical Object Oriented format), which became HDF.
Early 1990's: NASA adopted HDF for the Earth Observing System project.
1996: DOE's ASC (Advanced Simulation and Computing) Project began collaborating with the HDF group (NCSA) to create "Big HDF" (the increase in computing power of DOE systems at LLNL, LANL and Sandia National Labs required bigger, more complex data files). "Big HDF" became HDF5.
1998: HDF5 was released with support from National Labs, NASA, NCSA.
2006: The HDF Group spun off from the University of Illinois as a non-profit corporation.
6. Why HDF5?
In one sentence ...
7. Answering big questions …
[Figure: matter and the universe; life and nature; weather and climate. Total Column Ozone (Dobson) maps for August 24, 2001 and August 24, 2002, on a 60-610 Dobson color scale]
8. … involves big data …
9. … varied data …
[Figure: examples of varied data, from the LCI Tutorial. Thanks to Mark Miller, LLNL]
10. … and complex relationships …
[Figure: complex relationships in genome-assembly data — SNP score, contig summaries, discrepancies, contig qualities, coverage depth, traces, reads, aligned bases, read quality, contig, percent match]
11. … on big computers …
… and small computers …
12. How do we…
• Describe our data?
• Read it? Store it? Find it? Share it? Mine it?
• Move it into, out of, and between computers and repositories?
• Achieve storage and I/O efficiency?
• Give applications and tools easy access to our data?
13. Solution: HDF5!
• Can store all kinds of data in a variety of ways
• Runs on most systems
• Lots of tools to access data
• Emphasis on standards (HDF-EOS, CGNS)
• Library and format emphasis on I/O efficiency and storage
14. Structure of HDF5 Library
Applications
Object API (C, F90, C++, Java)
Library internals
Virtual file I/O
File or other “storage”
16. HDF5 Applications & Domains
Examples: thermonuclear simulations, product modeling, data mining tools, visualization tools, climate models
Communities: HDF-EOS, CGNS, ASC, … (simulation, visualization, remote sensing, …)

The software stack:
• Applications
• HDF5 Data Model & API
• Virtual File Layer (I/O drivers): stdio, split files, MPI I/O, custom, …
• Storage (HDF5 format): a file, split metadata and raw-data files, a file on a parallel file system, a user-defined device
17. Lots of Layers in HDF5!
"Ogres are like onions." – Shrek
Just like Shrek, once you get to know HDF5 you will really like it!
19. An HDF5 file is a container…
…into which you can put your data objects.

lat | lon | temp
----|-----|-----
 12 |  23 |  3.1
 15 |  24 |  4.2
 17 |  21 |  3.6
21. HDF5 Data Model
Primary Objects
• Groups
• Datasets
Additional ways to organize and annotate data
• Attributes
• Storage and access properties
Everything else is built from these parts.
23. Dataspaces
Two roles:
• A dataspace contains spatial info about a dataset stored in a file:
  - rank and dimensions
  - a permanent part of the dataset definition
  (e.g., Rank = 2, Dimensions = 4x6)
• Partial I/O: a dataspace describes the application's data buffer and the data elements participating in I/O
  (e.g., Rank = 1, Dimension = 10)
24. Write – from memory to disk
[Figure: data moving from a memory buffer to a dataset on disk]
25. Partial I/O
Move just part of a dataset.
(a) A slab from a 2D array to the corner of a smaller 2D array
(b) A regular series of blocks from a 2D array to a contiguous sequence at a certain offset in a 1D array
In each case, the number of elements selected in memory and on disk must be the same.
26. Datatypes (array elements)
• Datatype – how to interpret a data element
• Permanent part of the dataset definition
• Two classes: atomic and compound
27. Datatypes
• HDF5 atomic types include:
  - integer & float
  - user-definable (e.g., 13-bit integer)
  - variable-length types (e.g., strings)
  - references to objects/dataset regions
  - enumeration – names mapped to integers
• HDF5 compound types:
  - comparable to C structs ("records")
  - members can be atomic or compound types
28. HDF5 dataset: array of records
An HDF5 dataset with dimensionality 5 x 3, where the datatype of each element is a record containing an int8, an int4, an int16, and a 2x3x2 array of float32.
29. Properties
• Properties are characteristics of HDF5 objects that can be modified
• Default properties handle most needs
• By changing properties, you can take advantage of the more powerful features in HDF5
30. Special Storage Properties
chunked      Better subsetting access time; extensible
compressed   Improves storage efficiency, transmission speed
extensible   Arrays can be extended in any direction
split file   Metadata in one file, raw data in another
             (e.g., metadata for dataset "Fred" in File A, data for "Fred" in File B)
31. Attributes (optional)
• Attribute – data of the form "name = value", attached to an object
• Operations similar to dataset operations, but…
  - not extensible
  - no compression or partial I/O
• Can be overwritten, deleted, or added during the "life" of a dataset
33. Groups
• A mechanism for organizing collections
• Every file starts with a root group ("/")
• Similar to UNIX directories
• Can have attributes

[Figure: root group "/" containing groups A, B, and C; A contains k, B contains l and m]
34. Path to HDF5 Object in a File
/ (root)
/x
/foo
/foo/temp
/foo/bar/temp
[Figure: root group "/" containing x and group foo; foo contains temp and group bar; bar contains its own temp]
37. Useful Tools For New Users
h5dump:
Tool to “dump” or display contents of HDF5 files
h5cc, h5c++, h5fc:
Scripts to compile applications
HDFView:
Java browser to view HDF4 and HDF5 files
38. H5dump Command-line Utility To View HDF5 File
h5dump [--header] [-a <names>] [-d <names>] [-g <names>]
       [-l <names>] [-t <names>] [-p] <file>

--header      Display header only; no data is displayed.
-a <names>    Display the specified attribute(s).
-d <names>    Display the specified dataset(s).
-g <names>    Display the specified group(s) and all their members.
-l <names>    Display the value(s) of the specified soft link(s).
-t <names>    Display the specified named datatype(s).
-p            Display properties.

<names> is one or more appropriate object names.
46. Simple HDF5 File in HDFView
Right-click and select "Open".
Right-click and select "Show Properties".
47. Simple HDF5 File in HDFView
52. Operations Supported by the API
• Create objects (groups, datasets, attributes, complex datatypes, …)
• Assign storage and I/O properties to objects
• Perform complex subsetting during read/write
• Use a variety of I/O "devices" (parallel, remote, etc.)
• Transform data during I/O
• Make inquiries on file and object structure, content, and properties
53. General Programming Paradigm
• Properties of object are optionally defined
Creation properties
Access property lists
• Object is opened or created
• Object is accessed, possibly many times
• Object is closed
54. Order of Operations
• An order is imposed on operations by argument dependencies.
  For example: a file must be opened before a dataset, because the dataset open call requires a file handle as an argument.
• Objects can be closed in any order.
55. The General HDF5 API
• Currently C, Fortran 90, Java, and C++ bindings.
• C routines begin with prefix H5?, where ? is a character corresponding to the type of object the function acts on.
Example functions:
H5D : Dataset interface,   e.g., H5Dread
H5F : File interface,      e.g., H5Fopen
H5S : dataSpace interface, e.g., H5Sclose
56. HDF5 Defined Types
For portability, the HDF5 library has its own defined types:

hid_t:    object identifiers (native integer)
hsize_t:  size used for dimensions (unsigned long or unsigned long long)
hssize_t: used for specifying coordinates and sometimes for dimensions (signed long or signed long long)
herr_t:   function return value
hvl_t:    variable-length datatype

For C, include hdf5.h in your HDF5 application.
57. The HDF5 API
• For flexibility, the API is extensive
300+ functions
• This can be daunting… but there is hope
A few functions can do a lot
Start simple
Build up knowledge as more features are needed
58. Basic Functions
H5Fcreate (H5Fopen)   create (open) File
H5Screate_simple      create dataSpace
H5Dcreate (H5Dopen)   create (open) Dataset
H5Dread, H5Dwrite     access Dataset
H5Dclose              close Dataset
H5Sclose              close dataSpace
H5Fclose              close File
60. High Level APIs
• Included along with the HDF5 library
• Simplify steps for creating, writing, and reading
objects
• Do not entirely ‘wrap’ HDF5 library
62. Steps to Create a File
1. Decide on special properties the file should have
   • Creation properties, like the size of the user block
   • Access properties, such as metadata cache size
   • Or use default properties (H5P_DEFAULT)
2. Create property lists, if necessary
3. Create the file
4. Close the file and the property lists, as needed
63. Code: Create a File
hid_t  file_id;
herr_t status;

file_id = H5Fcreate ("file.h5", H5F_ACC_TRUNC,
                     H5P_DEFAULT, H5P_DEFAULT);
status = H5Fclose (file_id);

This produces a file containing just the root group "/".
Note: return codes are not checked for errors in the code samples.
65. Steps to Create a Dataset
1. Define dataset characteristics
   • Dataspace – 4x6
   • Datatype – integer
   • Properties, if needed (or use H5P_DEFAULT)
2. Decide where to put it
   • Obtain a location ID:
     - a group ID puts it in that group
     - a file ID puts it in the root group ("/")
3. Create the dataset in the file
4. Close everything
66. HDF5 Pre-defined Datatype Identifiers
HDF5 defines* a set of datatype identifiers per HDF5 session. For example:

C Type  | HDF5 File Type                 | HDF5 Memory Type
--------|--------------------------------|------------------
int     | H5T_STD_I32BE, H5T_STD_I32LE   | H5T_NATIVE_INT
float   | H5T_IEEE_F32BE, H5T_IEEE_F32LE | H5T_NATIVE_FLOAT
double  | H5T_IEEE_F64BE, H5T_IEEE_F64LE | H5T_NATIVE_DOUBLE

* The value of a datatype identifier is NOT fixed.
67. Pre-defined File Datatype Identifiers
Examples:
H5T_IEEE_F64LE  Eight-byte, little-endian, IEEE floating point
H5T_STD_I32LE   Four-byte, little-endian, signed two's complement integer

The name combines an architecture* with a programming type.
NOTE: These are what you see in the file. The name is the same everywhere and explicitly defines a datatype.
* STD = "an architecture with a semi-standard type, like 2's complement integer, unsigned integer, …"
68. Pre-defined Native Datatypes
Examples of predefined native types in C:

H5T_NATIVE_INT    (int)
H5T_NATIVE_FLOAT  (float)
H5T_NATIVE_UINT   (unsigned int)
H5T_NATIVE_LONG   (long)
H5T_NATIVE_CHAR   (char)

NOTE: Memory types are different for each machine and are used for reading/writing.
69. Dataset Creation Property List
A dataset creation property list holds information on how to organize data in storage: contiguous (H5P_DEFAULT), chunked, or chunked & compressed.
70. Code: Create a Dataset
hid_t   file_id, dataset_id, dataspace_id;
hsize_t dims[2];
herr_t  status;

/* Create a file */
file_id = H5Fcreate ("file.h5", H5F_ACC_TRUNC,
                     H5P_DEFAULT, H5P_DEFAULT);

/* Create a dataspace: rank 2, current dims 4 x 6 */
dims[0] = 4;
dims[1] = 6;
dataspace_id = H5Screate_simple (2, dims, NULL);

/* Create a dataset: pathname "A", datatype H5T_STD_I32BE,
   the dataspace, and the (default) property list */
dataset_id = H5Dcreate (file_id, "A", H5T_STD_I32BE,
                        dataspace_id, H5P_DEFAULT);

/* Terminate access to the dataset, dataspace, and file */
status = H5Dclose (dataset_id);
status = H5Sclose (dataspace_id);
status = H5Fclose (file_id);
71. Example Code - H5Dwrite
status = H5Dwrite (dataset_id, H5T_NATIVE_INT, H5S_ALL,
                   H5S_ALL, H5P_DEFAULT, dset_data);

The first argument is the dataset identifier from H5Dcreate or H5Dopen; the second is the memory datatype.
72. Example Code – H5Dwrite
status = H5Dwrite (dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                   H5P_DEFAULT, dset_data);

The third and fourth arguments are the memory and file dataspaces; H5S_ALL selects the entire dataspace. The fifth argument is a data transfer property list (MPI I/O, transformations, …).
73. Partial I/O
By default, both the memory and file dataspaces are H5S_ALL. For partial I/O, replace them:
• Get a dataspace: H5Screate_simple, H5Dget_space
• Modify the dataspace: H5Sselect_hyperslab, H5Sselect_elements
74. Example Code – H5Dread
status = H5Dread (dataset_id, H5T_NATIVE_INT,
H5S_ALL, H5S_ALL, H5P_DEFAULT, dset_rdata);
75. High Level APIs: HDF5 Lite (H5LT)
#include "H5LT.h"
…
file_id = H5Fcreate ("file.h5", H5F_ACC_TRUNC,
                     H5P_DEFAULT, H5P_DEFAULT);
status = H5LTmake_dataset (file_id, "A", 2, dims,
                           H5T_STD_I32BE, data);
status = H5Fclose (file_id);
77. Example: Create a Group
[Figure: file.h5 with root group "/" containing dataset A (a 4x6 array of integers) and group B]
78. Steps to Create a Group
1. Decide where to put it – the root group
   • Obtain the location ID
2. Decide on a name – "B"
3. Create the group in the file
4. (Eventually) close the group
79. Code: Create a Group
hid_t file_id, group_id;
...
/* Open "file.h5" */
file_id = H5Fopen ("file.h5", H5F_ACC_RDWR,
                   H5P_DEFAULT);

/* Create group "/B" in the file. The last argument is a size hint
   for the number of bytes to store names of objects; 0 = default. */
group_id = H5Gcreate (file_id, "B", 0);

/* Close group and file. */
status = H5Gclose (group_id);
status = H5Fclose (file_id);
80. Thank you!
This work was supported by the Cooperative Agreement with the
National Aeronautics and Space Administration (NASA) under NASA
grant NNX06AC83A and NNX08A077A. Any opinions, findings,
conclusions or recommendations expressed in this material are those of
the author(s) and do not necessarily reflect the views of NASA.
Editor's Notes
The CFD General Notation System (CGNS) provides a general, portable, and extensible standard for the storage and retrieval of computational fluid dynamics (CFD) analysis data. It consists of a collection of conventions, and free and open software implementing those conventions. It is self-descriptive, machine-independent, well-documented, and administered by an international steering committee.
The CGNS implementation of SIDS, so-called MLL, was originally built using a file format called ADF (Advanced Data Format). This format was based on a common file format system previously in use at McDonnell Douglas. The ADF has worked extremely well, requiring little repair, upgrade, or maintenance over the last decade. However, ADF does not have parallel I/O or data compression capabilities, and does not have the support and tools that the storage format HDF5 offers. HDF5, supported by The HDF Group, has rapidly grown to become a world-wide format standard for storing scientific data. HDF5 has parallel capability as well as a broader support base than ADF.
This shows that you can mix objects of different types according to your needs. Typically, there will be metadata stored with objects to indicate what type of object they are.
Like HDF4, HDF5 has a grouping structure. The main difference is that every HDF5 file starts with a root group, whereas HDF4 doesn’t need any groups at all.
Data Array is an ordered collection of identically typed data items distinguished by their indices
Metadata:
Dataspace – Rank, dimensions; spatial info about dataset
Datatype – Information on how to interpret your data
Storage Properties – How array is organized
Attributes – User-defined metadata (optional)
Here is an example of a basic HDF5 object.
Notice that each element in the 3D array is a record with four values in it.
To create this file, we would start by creating the file itself. When you create a file, the root group gets created with it. So every file has at least that one group.