This document provides examples of adding Climate and Forecast (CF) metadata attributes to HDF5 files using different programming languages and interfaces. It summarizes how to add CF attributes like units, long_name, and coordinates to datasets in HDF5 files using C, Fortran, Python, netCDF4, HDF-EOS5, and HDFView. It also briefly mentions using HDF5 dimension scales to associate coordinate variables with datasets.
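As a rough illustration of the Python route mentioned above, the following sketch attaches CF-style attributes to datasets with h5py and links coordinate variables through HDF5 dimension scales. The file name, dataset names, and values are made up for the example, and the file is kept in memory so nothing is written to disk.

```python
import numpy as np
import h5py

# Hypothetical in-memory file (core driver, no backing store).
with h5py.File("cf_example.h5", "w", driver="core", backing_store=False) as f:
    lat = f.create_dataset("lat", data=np.linspace(-90.0, 90.0, 4))
    lon = f.create_dataset("lon", data=np.linspace(0.0, 270.0, 4))
    temp = f.create_dataset("temperature",
                            data=np.full((4, 4), 280.0, dtype="f4"))

    # CF attributes are stored as ordinary HDF5 attributes on each dataset.
    lat.attrs["units"] = "degrees_north"
    lat.attrs["long_name"] = "latitude"
    lon.attrs["units"] = "degrees_east"
    lon.attrs["long_name"] = "longitude"
    temp.attrs["units"] = "K"
    temp.attrs["long_name"] = "air temperature"
    temp.attrs["coordinates"] = "lat lon"

    # Dimension scales associate the coordinate variables with the data axes.
    lat.make_scale("lat")
    lon.make_scale("lon")
    temp.dims[0].attach_scale(lat)
    temp.dims[1].attach_scale(lon)

    # Read the metadata back while the file is still open.
    temp_units = temp.attrs["units"]
    scale_names = [temp.dims[i][0].name for i in range(2)]
```

The same attributes could be written from C or Fortran through the HDF5 attribute API; h5py is used here only because it keeps the example short.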
This tutorial is designed for anyone who needs to work with data stored in HDF5 files. The tutorial will cover functionality and useful features of the HDF5 utilities h5dump, h5diff, h5repack, h5stat, h5copy, h5check and h5repart. We will also introduce a prototype of the new h52jpeg conversion tool and the recently released h5perf_serial tool used for performance studies. We will briefly introduce HDFView. Details of HDFView and HDF-Java will be discussed in a separate tutorial.
A brief overview of using HDF5 with Python and Andrew Collette's h5py module will be presented, including examples which show how and why Python can be used in the place of HDF5 tools. Extensions to the HDF5 API will be proposed which would further improve the utility of Python/h5py.
An introduction to the Python programming language and its numerical abilities will be presented. With this background, Andrew Collette's h5py module, an HDF5-Python interface, will be explained, highlighting the unique and useful similarities between Python data structures and HDF5.
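The similarity between Python data structures and HDF5 can be sketched briefly: in h5py, groups behave much like dictionaries and datasets much like NumPy arrays. The group and dataset names below are invented for illustration, and the file lives only in memory.

```python
import numpy as np
import h5py

# Hypothetical in-memory file; nothing is written to disk.
with h5py.File("structures.h5", "w", driver="core", backing_store=False) as f:
    run = f.create_group("run_001")

    # Assigning an array to a key creates a dataset, as with a dict entry.
    run["signal"] = np.arange(10, dtype="i4")
    run.attrs["operator"] = "example"

    # Groups support dict-style listing and lookup.
    group_keys = list(f.keys())
    member_names = list(run.keys())

    # Datasets support NumPy-style slicing; only the slice is read.
    first_three = run["signal"][:3]

    # A POSIX-like path reaches nested objects in one lookup.
    last_value = f["run_001/signal"][9]
```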
From a talk by Andrew Collette to the Boulder Earth and Space Science Informatics Group (BESSIG) on November 20, 2013.
This talk explores how researchers can use the scalable, self-describing HDF5 data format together with the Python programming language to improve the analysis pipeline, easily archive and share large datasets, and improve confidence in scientific results. The discussion will focus on real-world applications of HDF5 in experimental physics at two multimillion-dollar research facilities: the Large Plasma Device at UCLA, and the NASA-funded hypervelocity dust accelerator at CU Boulder. This event coincides with the launch of a new O’Reilly book, Python and HDF5: Unlocking Scientific Data.
As scientific datasets grow from gigabytes to terabytes and beyond, the use of standard formats for data storage and communication becomes critical. HDF5, the most recent version of the Hierarchical Data Format originally developed at the National Center for Supercomputing Applications (NCSA), has rapidly emerged as the mechanism of choice for storing and sharing large datasets. At the same time, many researchers who routinely deal with large numerical datasets have been drawn to Python by its ease of use and rapid development capabilities.
Over the past several years, Python has emerged as a credible alternative to scientific analysis environments like IDL or MATLAB. In addition to stable core packages for handling numerical arrays, analysis, and plotting, the Python ecosystem provides a huge selection of more specialized software, reducing the amount of work necessary to write scientific code while also increasing the quality of results. Python’s excellent support for standard data formats allows scientists to interact seamlessly with colleagues using other platforms.
HDF5 is a powerful and feature-rich creature, and getting the most out of it requires powerful tools. The MathWorks provides a "low-level" interface to the HDF5 library that closely corresponds to the C API and exposes much of its richness. This short tutorial will present ways to use the low-level MATLAB interface to build those tools and tackle such topics as subsetting, chunking, and compression.
The NPOESS program uses Unified Modeling Language (UML) to describe the format of the HDF5 files produced. For each unique type of data product, the HDF5 storage organization and the means to retrieve the data are the same. This provides a consistent data retrieval interface for manual and automated users of the data; without it, custom development and cumbersome maintenance would be required. The data formats are described using UML to provide a profile of HDF5 files.
This poster will show each unique data type so far produced by NPOESS, and the contents of the files. We will also have overhead snapshots of the data contents.
Dimension Scales for HDF-EOS2 and HDF-EOS5 field dimensions were added to the new release of HDF-EOS. The new APIs will be presented and sample outputs will be shown. The need for development of new APIs for handling Dimension Scales will be discussed.
Datasets with millions of events in charm decays at LHCb have prompted the development of powerful fitting and analysis tools capable of handling unbinned datasets using GPUs and multithreaded architectures.
GooFit, the original GPU fitting program with a familiar syntax resembling classic RooFit, has undergone significant redesign and has expanded physics and computing capabilities. The performance has been improved and tested on a variety of systems. GooFit 2.0 is easier than ever to install, develop, and use on any system.
A new templated header-only library, Hydra, provides a highly optimized general framework for fits, Monte Carlo generation, integration, and more. The design and benefits of this system along with initial tests will be shown.
Finally, a model-independent search for direct CP violation using an unbinned approach called an energy test was performed directly using the Thrust library (which both of the previous packages are based on). Public results from this analysis and performance comparisons will be presented.
This tutorial is designed for anyone who needs to work with data stored in HDF5 files. It will cover functionality and useful features of the HDF5 utilities, which include h5dump, h5diff, h5repack, h5stat, h5copy, h5check and h5repart. The tutorial will also introduce recent changes and new features of the utilities.
HDFView is a visual tool for browsing and editing HDF4 and HDF5 files. Some basic features and new changes of HDFView will be presented. Details of recent development in HDF-Java products will be discussed in a separate presentation.
Despite being a slow interpreter, Python is a key component in high-performance computing (HPC). Python is easy to use. C++ is fast. Together they are a beautiful blend. A new tool, pybind11, makes this approach even more attractive to HPC code. It focuses on the niceties C++11 brings in. Beyond the syntactic sugar around the Python C API, it is interesting to see how pybind11 handles the vast difference between the two languages, and what matters to HPC.
Notes about moving from Python to C++, PyCon TW 2020 - Yung-Yu Chen
Python is fast to write and quick to deliver results, but it is not a good choice when you cannot sacrifice any runtime. That is the time to switch to C++. But we certainly don’t want to give up the productivity available in Python, and we worry about the complexity of C++. The way to go is to design the backbone system in C++ and expose the API in Python. Then we can enjoy the capabilities that come from the complex compiler while scripting the system just like Python.
This Tutorial is designed for the HDF5 users with some HDF5 experience. It will cover advanced features of the HDF5 library for achieving better I/O performance and efficient storage. The following HDF5 features will be discussed: partial I/O, compression and other filters including new n-bit and scale+offset filters, and data storage options. Significant time will be devoted to the discussion of complex HDF5 datatypes such as strings, variable-length, array and compound datatypes. Participants will work with the Tutorial examples and exercises during the hands-on sessions.
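A few of the features listed above (chunked storage with a compression filter, partial I/O, and a compound datatype) can be sketched with h5py as follows. This is only an illustration under assumed names and sizes, not the tutorial's own material; the n-bit and scale+offset filters are not shown.

```python
import numpy as np
import h5py

# Hypothetical in-memory file; names, shapes, and filter settings are examples.
with h5py.File("advanced.h5", "w", driver="core", backing_store=False) as f:
    data = np.arange(1000 * 100, dtype="i4").reshape(1000, 100)

    # Chunked storage with the gzip filter; shuffle often improves the ratio.
    dset = f.create_dataset("grid", data=data, chunks=(100, 100),
                            compression="gzip", compression_opts=4,
                            shuffle=True)

    # Partial I/O: only the chunks overlapping this slice are read and decoded.
    block = dset[200:300, 10:20]

    # Compound datatype described by a NumPy structured dtype.
    rec_type = np.dtype([("id", "i4"), ("value", "f8"), ("label", "S8")])
    records = np.array([(1, 0.5, b"alpha"), (2, 1.5, b"beta")], dtype=rec_type)
    table = f.create_dataset("records", data=records)

    # Reading a single field of the compound dataset.
    values = table["value"]

    chunk_shape = dset.chunks
    filter_name = dset.compression
```

Chunk shape is a tuning choice: chunks that match the typical read pattern tend to avoid decompressing data that is never used.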
Threads and Callbacks for Embedded Python - Yi-Lung Tsai
Python is a great choice for writing custom plug-ins for existing applications. Extending existing applications with Python programs is also practical. For large systems, multi-threaded programming is ubiquitous, along with asynchronous programming such as event routing. This presentation focuses on dealing with threads and callbacks while embedding Python in other applications.
AfterGlow is a script that assists with the visualization of log data. It reads CSV files and converts them into a graph description. See http://afterglow.sf.net for more information.
This short presentation gives an overview of AfterGlow and outlines the features and capabilities of the tool. It discusses some of the harder to understand features by showing some configuration examples that can be used as a starting point for some more sophisticated setups.
AfterGlow is one of the most downloaded security visualization tools, with over 17,000 downloads.
The GooFit package provides physicists a simple, familiar syntax for manipulating probability density functions and performing fits, but is highly optimized for data analysis on NVIDIA GPUs and multithreaded CPU backends. GooFit is being updated to version 2.0, bringing a host of new features. A completely revamped and redesigned build system makes GooFit easier to install, develop with, and run on virtually any system. Unit testing, continuous integration, and advanced logging options are improving the stability and reliability of the system. Developing new PDFs now uses standard CUDA terminology and provides a lower barrier for new users. The system now has built-in support for multiple graphics cards or nodes using MPI, and is being tested on a wide range of different systems.
GooFit also has significant improvements in performance on some GPU architectures due to optimized memory access. Support for time-dependent four body amplitude analyses has also been added.
Reproducible Computational Research in R - Samuel Bosch
A short presentation with pointers on getting started with reproducible computational research in R. Some of the topics include git, R package development, document generation with R markdown, saving plots, saving tables and using packrat.
[GSoC 2017] gopy: Updating gopy to support Python3 and PyPy - Dong-hee Na
gopy is an excellent tool that generates (and compiles) a CPython extension module from a Go package. I hope more developers can make full use of gopy to bring their Go code to Python. To make gopy more advanced, it is necessary to provide APIs for various Python implementations and versions, such as CPython 2/3 and PyPy. This can be done with CFFI or ctypes. Moreover, many Go features are not yet implemented in gopy, so we need to add support for Go features such as slices, interfaces, and maps.
My goal is to update gopy by using CFFI to support Python3 and PyPy, and to write detailed documentation.
HDF5 is designed to work well on high performance parallel systems and clusters. This tutorial will review the high performance features of HDF5, including:
o Design of Parallel HDF5 Library
o Parallel HDF5 Programming Model and Environment
Participants should be familiar with MPI and MPI I/O and have a basic knowledge of the sequential HDF5 Library. The lecture will prepare them for the Parallel I/O hands-on session.
This Tutorial is designed for the users who have exposure to MPI I/O and basic concepts of HDF5 and would like to learn about Parallel HDF5 Library. The Tutorial will cover Parallel HDF5 design and programming model. Several C and Fortran examples will be used to illustrate the basic ideas of the Parallel HDF5 programming model. Some performance issues including collective chunked I/O will be discussed. Participants will work with the Tutorial examples and exercises during the hands-on sessions.
This Tutorial gives a brief introduction to HDF5 for people who have never used it. It covers the HDF5 Data Model including HDF5 objects and their properties. It also briefly describes the HDF5 Programming Model and prepares participants for further self-study of HDF5 and hands-on sessions.
In this Tutorial we will discuss different storage methods for the HDF5 files (split files, family of files, multi-files), and datasets (compressed, external, compact), and related filters and properties. This tutorial will introduce advanced features of HDF5, including:
o Property lists
o Compound datatypes
o hyperslab selections
o point selection
o references to objects and regions
o extendable datasets
o mounting files
o group iterations
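Several items in the list above can be sketched in a few lines of h5py: an extendable dataset, a strided hyperslab read, an object reference, and group iteration. All names and sizes are invented for the example; property lists and file mounting are not covered here.

```python
import numpy as np
import h5py

# Hypothetical in-memory file; dataset and group names are examples only.
with h5py.File("features.h5", "w", driver="core", backing_store=False) as f:
    # Extendable dataset: unlimited first dimension, which requires chunking.
    log = f.create_dataset("log", shape=(0, 3), maxshape=(None, 3),
                           dtype="f8", chunks=(64, 3))
    for step in range(2):
        start = log.shape[0]
        log.resize(start + 5, axis=0)      # grow by five rows
        log[start:start + 5, :] = float(step)

    # Hyperslab selection: every second row of column 0.
    every_other = log[::2, 0]

    # Selecting an increasing list of row indices (h5py "fancy" indexing).
    picked = log[[0, 4, 9], 1]

    # Object reference stored in an attribute, then dereferenced.
    grp = f.create_group("derived")
    grp.attrs["source"] = log.ref
    resolved_name = f[grp.attrs["source"]].name

    # Group iteration: visit() calls the function with each object's path.
    visited = []
    f.visit(visited.append)

    final_shape = log.shape
```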
The overall evolution towards microservices has caused a lot of IT leaders to radically rethink architectures and platforms. One can hardly keep up with the rapid onslaught of new distributed technologies. The same people who just yesterday asked "how can we deploy Docker containers?" are now asking "how can we operate Kubernetes-as-a-Service on-premise?", and are about to start asking "how can we operate the open source frameworks of our choice, such as Spark, TensorFlow, HDFS, and more, as a service across hybrid clouds?". This session will discuss: Challenges of orchestrating and operating.
Listen up, developers. You are not special. Your infrastructure is not a beautiful and unique snowflake. You have the same tech debt as everyone else. This is a talk about a better way to build and manage infrastructure: Terraform Modules. It goes over how to build infrastructure as code, package that code into reusable modules, design clean and flexible APIs for those modules, write automated tests for the modules, and combine multiple modules into an end-to-end tech stack in minutes.
You can find the video here: https://www.youtube.com/watch?v=LVgP63BkhKQ
22nd Athens Big Data Meetup - 1st Talk - MLOps Workshop: The Full ML Lifecycl... - Athens Big Data
Title: MLOps Workshop: The Full ML Lifecycle - How to Use ML in Production
Speakers: Spyros Cavadias (https://www.linkedin.com/in/spyros-cavadias/), Konstantinos Pittas (https://www.linkedin.com/in/konstantinos-pittas-83310270/), Thanos Gkinakos (https://www.linkedin.com/in/thanos-gkinakos-03582a128/)
Date: Saturday, December 17, 2022
Event: https://www.meetup.com/athens-big-data/events/289927468/
Hive is a data warehouse system for Hadoop that facilitates easy data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop compatible file systems. Hive provides a mechanism to project structure onto this data and query the data using a SQL-like language called HiveQL. At the same time this language also allows traditional map/reduce programmers to plug in their custom mappers and reducers when it is inconvenient or inefficient to express this logic in HiveQL.
This tutorial is designed for users with some HDF5 experience. It will cover advanced features of the HDF5 library that can be used to achieve better I/O performance and more efficient storage. The following HDF5 features will be discussed: partial I/O; compression and other filters, including new n-bit and scale+offset filters and data storage options. Significant time will be devoted to the discussion of complex HDF5 datatypes such as strings, variable-length datatypes, array datatypes, and compound datatypes.
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient... - Mind IT Systems
Healthcare providers often struggle with the complexities of chronic conditions and remote patient monitoring, as each patient requires personalized care and ongoing monitoring. Off-the-shelf solutions may not meet these diverse needs, leading to inefficiencies and gaps in care. This is where custom healthcare software offers a tailored solution, ensuring improved care and effectiveness.
Troubleshooting 9 Types of OutOfMemoryError - Tier1 app
Even though ‘java.lang.OutOfMemoryError’ appears on the surface to be one single error, there are in fact 9 types of OutOfMemoryError. Each type has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Providing Globus Services to Users of JASMIN for Environmental Data Analysis - Globus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Paketo Buildpacks: the best way to build OCI images? DevopsDa... - Anthony Dahanne
Buildpacks have been around for more than 10 years! At first, they were used to detect and build an application before deploying it to certain PaaS platforms. Later, their latest generation, the Cloud Native Buildpacks (a CNCF incubating project), made it possible to create Docker (OCI) images. Are they a good alternative to the Dockerfile? What are Paketo buildpacks? Which communities support them, and how?
Come and find out in this ignite session.
First Steps with Globus Compute Multi-User Endpoints - Globus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researchers' workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Some of the challenges we have encountered were that each researcher had to set up and manage their own single-user Globus Compute endpoint and that the workloads had varying resource requirements (CPUs, memory and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges, and we will share an update on our progress here.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart... - Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on demand and are capable of applying many data reduction and data analysis operations to the large ESGF data archives, transferring only the resultant analysis (e.g., visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of the work the team is investigating ways to speedup the time to solution for many different parts of the DIII-D workflow including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Shahin Sheidaei
Games are powerful teaching tools, fostering hands-on engagement and fun. But they require careful consideration to succeed. Join me to explore factors in running and selecting games, ensuring they serve as effective teaching tools. Learn to maintain focus on learning objectives while playing, and how to measure the ROI of gaming in education. Discover strategies for pitching gaming to leadership. This session offers insights, tips, and examples for coaches, team leads, and enterprise leaders seeking to teach from simple to complex concepts.
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Cyaniclab : Software Development Agency Portfolio.pdfCyanic lab
CyanicLab, an offshore custom software development company based in Sweden,India, Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
2. www.hdfgroup.org
Climate and Forecast Conventions
• Metadata conventions for earth science data
• Included in same file as data
• Description of what the data represents
• Uses values of universal attribute
• Extension of COARDS* conventions
• Allows comparison of data from different sources
*Cooperative Ocean/Atmosphere Research Data Service
URL: http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.5/cf-conventions.pdf
3. www.hdfgroup.org
Overview
• Programming examples that add CF attributes to an HDF5 file
  • HDF5: C, FORTRAN90, Python
  • netCDF4: C, FORTRAN90
  • HDF-EOS5: C, FORTRAN77
• HDFView to add CF attributes
4. www.hdfgroup.org
Problem Set
Examples are based on a simple application
Field   Description
temp    Temperature, 180x360 array
lat     Latitude, 1-D array, size 180
lon     Longitude, 1-D array, size 360
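To make the shapes in the problem set concrete, here is a minimal, library-independent Python sketch of the three fields. The 1-degree cell-centre spacing, the constant temperature of 280.0, and the names N_LAT and N_LON are assumptions made for illustration; the slides specify only the array sizes.

```python
# Hypothetical sample data for the problem set. The grid spacing and the
# constant temperature value are assumptions, not taken from the slides.
N_LAT, N_LON = 180, 360

# 1-D coordinate arrays: one value per grid row / column.
lat = [-89.5 + i for i in range(N_LAT)]    # -89.5 ... 89.5
lon = [-179.5 + j for j in range(N_LON)]   # -179.5 ... 179.5

# 180x360 temperature field: one row per latitude, one column per longitude.
temp = [[280.0 for _ in range(N_LON)] for _ in range(N_LAT)]

print(len(lat), len(lon), len(temp), len(temp[0]))
```

In the examples that follow, temp_array plays the role of temp here, and lat and lon are the coordinate variables that the coordinates attribute refers to.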
5. www.hdfgroup.org
CF attributes added
Attribute     Description
long_name     A long descriptive name for the data.
units         The units of measurement for the data.
coordinates   A list of the names of the coordinate variables associated with the variable.
_FillValue    The value used to mark missing or undefined data.
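As a library-independent illustration of the table above, the following Python sketch collects the four attributes for the temp field in a plain dictionary. The helper name cf_attributes is hypothetical, and the values ("temperature", "kelvin", -999.0) are borrowed from the examples later in these slides. The CF conventions define coordinates as a single blank-separated string, so the sketch joins the names with a space; the later examples may store the list differently.

```python
def cf_attributes(long_name, units, coordinate_names, fill_value):
    """Return a dict of CF attributes for one data variable (hypothetical helper)."""
    return {
        "long_name": long_name,
        "units": units,
        # CF stores the coordinate variable names as one blank-separated string.
        "coordinates": " ".join(coordinate_names),
        "_FillValue": fill_value,
    }


temp_attrs = cf_attributes("temperature", "kelvin", ["lat", "lon"], -999.0)
for name, value in temp_attrs.items():
    print(f"{name} = {value!r}")
```

A dictionary like this maps directly onto the attribute calls of each interface shown later: one call per key, attached to the temp dataset or variable.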
10. www.hdfgroup.org
H5PY
• A Python interface to the HDF5 library
• Supports nearly all HDF5-C features
• Combines advantages of Python and C
• Shorter and simpler function calls
• Powerful computational abilities
• Requires NumPy
URL: http://code.google.com/p/h5py
11. www.hdfgroup.org
H5PY Example
Create an HDF5 file:
import h5py
file = h5py.File("cf_example.h5", "w")
Create/write dataset:
temp_dset = file.create_dataset("temp", data=temp_array)
Add the _FillValue:
temp_dset.attrs.create("_FillValue", data=-999.0, dtype="f")
12. www.hdfgroup.org
H5PY Example
Add the units attribute:
temp_dset.attrs["units"] = "kelvin"
Add the long_name attribute:
temp_dset.attrs["long_name"] = "temperature"
Add the coordinates attribute:
vlen = h5py.special_dtype(vlen=str)
temp_dset.attrs.create("coordinates", data=["lat", "lon"], dtype=vlen)
13. www.hdfgroup.org
netCDF-4
• Extends netCDF3
• Built on the HDF5 library
• Uses HDF5 for storage and performance
• Chunking and compression
• C and FORTRAN libraries
• Simple function calls
URL: http://www.unidata.ucar.edu/software/netcdf/docs/netcdf
14. www.hdfgroup.org
C Example
Create a netCDF4 file:
nc_create(FILE_NAME, NC_NETCDF4|NC_CLOBBER, &ncid);
Define the temperature variable:
nc_def_var(ncid, "temp", NC_FLOAT, 2, dimsa, &varid);
Add the _FillValue:
nc_def_var_fill(ncid, varid, 0, &fillvalue);
Write the temperature data:
nc_put_var_float(ncid, varid, &temp_array[0][0]);
15. www.hdfgroup.org
C Example
Add the units attribute:
nc_put_att_text(ncid, varid, "units", strlen("kelvin"), "kelvin");
Add the long_name attribute:
nc_put_att_text(ncid, varid, "long_name", strlen("temperature"), "temperature");
Add the coordinates attribute:
const char *coorlist[2] = {"lat", "lon"};
nc_put_att_string(ncid, varid, "coordinates", 2, coorlist);
16. www.hdfgroup.org
FORTRAN90 Example
Create the netCDF4 file:
nf90_create(path=filename, cmode=IOR(NF90_CLOBBER, NF90_HDF5), ncid=ncid)
Define the temperature variable:
nf90_def_var(ncid, "temp", NF90_FLOAT, (/180,360/), varid)
Add the _FillValue:
nf90_def_var_fill(ncid, varid, 0, -999)
Write the temperature data:
nf90_put_var(ncid, varid, temp_data)
17. www.hdfgroup.org
FORTRAN90 Example
Add the units attribute:
nf90_put_att(ncid, varid, "units", "kelvin")
Add the long_name attribute:
nf90_put_att(ncid, varid, "long_name", "temperature")
Add the coordinates attribute (a single call; a second nf90_put_att with the same name would overwrite the first):
nf90_put_att(ncid, varid, "coordinates", "lat lon")
18. www.hdfgroup.org
HDF-EOS5
• Built on HDF5
  • extends HDF5
  • uses HDF5 library calls as a foundation
• Associates geolocation data with scientific data
• Additional definitions
  • points, swaths, grids
URL: http://newsroom.gsfc.nasa.gov/sdptoolkit/docs/HDF-EOS_UG.pdf
19. www.hdfgroup.org
C Example
Create a swath:
swid = HE5_SWcreate(file, "Swath 1");
Define dimensions:
HE5_SWdefdim(swid, "GeoXtrack", 180);
HE5_SWdefdim(swid, "GeoTrack", 360);
Define temperature data field:
HE5_SWdefdatafield(swid, "temp", "GeoTrack,GeoXtrack", NULL, H5T_NATIVE_FLOAT, 0);
Set _FillValue:
HE5_SWsetfillvalue(swid, "temp", H5T_NATIVE_FLOAT, &value);
Write the temperature data:
HE5_SWwritefield(swid, "temp", NULL, NULL, NULL, temp_array);
28. www.hdfgroup.org
Acknowledgements
This work was supported by cooperative agreement number NNX08AO77A from the National Aeronautics and Space Administration (NASA).
Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author[s] and do not necessarily reflect the views of the National Aeronautics and Space Administration.
30. www.hdfgroup.org
Dimension Scales
• API included with HDF5
• HDF5 datasets with additional metadata
  • shows relationship to a dataset
  • independent of a dataset
URL: http://www.hdfgroup.org/HDF5/doc/HL/RM_H5DS.html
31. www.hdfgroup.org
Programming Example
Uses same code as HDF5 example
Declare datasets as a dimension scale:
hid_t dataset[3];
// declare latitude and longitude datasets as dimension scales
H5DSset_scale(dataset[1], "lat");
H5DSset_scale(dataset[2], "lon");
Attach the dimension scale:
// attach latitude to dimension 0 of the temperature dataset
H5DSattach_scale(dataset[0], dataset[1], 0);
// attach longitude to dimension 1 of the temperature dataset
H5DSattach_scale(dataset[0], dataset[2], 1);