John Readey presented on HDF5 in the cloud using HDFCloud. HDF5 can provide a cost-effective cloud infrastructure by paying for what is used rather than what may be needed. HDFCloud uses an HDF5 server to enable accessing HDF5 data through a REST API, allowing users to access large datasets without downloading entire files. It maps HDF5 objects to cloud object storage for scalable performance and uses Docker containers for elastic scaling.
This slide will provide an overview of current functionality, techniques, and tips for visualization and query of HDF and netCDF data in ArcGIS, as well as future plans. Hierarchical Data Format (HDF) and netCDF (network Common Data Form) are two widely used data formats for storing and manipulating scientific data. The NetCDF format also supports temporal data by using multidimensional arrays. The basic structure of data in this format and how to work with it will be covered in the context of standardized data structures and conventions. This slide will demonstrate the tools and techniques for ingesting HDF and netCDF data efficiently in ArcGIS, as well as some common workflows to employ the visualization capabilities of ArcGIS for effective animation and analysis of your data.
This slide will provide an overview of current functionality, techniques, and tips for visualization and query of HDF and netCDF data in ArcGIS, as well as future plans. Hierarchical Data Format (HDF) and netCDF (network Common Data Form) are two widely used data formats for storing and manipulating scientific data. The NetCDF format also supports temporal data by using multidimensional arrays. The basic structure of data in this format and how to work with it will be covered in the context of standardized data structures and conventions. This slide will demonstrate the tools and techniques for ingesting HDF and netCDF data efficiently in ArcGIS, as well as some common workflows to employ the visualization capabilities of ArcGIS for effective animation and analysis of your data.
We Provide Hadoop training institute in Hyderabad and Bangalore with corporate training by 12+ Experience faculty.
Real-time industry experts from MNCs
Resume Preparation by expert Professionals
Lab exercises
Interview Preparation
Experts advice
Gestione gerarchica dei dati con SUSE Enterprise Storage e HPE DMFSUSE Italy
In questa sessione HPE e SUSE illustrano con casi reali come HPE Data Management Framework e SUSE Enterprise Storage permettano di risolvere i problemi di gestione della crescita esponenziale dei dati realizzando un’architettura software-defined flessibile, scalabile ed economica. (Alberto Galli, HPE Italia e SUSE)
Software Defined Analytics with File and Object Access Plus Geographically Di...Trishali Nayar
Introduction to Spectrum Scale Active File Management (AFM)
and its use cases. Spectrum Scale Protocols - Unified File & Object Access (UFO) Feature Details
AFM + Object : Unique Wan Caching for Object Store
Big Data Architecture Workshop - Vahid Amiridatastack
Big Data Architecture Workshop
This slide is about big data tools, thecnologies and layers that can be used in enterprise solutions.
TopHPC Conference
2019
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsDataWorks Summit
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics - Apache Spark’s in memory capabilities catapulted it as the premier processing framework for Hadoop. Apache Ignite and Alluxio, both high-performance, integrated and distributed in-memory platform, takes Apache Spark to the next level by providing an even more powerful, faster and scalable platform to the most demanding data processing and analytic environments.
Speaker
Irfan Elahi, Consultant, Deloitte
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Elevating Tactical DDD Patterns Through Object Calisthenics
HDFCloud Workshop: HDF5 in the Cloud
1. HDF5 in the Cloud
HDFCloud
Workshop
John Readey
The HDF Group
jreadey@hdfgroup.org
2. My Background
Sr. Architect at The HDF Group
Started in 2014
Have been exploring remote interfaces to HDF
Previously: Dev Manager at Amazon/AWS
More previously: Used HDF5 while a developer
at Intel
2
3. What is HDF5?
Depends on your point of view:
• a C-API
• a File Format
• a data model
3
Think of HDF5 as a file system
within a file.
Store arrays with chunking and compression.
Add NumPy style data selection.
4. • Native support for multidimensional
data
• Data and metadata in one place =>
streamlines data lifecycle & pipelines
• Portable, no vendor lock-in
• Maintains logical view while adapting to
storage context
• In-memory, over-the-wire, on-disk, parallel
FS, object store
• Pluggable filter pipeline for compression,
checksum, encryption, etc.
• High-performance I/O
• Large ecosystem (700+ Github projects)
Why is this concept so different + useful? 4
5. Why HDF in the Cloud
• It can provide a cost-effective infrastructure
• Pay for what you use vs pay for what you may need
• Lower overhead: no hard ware setup/network configuration, etc.
• Potentially can benefit from cloud-based technologies:
• Elastic compute – scale compute resources dynamically
• Object based storage – low cost/built in redundancy
• Community platform (potentially)
• Enables interested users to bring their applications to the data
• Share data among many users
6. Cost Factors
• Most public clouds bill per usage
• For HDF in the cloud, there are three big cost drivers:
• Storage – What storage system will be used? (see next slide)
• Compute – Elastic compute on demand better than fixed cost
• Scale compute to usage not size of data
• Egress charges
• Ingress is free but getting data out will cost you ($0.09/GB)
• Enabling users to get just the data they need will tend to lower egress charges
6
7. Storage Costs
How much will it costs to store 1PB for one year on AWS?
Answer depends on the technology and tradeoffs you are willing to accept…
Technology What it is Cost for 1PB/1yr Fine Print
Glacier Offline (tape) Storage $125K - 4 hour latency for first read
- Additional costs for restore
S3 Infrequent Access Nearline Object Storage $157K - $0.01/GB data retrieval charge
- $10K to read entire PB!
S3 Online Object Storage $289K - Request pricing $0.01 per 10K req
- Transfer out charge $0.01/GB
EBS Attachable Disk Storage $629K - Extra charges for guaranteed IOPS
- Need backups
EFS Shared Network (NFS) $3,774K - Not all NFSv4.1 features supported
- E.g. File Locking
DynamoDB NoSQL Database $3,145K - Extra charge for IOPS
8. Introducing Highly Scalable Data Service
(HSDS)
• RESTful interface to HDF5 using object storage
• Storage using AWS S3
• Built in redundancy
• Cost effective
• Scalable throughput
• Runs as a cluster of Docker containers
• Elastically scale compute with usage
• Feature compatible with HDF5 library
• Implemented in Python using asyncio
• Task oriented parallelism
8
9. HSDS Features
• Clients can interact with service using REST API
• SDKs provide language specific interface (e.g. h5pyd for Python)
• Can read/write just the data they need (as opposed to transferring entire
files)
• No limit to the amount of data that can be stored by the service
• Multiple clients can read/write to same data source
• Scalable performance:
• Can cache recently accessed data in RAM
• Can parallelize requests across multiple nodes
• More nodes -> better performance
9
10. What makes it RESTful?
• Client-server model
• Stateless – (no client context stored on server)
• Cacheable – clients can cache responses
• Resources identified by URIs (datasets, groups, attributes, etc)
• Standard HTTP methods and behaviors:
Method Safe Idempotent Description
GET Y Y Get a description of a resource
POST N N Create a new resource
PUT N Y Create a new named resource
DELETE N Y Delete a resource
12. Object Storage Challenges for HDF
• Not POSIX!
• High latency (>0.1s) per request
• Not write/read consistent
• High throughput needs some tricks
• (use many async requests)
• Request charges can add up (public cloud)
For HDF5, using the HDF5 library
directly on an object storage
system is a non-starter. Will
need an alternative solution…
13. HSDS S3 Schema
Big Idea: Map individual HDF5
objects (datasets, groups,
chunks) as Object Storage
Objects• Limit maximum storage object size
• Support parallelism for read/write
• Only data that is modified needs to be updated
• (Potentially) Multiple clients can be
reading/updating the same “file”
Legend:
• Dataset is partitioned into
chunks
• Each chunk stored as an S3
object
• Dataset meta data (type,
shape, attributes, etc.) stored in
a separate object (as JSON
text)
How to store HDF5 content in S3?
13
Each chunk (heavy outlines) get
persisted as a separate object
15. Architecture for HSDS
Legend:
• Client: Any user of the service
• Load balancer – distributes requests to Service nodes
• Service Nodes – processes requests from clients (with help from Data
Nodes)
• Data Nodes – responsible for partition of Object Store
• Object Store: Base storage service (e.g. AWS S3)
15
16. Implementing HSDS with asyncio
• HSDS relies heavily on Python’s new asyncio module
• Concurrency based on tasks (rather than say multithreading or
multiprocessing)
• Task switching occurs when process would otherwise wait on I/O
async def my_func():
a_regular_function_call()
await a_blocking_call()
• Control will switch to another task when await is
encountered
• Result is the app can do other useful work vs. blocking
• Supporting 1000’s of concurrent tasks within a process
16
17. Parallelizing data access with asyncio
• SN node invoking parallel requests on DN nodes
tasks = []
for chunk_id in my_chunk_list:
task = asyncio.ensure_future(read_chunk_query(chunk_id))
tasks.append(task)
await asyncio.gather(*tasks, loop=loop)
• Read_chunk_query makes a http request to a specific DN
node
• Set of DN nodes can be reading from S3, decompression
and selecting requested data in parallel
• Asyncio.gather waits for all tasks to complete before
continuing
• Meanwhile, new requests can be processed by SN node
17
18. Python and Docker
• Docker makes developing clustered applications sooo much easier
• Can run dozens of containers on a moderate laptop
• Containers communicate with each other just like on a physical network
• Use docker stats to check up cpu, net i/o, disk i/o usage per container
• Can try out different constraints for amount of memory, disk per container
• Same code “just works” on an actual cluster
• ”scale up” by launching more containers on production hardware
• AWS ECS enables running containers in a machine agnostic way
• Using docker does require a reversion to the edit/build/run paradigm
• The build step is now the creation of the docker image
• Run is launching the container(s)
18
19. Python package MVPs
• numpy – python arrays
• Used heavily in server and client stacks
• Great performance for common array operations
• Simplifies much of the logic needed for hyperslab selection
• aiohttp – async http client/server
• Use of asyncio requires async enabled packages
• Aiohttp is used in HSDS as both web server and client
• Aiobotocore – async aws s3 client
• Enables async read/write to S3
• H5py – template for h5pyd package
19
20. H5pyd – Python client for HDF Server
• H5py is a popular Python package that provide a Pythonic interface to the
HDF5 library
• H5pyd (for h5py distributed) provides a h5py compatible h5py for accessing
the server
• Pure Python – uses requests package to make http calls to server
• Compatible with h5serv (the reference implementation of the HDF REST API)
• Include several extensions to h5py:
• List content in folders
• Get/Set ACLs (access control list)
• Pytables-like query interface
20
21. HDF REST VOL
• The HDF5 VOL architecture is a plugin layer for HDF5
• User API stays the same, but different backends can be
implemented
• REST VOL substitutes REST API requests for file i/o actions
• C/Fortran applications should be able to run as is
• Still in development – Beta expected this year
2
1
22. HSDS CLI (Command Line Interface)
• Accessing HDF via a service means can’t utilize usual shell commands:
ls, rm, chmod, etc.
• Command line tools are a set of simple apps to use instead:
• hsinfo: display server version, connect info
• hsls: list content of folder or file
• hstouch: create folder or file
• hsdel: delete a file
• hsload: upload an HDF5 file
• hsget: download content from server to an HDF5 file
• hsacl: create/list/update ACLs (Access Control Lists)
• Implemented in Python & uses h5pyd
22
23. Future Work 2
3
• Work planned for the next year
• Compression
• Variable length datatypes
• NetCDF support
• Auto Scaling
• Scalability and performance testing
24. Demo Time!
NREL (National Renewable Energy Laboratory) is using HSDS
to make 7TB of wind simulation data accessible to the public.
Datasets are three-dimensional covering the continental US:
• Time (one slice/hour)
• Lon (~2k resolution)
• Lat (~2k resolution)
Initial data covers one year (8760 slices), but will be soon be
extended to 5 years (35 TBs).
Rather than downloading TB’s of files, interested users can
now use the HSDS client libraries to explore the datasets.
25. To Find out More:
• H5serv: https://github.com/HDFGroup/h5serv
• Documentation: http://h5serv.readthedocs.io/
• H5pyd: https://github.com/HDFGroup/h5pyd
• RESTful HDF5 White Paper:
https://www.hdfgroup.org/pubs/papers/RESTful_HDF5.pdf
• Blog articles:
• https://hdfgroup.org/wp/2015/04/hdf5-for-the-web-hdf-server/
• https://hdfgroup.org/wp/2015/12/serve-protect-web-security-hdf5/
• https://www.hdfgroup.org/2017/04/the-gfed-analysis-tool-an-hdf-
server-implementation/
25
26. 26
HDF5 Community Support
• Documentation - https://support.hdfgroup.org/documentation/
• Tutorials, FAQs, examples
• HDF-Forum – mailing list and archive
• Great for specific questions
• Helpdesk Email – help@hdfgroup.org
• Issues with software and documentation
https://support.hdfgroup.org/services/community_support.html
2
6