This presentation introduces and illustrates the use cases, benefits, and problems of Kerberos deployment on Hadoop, and shows how Token support and TokenPreauth can help solve those problems. It also briefly introduces the Haox project, a Java client library for Kerberos.
More than 87% of websites are SSL-encrypted and organizations can have thousands of certificates in production. A more flexible approach to managing certificates is needed. In this webinar we cover how to load certificates dynamically and additional newly released features. https://attendee.gotowebinar.com/register/521167809778215683
The Blockchain for the Internet of Things (IoT) has been said to "change the future." Despite a myriad of studies on the blockchain IoT, few studies have investigated how an IoT blockchain system develops with open source technologies, open standards, web technologies, and a p2p network. In this presentation, Jollen will share the Flowchain case study, an open source IoT blockchain project in Node.js; he will discuss the practice, the technical challenges, and the engineering experiences. Furthermore, to provide the real-time data transaction capabilities for current IoT requirements, he will use the "virtual block" idea to address such technical challenges.
Securing Your Deployment with MongoDB and Red Hat's Identity Management in Re...MongoDB
MongoDB and Red Hat have collaborated to deliver an integrated solution for securing MongoDB deployments. Red Hat's proven security infrastructure adds extra protection to MongoDB with standards-based identity management featuring centralization of user, password, and certificate information. MongoDB and Red Hat team members present what you need to know to secure your systems, including an overview of Red Hat's Identity Management in Red Hat Enterprise Linux and MongoDB-RHEL security architecture.
Troubleshooting and Best Practices with WSO2 Enterprise IntegratorWSO2
This slide deck discusses how to troubleshoot an issue in WSO2 Enterprise Integrator and follow best practices in order to optimize output and avoid failure.
APIs: Intelligent Routing, Security, & ManagementNGINX, Inc.
Kevin Jones, Global Consulting Engineer from NGINX San Francisco, presents how to accelerate your journey to microservices with a modernised full API lifecycle management solution. Learn how to cut costs, improve performance, and reduce load on API endpoints. This presentation covers:
All elements of full lifecycle management including API creation, securing your backend infrastructure, managing traffic, and ongoing monitoring.
Innovative architecture that doesn't involve additional microgateways to process API calls
Differentiated pricing model that does not penalize API adoption
On-Demand Link: https://www.nginx.com/resources/webinars/analyzing-nginx-logs-datadog/
About the Webinar
Datadog is a SaaS-based monitoring and analytics platform for cloud-scale organizations. The company is an industry leader in monitoring and observability. With 350+ vendor-supported integrations, Datadog seamlessly correlates metrics, traces, and logs across the full DevOps stack.
With Datadog’s Log Management solution, you can cost-effectively collect, analyze, and archive all your logs with an easy-to-use, intuitive interface.
Attend this webinar to learn how to analyze NGINX logs using Datadog to achieve business outcomes including SEO optimization, improved website performance, and detection of DDoS attacks.
Scale your application to new heights with NGINX and AWSNGINX, Inc.
On-demand Link:
https://www.nginx.com/resources/webinars/scale-application-new-heights-nginx-aws/
In this webinar we will discuss how AWS and NGINX can complement each other to create highly scalable, high-performance, and secure web applications. We will cover the different ways that NGINX can integrate with AWS services such as NLB, Route53, and PrivateLink to add new layers of security and functionality to your high-traffic website, streaming service, or IoT system.
Hadoop Security Features that make your risk officer happyAnurag Shrivastava
This talk was delivered by Anurag Shrivastava at Hadoop Summit 2015 Brussels. It covers how Apache Ranger, Apache Sentry, Apache Knox and Project Rhino can help you pass IT risk assessment in Hadoop projects.
[DevDay 2016] OpenStack and approaches for new users - Speaker: Chi Le – Head...DevDay.org
OpenStack is an open source cloud computing platform providing infrastructure as a service (IaaS). The presentation will summarize what OpenStack contains, supported by a practical demo and simple but effective guidelines for accessing OpenStack.
———
Speaker: Chi Le – Head of Infrastructure System at Da Nang ICT Infrastructure Development Center
Katpro Technology, an IT solutions company, announced that it has been selected by Microsoft Corporation as a Windows Azure Circle Partner. The partnership will give Katpro the ability to service customers' needs in the area of cloud, with training and support material provided by Microsoft.
Nowadays a typical Hadoop deployment consists of core Hadoop components (HDFS and MapReduce), several other components such as HBase, HttpFS, Oozie, Pig, Hive, Sqoop, and Flume, plus programmatic integration from external systems and applications. This effectively creates a complex and heterogeneous distributed environment that runs across several machines whose components use different protocols to communicate with each other, all of which is used concurrently by several users and applications. When a Hadoop deployment and its ecosystem are used to process sensitive data (such as financial records, payment transactions, and healthcare records), several security requirements arise. These security requirements may be dictated by internal policies and/or government regulations. They may require strong authentication, selective authorization to access data/resources, and data confidentiality. This session covers in detail how different components in the Hadoop ecosystem and external applications can interact with each other in a secure manner, providing authentication, authorization, and confidentiality when accessing services and transferring data to/from/between services. The session will cover topics like Kerberos authentication, Web UI authentication, File System permissions, delegation tokens, Access Control Lists, ProxyUser impersonation, and network encryption.
Hadoop Security in Big-Data-as-a-Service Deployments - Presented at Hadoop Su...Abhiraj Butala
The talk covers limitations of current Hadoop ecosystem components in handling security (Authentication, Authorization, Auditing) in multi-tenant, multi-application environments. It then proposes how we can use Apache Ranger and HDFS super-user connections to enforce correct HDFS authorization policies and achieve the required auditing.
Deploying Enterprise-grade Security for HadoopCloudera, Inc.
Deploying enterprise-grade security for Hadoop, or six security problems with Apache Hive. In this talk we will discuss the security problems with Hive and then secure Hive with Apache Sentry. Additional topics will include Hadoop security and Role-Based Access Control (RBAC).
We provide a summary review of Globus features targeted at those new to Globus. We demonstrate how to transfer and share data, and install a Globus Connect Personal endpoint on your laptop.
GlobusWorld 2021 Tutorial: Introduction to GlobusGlobus
An introduction to the core features of the Globus data management service. This tutorial was presented at the GlobusWorld 2021 conference in Chicago, IL by Greg Nawrocki.
Introduces the Globus software-as-a-service for file transfer and data sharing. Includes step-by-step instructions for creating a Globus account, transferring a file, and setting up a Globus endpoint on your laptop.
"What's New With Globus" Webinar: Spring 2018Globus
In this presentation from June 26, 2018, Globus co-founder Steve Tuecke discussed Globus Connect Server 5.1 with HTTPS file access; plans for new premium storage connectors; upcoming publication services including the new Globus Search and Identifiers services; the new Globus Web App, SSH with Globus Auth, and more.
Simplified Research Data Management with the Globus PlatformGlobus
Overview of the Globus research data management platform, as presented at the Fall 2018 Membership Meeting of the Coalition for Networked Information (CNI), held in Washington, D.C., December 10-11, 2018.
Automating Research Data Management at Scale with GlobusGlobus
Research computing facilities, such as the national supercomputing centers, and shared instruments, such as cryo electron microscopes and advanced light sources, are generating large volumes of data daily. These growing data volumes make it challenging for researchers to perform what should be mundane tasks: move data reliably, describe data for subsequent discovery, and make data accessible to geographically distributed collaborators. Most employ some set of ad hoc methods, which are not scalable, and it is clear that some level of automation is required for these tasks.
Globus is an established service from the University of Chicago that is widely used for managing research data in national laboratories, campus computing centers, and HPC facilities. While its intuitive web app addresses simple file transfer and sharing scenarios, automation at scale requires integrating Globus data management platform services into custom science gateways, data portals, and other web applications in service of research. Such applications should enable automated ingest of data from diverse sources, launching of analysis runs on diverse computing resources, extraction and addition of metadata for creating search indexes, assignment of persistent identifiers, faceted search for rapid data discovery, and point-and-click downloading of datasets by authorized users, all protected by an authentication and authorization substrate that allows the implementation of flexible data access policies for both metadata and data alike.
We describe current and emerging Globus services that facilitate these automated data flows while ensuring a streamlined user experience. We also demonstrate Petreldata.net, a data management portal and gateway to multiple computing resources, that supports large-scale research at the Advanced Photon Source.
We presented these slides at the NIH Data Commons kickoff meeting, showing some of the technologies that we propose to integrate in our "full stack" pilot.
Globus: Research Data Management as Service and Platform - pearc17Mary Bass
Scientists have embraced the use of specialized cloud-hosted services to perform data management operations. Globus offers a suite of data and user management capabilities to the community, encompassing data transfer and sharing, user identity and authorization, and data publication. Globus capabilities are accessible via both a web browser and REST APIs. Web access allows Globus to address the needs of research labs through a software-as-a-service model; the newer REST APIs address the needs of developers of research services, who can now use Globus as a platform, outsourcing complex user and data management tasks to Globus cloud-hosted services. Here we review Globus capabilities and outline how it is being applied as a platform for scientific services. Presentation by Steve Tuecke from The University of Chicago. Steve is Globus Founder and Project Lead.
Globus Compute with IRI Workflows - GlobusWorld 2024Globus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of this work, the team is investigating ways to speed up the time to solution for many different parts of the DIII-D workflow, including how they run jobs on HPC systems. One of these routes is looking at Globus Compute as a way to replace the current method for managing tasks, and we describe a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Climate Science Flows: Enabling Petabyte-Scale Climate Analysis with the Eart...Globus
The Earth System Grid Federation (ESGF) is a global network of data servers that archives and distributes the planet’s largest collection of Earth system model output for thousands of climate and environmental scientists worldwide. Many of these petabyte-scale data archives are located in proximity to large high-performance computing (HPC) or cloud computing resources, but the primary workflow for data users consists of transferring data and applying computations on a different system. As a part of the ESGF 2.0 US project (funded by the United States Department of Energy Office of Science), we developed pre-defined data workflows, which can be run on demand and are capable of applying many data reduction and data analysis operations to the large ESGF data archives, transferring only the resultant analysis (e.g., visualizations, smaller data files). In this talk, we will showcase a few of these workflows, highlighting how Globus Flows can be used for petabyte-scale climate analysis.
We describe the deployment and use of Globus Compute for remote computation. This content is aimed at researchers who wish to compute on remote resources using a unified programming interface, as well as system administrators who will deploy and operate Globus Compute services on their research computing infrastructure.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
First Steps with Globus Compute Multi-User EndpointsGlobus
In this presentation we will share our experiences around getting started with the Globus Compute multi-user endpoint. Working with the Pharmacology group at the University of Auckland, we have previously written an application using Globus Compute that can offload computationally expensive steps in the researchers' workflows, which they wish to manage from their familiar Windows environments, onto the NeSI (New Zealand eScience Infrastructure) cluster. Among the challenges we encountered were that each researcher had to set up and manage their own single-user Globus Compute endpoint and that the workloads had varying resource requirements (CPUs, memory, and wall time) between different runs. We hope that the multi-user endpoint will help to address these challenges, and we share an update on our progress here.
Enhancing Research Orchestration Capabilities at ORNL.pdfGlobus
Cross-facility research orchestration comes with ever-changing constraints regarding the availability and suitability of various compute and data resources. In short, a flexible data and processing fabric is needed to enable the dynamic redirection of data and compute tasks throughout the lifecycle of an experiment. In this talk, we illustrate how we easily leveraged Globus services to instrument the ACE research testbed at the Oak Ridge Leadership Computing Facility with flexible data and task orchestration capabilities.
Understanding Globus Data Transfers with NetSageGlobus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks worldwide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several example questions that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to that of the Globus users?
How to Position Your Globus Data Portal for Success: Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, the Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks, each with its own strengths and foci, have evolved and been taken up by a larger community, such as the Globus Data Portal, HUBzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and of meeting the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that any gateway needs a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation details ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy-driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivery, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership is the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed, and report relevant project progress.
Developing Distributed High-performance Computing Capabilities of an Open Sci...Globus
COVID-19 had an unprecedented impact on scientific collaboration. The pandemic and the broad response to it from the scientific community have forged new relationships among public health practitioners, mathematical modelers, and scientific computing specialists, while revealing critical gaps in exploiting advanced computing systems to support urgent decision making. Informed by our team’s work in applying high-performance computing in support of public health decision makers during the COVID-19 pandemic, we present how Globus technologies are enabling the development of an open science platform for robust epidemic analysis, with the goal of collaborative, secure, distributed, on-demand, and fast time-to-solution analyses to support public health.
The Department of Energy's Integrated Research Infrastructure (IRI)Globus
We will provide an overview of DOE’s IRI initiative as it moves into early implementation, what drives the IRI vision, and the role of DOE in the larger national research ecosystem.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster, who review the updates to the Globus Platform and Service and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Enhancing Performance with Globus and the Science DMZGlobus
ESnet has led the way in helping national facilities—and many other institutions in the research community—configure Science DMZs and troubleshoot network issues to maximize data transfer performance. In this talk we will present a summary of approaches and tips for getting the most out of your network infrastructure using Globus Connect Server.
Extending Globus into a Site-wide Automated Data Infrastructure.pdfGlobus
The Rosalind Franklin Institute hosts a variety of scientific instruments, which allow us to capture a multifaceted and multilevel view of biological systems, generating around 70 terabytes of data a month. Distributed solutions, such as Globus and Ceph, facilitate storage, access, and transfer of large amounts of data. However, we still must deal with the heterogeneity of the file formats and directory structure at acquisition, which is optimised for fast recording rather than for efficient storage and processing. Our data infrastructure includes local storage at the instruments and workstations, distributed object stores with POSIX and S3 access, remote storage on HPCs, and tape backup. This can pose a challenge in ensuring fast, secure, and efficient data transfer. Globus allows us to handle this heterogeneity, while its Python SDK allows us to automate our data infrastructure using Globus microservices integrated with our data access models. Our data management workflows are becoming increasingly complex and heterogeneous, including desktop PCs, virtual machines, and offsite HPCs, as well as several open-source software tools with different computing and data structure requirements. This complexity demands that data be annotated with enough details about the experiments and the analysis to ensure efficient and reproducible workflows. This talk explores how we extend Globus into different parts of our data lifecycle to create a secure, scalable, and high-performing automated data infrastructure that can provide FAIR[1,2] data for all our science.
1. https://doi.org/10.1038/sdata.2016.18
2. https://www.go-fair.org/fair-principles
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Globus Compute with Integrated Research Infrastructure (IRI) workflowsGlobus
As part of the DOE Integrated Research Infrastructure (IRI) program, NERSC at Lawrence Berkeley National Lab and ALCF at Argonne National Lab are working closely with General Atomics on accelerating the computing requirements of the DIII-D experiment. As part of this work, the team is investigating ways to speed up the time to solution for many different parts of the DIII-D workflow, including how they run jobs on HPC systems. One of these routes is Globus Compute as a way to replace the current method for managing tasks. I will give a brief proof of concept showing how Globus Compute could help to schedule jobs and be a tool to connect compute at different facilities.
Reactive Documents and Computational Pipelines - Bridging the GapGlobus
As scientific discovery and experimentation become increasingly reliant on computational methods, the static nature of traditional publications renders them progressively fragmented and unreproducible. How can workflow automation tools, such as Globus, be leveraged to address these issues and potentially create a new, higher-value form of publication? LivePublication leverages Globus’s custom Action Provider integrations and Compute nodes to capture semantic and provenance information during distributed flow executions. This information is then embedded within an RO-crate and interfaced with a programmatic document, creating a seamless pipeline from instruments, to computation, to publication.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfPeter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for technology and making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
2. Globus by the numbers
• 6,902 active shared endpoints
• 70+ petabyte movers
• 675 PB moved
• 23,450 active personal endpoints
• 93 billion files processed
• 1,868 active server endpoints
• 110+ subscribers
• 2.9 PB largest transfer to date
• 99.9% availability
• 710 identity providers
• 1,923 most shared endpoints at a single institution
• 111,000 registered users
3. Manage Protected Data
Higher assurance levels for HIPAA and other regulated data
• Transfer and share…
– PHI (Protected Health Information)
– PII (Personally Identifiable Information)
– Controlled Unclassified Information
• Security controls comply with…
– NIST 800-53 Low
– Superset of NIST 800-171 Low
• Optional BAA with UChicago
4. Product enhancements for high assurance
• Additional authentication assurance
– Authenticate with specific identity…
– …within specific time
– …within specific session
• Application instance isolation
– Per application
– Per session (~browser session)
• Encryption of user data in transit and Globus data at rest
• Detailed audit logs: Globus service + your DTNs
5. Product enhancements for high assurance
• Additional security requirements enforced on management of all high assurance resources
– Data access, and any interaction that can lead to data access
– Examples: Groups, Management Console
• Enhanced user interfaces (web app and CLI) for seamless management of protected data
6. Services enabled
• Globus Services: Auth, Transfer & Sharing, Groups
• Globus Connect Server v5.2 and above
• Globus Connect Personal v3.x
• Web app (app.globus.org)
• Globus Command Line Interface (CLI)
• Connectors: POSIX, Google Drive, AWS S3, CEPH
7. Operational enhancements for high assurance
• Intrusion detection and prevention
• Encryption
• Enhanced logging
• Secure remote access, access control, and secure practices for laptops
• Uniform configuration management and change control
• AWS best practices for secure environment: VPCs, security groups, IAM best practices
8. New subscription levels
• High Assurance
– 33% uplift on Standard subscription and on premium connectors
• BAA
– 50% uplift on Standard subscription and on premium connectors
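As a worked illustration of the uplift arithmetic, here is a minimal sketch. The base price of 1000 is a made-up placeholder, not a Globus price, and the names `UPLIFT` and `price_with_uplift` are hypothetical. It assumes each uplift is applied to the Standard price independently, as the slide wording suggests.

```python
from fractions import Fraction

# Uplift percentages as stated on the slide, relative to the Standard price.
UPLIFT = {
    "high_assurance": Fraction(33, 100),
    "baa": Fraction(50, 100),
}


def price_with_uplift(standard_price, level):
    """Return the Standard price plus the uplift for the given level.

    The same calculation applies to a premium connector price.
    """
    return standard_price * (1 + UPLIFT[level])


if __name__ == "__main__":
    base = 1000  # hypothetical Standard price, in arbitrary units
    for level in UPLIFT:
        print(level, float(price_with_uplift(base, level)))
```

Exact fractions are used so that the percentages do not pick up floating-point rounding.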
10. Web app enhancements
• Accessibility
– Target WCAG 2.0 AA compliance
• Responsiveness and touch
• Works with new connectors
collections.globus.org
11. Web app enhancements
• Customizable interface
– Single vs. dual panel
– Compact file listing display
– Columns displayed
• Continue incorporating user feedback
12. CLI enhancements
• Support for use with high assurance collections
• '--format UNIX': output suitable for line-oriented processing with typical Unix tools
• Added 'globus rm' command
• 'globus whoami --linked-identities': shows all linked identities
• '--timeout-exit-code': overrides the default exit code for commands which wait on tasks
• Enhancements to SDK as needed
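The flags listed above lend themselves to scripting. The following is a minimal sketch in Python, assuming the `globus` CLI is installed and a login session exists. The helper names `globus_command` and `run_globus` are hypothetical. Only flags named on this slide are used, and `--format UNIX` is spelled as the slide spells it.

```python
import subprocess


def globus_command(*args, unix_format=False):
    """Build an argument list for the `globus` CLI (hypothetical helper)."""
    cmd = ["globus", *args]
    if unix_format:
        # Request line-oriented output suitable for typical Unix tools.
        cmd += ["--format", "UNIX"]
    return cmd


def run_globus(*args, unix_format=False):
    """Run the command and return its stdout as a list of lines.

    Assumes the `globus` executable is on PATH and the user is logged in.
    """
    result = subprocess.run(
        globus_command(*args, unix_format=unix_format),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Print the command that would list all linked identities.
    print(" ".join(globus_command("whoami", "--linked-identities", unix_format=True)))
```

Keeping command construction separate from execution lets the argument list be inspected or logged before anything is run.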
13. Globus for Box
• Extends the value of your Box deployment
• Unifies access to cloud and on-premise storage
• Transitions protected data (HIPAA-regulated, CUI) seamlessly between Box and other storage systems
16. Make Box part of your research storage ecosystem
globus.org/connectors/box
docs.globus.org/premium-storage-connectors/box
17. Connector updates
• Enhanced user experience for credential handling for several connectors (GCSv5)
• AWS S3
– Automated multi-region support
• Google Drive
– Enhancement to retry handling for large transfers
• HPSS
– Support added for HPSS 7.5 (7.3 to 7.5 supported)
– Improved asynchronous staging from tape
18. S3 compatible systems
• Initial customer deployments
• Validation, testing and vendor engagement planned
• Additional systems driven by customer demand
PLEASE CONTACT US BEFORE DEPLOYING ANY OF THESE!
21. Globus Connect Server v5.3
• Subsumes GCS version 5.0, 5.1, 5.2 (upgrade now)
• Standard and high assurance guest collections (sharing)
• High assurance mapped collections
• Connectors: POSIX, AWS S3, CEPH, Google Drive, Box
• High assurance, standard gateways on same endpoint
• Data access protocols: GridFTP and HTTPS
22. HTTPS access to Globus endpoints
• Browser based up/download
• Put your (research) storage “on the web”
• Enforce same security policies
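HTTPS access also means a script can fetch a file with an ordinary HTTP client. The sketch below only builds the request and sends nothing. It assumes the collection exposes an HTTPS base URL and accepts an OAuth bearer token in the `Authorization` header. The base URL, path, token value, and the helper name `build_download_request` are all hypothetical placeholders.

```python
import urllib.parse
import urllib.request


def build_download_request(base_url, path, access_token):
    """Build a GET request for a file on an HTTPS-enabled collection.

    base_url and access_token are placeholders supplied by the caller.
    """
    # Percent-encode the path but keep the slashes between components.
    quoted_path = urllib.parse.quote(path.lstrip("/"))
    url = base_url.rstrip("/") + "/" + quoted_path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


if __name__ == "__main__":
    request = build_download_request(
        "https://collection.example.org", "/data/run 1/image.tif", "TOKEN"
    )
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the download, subject to the same access policies that apply to transfers.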
23. Globus Connect Server v5 Milestones
v5.0: Google Drive
v5.1: POSIX guest collections, HTTPS
v5.x: v4 feature parity+
v5.3
• Multi DTN support
• Additional storage systems
• Endpoint specific identity providers
• …
Other features
v5.2: High assurance
v5.4: …
24. Recent Transfer enhancements
• Verify transfer using client provided checksums
– User provided checksum used rather than source checksum for verification
• Improvements for scaling transfer service
– Multiple nodes for transfer service for higher availability and reliability
– Allows for code updates with no downtime
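The idea behind client-provided checksums can be sketched as follows. This illustrates the general technique and is not the Globus Transfer implementation. The choice of SHA-256 and the function names `file_checksum` and `matches_client_checksum` are assumptions made for the example.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Compute a hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_client_checksum(path, expected_hex, algorithm="sha256"):
    """Compare the received file against a checksum supplied by the client.

    The expected value comes from the user, not from re-reading the source.
    """
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

A checksum recorded when the data was produced detects corruption that happened before the transfer started, which a checksum computed from the source at transfer time cannot.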
25. SSH with OAuth
• Securely access resource using SSH with federated identity
– Facilitates automation, eliminates SSH key management
– Replacement for deprecated GSI OpenSSH
• First version released
– Server side PAM module with Globus Auth support
– Command line client
• Open source, community support
– Not part of the standard subscription
– OAuth SSH Client: https://pypi.org/project/oauth-ssh/
– OAuth SSH Server PAM module: https://github.com/xsede/oauth-ssh
28. Globus Transfer: A complete solution
☑ Bulk transfer and sync
☑ Good end-to-end performance in a myriad of real-world settings
☑ End-to-end reliability
☑ Robust security, with federated identities
☑ Layers onto diverse storage systems
☑ Web-compatible client/server remote access
☑ Easy to use interfaces
☑ Easy installation and administration
☑ Sharing data with guest users
☑ Dedicated professional support
29. Rethinking data publication
• Limited adoption
– Not easily customizable
• Maintenance Challenges
– Costly to maintain
– JRE licensing concerns
• Going forward
– Code will be open source
– Leverage platform
• Invest in higher priorities
30. Kasthuri Lab: Building the connectome
[Figure: neuroanatomy reconstruction pipeline spanning APS, ALCF, JLSE, and UChicago, with Lab Server 1 and Lab Server 2. Steps: 1 Imaging, 2 Acquisition, 3 Pre-processing, 4 Preview/Center, 5 User validation, 6 Reconstruction, 7 Publication, 8 Visualization, 9 Science!]
32. Our (ambitious) goals for the Globus platform
• Transform how research applications, services, and workflows are created, delivered, used, and sustained
– Scientific instrument data processing
– Repositories: Make data more FAIR
– Science gateways
• Facilitate creation of interoperable app ecosystem
33. Globus platform services
• Identity and Access Management (IAM)
– Federated identity login, Groups, Attributes, Access Control
– Globus Auth: OAuth authorization provider
• Connect
• Transfer
– Building a family of services
• Execution
• Search, Identifiers
• Automation
– Queues, Events
– Triggers, Actions, Flows
34. Platform status
• Generally Available in a few years
• Separate product with separate sustainability model
• Early engagements help shape product direction
– Argonne Leadership Computing Facility, Materials Data Facility, NCAR Research Data Archive, NSO, …
– Use in Globus products
• Multiple integrations facilitate more complete solution
– e.g. Django, JupyterHub
– Follow progress: globus-integration-examples.readthedocs.io
• Currently accessible via professional services team
35. Thank you to our sponsors...
U.S. Department of Energy
Use cases – HIPAA/protected data enclave, multi-institutional trials,
Access Control
Identities provided and managed by institution
Globus acts as identity broker only, does not access or store any institutional user credentials
Institution controls all access policies (at multiple levels)
who can access what data and with what permissions
who can share what data and with what permissions
all access policies can be changed or revoked at any time
Protected Environment using either AWS KMS encryption or AWS service-specific encryption options.
Data access
CLI access
Private window
Groups page
We are working towards compliance with W3C-established accessibility standards: increased visual contrast, the ability to resize all of the GUI text using standard browser controls, code to support screen readers, and a GUI that adapts to a user’s screen, from mobile devices all the way up to large high-resolution desktop displays
Greg’s talk
Backup, data management plans, archive…use Globus for all those use cases
Layers on Box
Our current set of premium connectors; care and feed
Box collection creation
Data transfer
Sharing
Pricing – same as Google Drive connector
Care and feed
Customer demand + sustainable model for maintaining the connectors. Ask for any input.
And what we once called endpoints are now called Collections
Mapped = host endpoint
Guest = shared endpoint
Don’t forget we now also offer HTTPS as well!
Hand to Steve who takes it from here…
BE POSITIVE!
IT’S ALL GOOD!
WE’RE DOING IT FOR YOUR OWN GOOD!
WE CAN INVEST IN COOL NEW STUFF
A good example of the science that Globus facilitates is the work being done by Bobby Kasthuri’s lab at Argonne National Laboratory
His group has set out to map the brain, or build the connectome, as it’s known
It’s an ambitious undertaking that involves massive amounts of data
They start with samples that are imaged using a beamline at the Advanced Photon Source
They get time from APS every 2 to 3 days, and efficiency and automation allow them to make the most of that time.
The initial set of raw images undergoes some pre-processing and is sent to the Argonne Leadership Computing Facility for analysis
A scientist then previews the images and makes any needed adjustments to the experiment
Once everything is properly configured the sample is fully imaged and reconstructed at the ALCF
The datasets are then moved to a petascale storage system called Petrel where they are annotated with metadata and published with a persistent identifier
Researchers at UChicago and elsewhere can then search and extract relevant subsets of the data to analyze further
…and then Science happens!
No timeframe for general availability
Last year we talked about services such as Automate and Search. These are fundamentally platform services, for developers to leverage in their own applications, services, and solutions.
We continue to make good progress on these
However, realistically, due to funding constraints, these platform services will not be generally available for several more years
In the meantime we are partnering with select groups to use prototype and limited-production platform services
To learn what is really needed before going GA
If you are interested in experimenting with them, please talk with us. We will be selective.