This document discusses building a streaming data pipeline with Apache Kafka, Spark Streaming, and MapR-DB. It describes collecting streaming data with Kafka, organizing the data into topics, and processing the streams in Spark Streaming. The processed data can then be stored in MapR-DB and queried using Spark SQL. An example uses a streaming payment dataset to demonstrate parsing the data, transforming it into a Dataset, and continuously aggregating values with Spark Streaming.
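The parse-then-aggregate flow described above can be illustrated without a cluster. The following is a minimal plain-Python sketch, not the deck's Spark code: it stands in for the pattern of parsing each incoming record and keeping a running aggregate that is updated as every micro-batch arrives. The `payer,amount` record layout and the names `Payment`, `parse_payment`, and `RunningTotals` are assumptions made for illustration; the actual payment dataset schema is not given here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Payment:
    payer: str     # hypothetical field, not taken from the original dataset
    amount: float  # hypothetical field, not taken from the original dataset


def parse_payment(line: str) -> Optional[Payment]:
    """Parse one 'payer,amount' CSV line; return None for malformed input."""
    parts = line.strip().split(",")
    if len(parts) != 2:
        return None
    try:
        return Payment(payer=parts[0], amount=float(parts[1]))
    except ValueError:
        return None


class RunningTotals:
    """Keeps per-payer totals that are updated as each micro-batch arrives."""

    def __init__(self) -> None:
        self.totals: Dict[str, float] = defaultdict(float)

    def update(self, batch: Iterable[str]) -> Dict[str, float]:
        # Parse every record in the batch and fold valid ones into the totals.
        for line in batch:
            payment = parse_payment(line)
            if payment is not None:
                self.totals[payment.payer] += payment.amount
        # Return a snapshot so callers cannot mutate the internal state.
        return dict(self.totals)


if __name__ == "__main__":
    totals = RunningTotals()
    batches: List[List[str]] = [
        ["alice,10.0", "bob,5.5"],
        ["alice,2.5", "not-a-record"],
    ]
    for batch in batches:
        print(totals.update(batch))
```

In a real pipeline the batches would come from a Kafka topic and the state would be managed by the streaming engine; here the list of lists only simulates successive micro-batches.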
Analyzing Flight Delays with Apache Spark, DataFrames, GraphFrames, and MapR-DB | Carol McDonald
Apache Spark GraphX made it possible to run graph algorithms within Spark. GraphFrames integrates GraphX and DataFrames and makes it possible to perform graph pattern queries without moving data to a specialized graph database.
This presentation will help you get started using Apache Spark GraphFrames graph algorithms and graph queries with the MapR-DB JSON document database.
Predicting Flight Delays with Spark Machine Learning | Carol McDonald
Apache Spark's MLlib makes machine learning scalable and easier with ML pipelines built on top of DataFrames. In this webinar, we will go over an example from the ebook Getting Started with Apache Spark 2.x: predicting flight delays using Apache Spark machine learning.
How Big Data is Reducing Costs and Improving Outcomes in Health Care | Carol McDonald
There is no better example of the important role that data plays in our lives than in matters of our health and our healthcare. There’s a growing wealth of health-related data out there, and it’s playing an increasing role in improving patient care, population health, and healthcare economics.
Join this talk to hear how MapR customers are using big data and advanced analytics to address a myriad of healthcare challenges—from patient to payer.
We will cover big data healthcare trends and production use cases that demonstrate how to deliver data-driven healthcare applications.
Applying Machine learning to IOT: End to End Distributed Pipeline... | Carol McDonald
This presentation discusses the architecture of an end-to-end application that combines streaming data with machine learning to analyze and visualize, in real time, where and when Uber cars cluster, revealing the most popular Uber locations.
Demystifying AI, Machine Learning and Deep Learning | Carol McDonald
Deep learning, machine learning, artificial intelligence: all are buzzwords and representative of the future of analytics. In this talk we will explain what machine learning and deep learning are at a high level, with some real-world examples. The goal is not to turn you into a data scientist, but to give you a better understanding of what you can do with machine learning. Machine learning is becoming more accessible to developers, and data scientists work with domain experts, architects, developers and data engineers, so it is important for everyone to have a better understanding of the possibilities. Every piece of information that your business generates has the potential to add value. This and future posts are meant to provoke a review of your own data to identify new opportunities.
Churn prediction is big business. It minimizes customer defection by predicting which customers are likely to cancel a service. Though originally used within the telecommunications industry, it has become common practice for banks, ISPs, insurance firms, and other verticals. More: http://info.mapr.com/WB_PredictingChurn_Global_DG_17.06.15_RegistrationPage.html
The prediction process is data-driven and often uses advanced machine learning techniques. In this webinar, we'll look at customer data, do some preliminary analysis, and generate churn prediction models – all with Spark machine learning (ML) and a Zeppelin notebook.
The goal of Spark’s ML library is to make machine learning scalable and easy. Zeppelin with Spark provides a web-based notebook that enables interactive machine learning and visualization.
In this tutorial, we'll do the following:
• Review classification and decision trees
• Use Spark DataFrames with Spark ML pipelines
• Predict customer churn with Apache Spark ML decision trees
• Use Zeppelin to run Spark commands and visualize the results
Streaming Patterns Revolutionary Architectures with the Kafka API | Carol McDonald
Building a robust, responsive, secure data service for healthcare is tricky. For starters, healthcare data lends itself to multiple models:
• Document representation for patient profile view or update
• Graph representation to query relationships between patients, providers, and medications
• Search representation for advanced lookups
Keeping these different systems up to date requires an architecture that can synchronize them in real time as data is updated. Furthermore, meeting audit requirements in healthcare requires the ability to apply granular cross-datacenter replication policies to data and to provide detailed lineage information for each record. This post will describe how stream-first architectures can solve these challenges and look at how this has been implemented at a Health Information Network provider.
This talk will go over the Kafka API with these design patterns:
• Turning the database upside down
• Event Sourcing, Command Query Responsibility Segregation, Polyglot Persistence
• Kappa Architecture
Machine Learning for Chickens, Autonomous Driving and a 3-year-old Who Won’t ... | MapR Technologies
Big data technologies are being applied to a wide variety of use cases. We will review tangible examples of machine learning, discuss an autonomous driving project and illustrate the role of MapR in next generation initiatives. More: http://info.mapr.com/WB_Machine-Learning-for-Chickens_Global_DG_17.11.02_RegistrationPage.html
Free Code Friday - Machine Learning with Apache Spark | MapR Technologies
In this Free Code Friday webinar, you’ll get an overview of machine learning with Apache Spark’s MLlib, and you’ll also learn how MLlib decision trees can be used to predict flight delays.
ML Workshop 2: Machine Learning Model Comparison & Evaluation | MapR Technologies
How Rendezvous Architecture Improves Evaluation in the Real World
In this edition of our machine learning logistics webinar series, we build on the key requirements for effective management of machine learning logistics presented in the Overview webinar and in the Part I Workshop. Here we focus on model-to-model comparison and evaluation, the use of decoy models, and more. Listen here: http://info.mapr.com/machine-learning-workshop2.html?_ga=2.35695522.324200644.1511891424-416597139.1465233415
Kubernetes is great for deploying stateless containers, but what about the big data ecosystem? Episode 3 of our Kubernetes series covers how DC/OS enables you to connect your Kubernetes-based applications to co-located big data services.
Slides cover:
1. Why persistence is challenging in distributed architectures
2. How DC/OS helps you take advantage of the services available in the big data ecosystem
3. How to connect Kubernetes to your data services through networking
4. How Apache Flink and Apache Spark work with Kubernetes to enable real-time data processing on DC/OS
TDWI Accelerate, Seattle, Oct 16, 2017: Distributed and In-Database Analytics... | Debraj GuhaThakurta
Event: TDWI Accelerate, Seattle, Oct 16, 2017
Topic: Distributed and In-Database Analytics with R
Presenter: Debraj GuhaThakurta
Tags: R, Spark, SQL Server
TDWI Accelerate Seattle, Oct 16, 2017: Distributed and In-Database Analytics ... | Debraj GuhaThakurta
Event: TDWI Accelerate Seattle, October 16, 2017
Topic: Distributed and In-Database Analytics with R
Presenter: Debraj GuhaThakurta
Description: How to develop scalable and in-database analytics using R in Spark and SQL Server
Dataservices - Processing Big Data The Microservice Way | Josef Adersberger
We see a big data processing pattern emerging that uses the microservice approach to build an integrated, flexible, and distributed system of data processing tasks. We call this the Dataservice pattern. In this presentation we'll introduce Dataservices: their basic concepts, the technology typically in use (like Kubernetes, Kafka, Cassandra and Spring) and some real-life architectures.
Application Timeline Server - Past, Present and Future | VARUN SAXENA
How the YARN Application Timeline Server evolved from the Application History Server to Application Timeline Server v1 and then to ATSv2 (ATS Next Gen), which is currently under development.
These slides were presented at the Hadoop Big Data Meetup at eBay, Bangalore, India.
Episode 4: Operating Kubernetes at Scale with DC/OS | Mesosphere Inc.
You’ve installed your Kubernetes cluster on DC/OS — now what? Operating Kubernetes efficiently can be challenging. In the final episode of our Kubernetes series, we will share best practices for operating your DC/OS Kubernetes cluster and maintaining performance. During this presentation, Joerg Schad and Chris Gaun show you how to successfully operate Kubernetes at scale in your environment.
During this session, we discuss:
1. How to upgrade DC/OS and Kubernetes with no downtime
2. How DC/OS guards against failure and enables fault domains that are resistant to outages within racks, availability zones, or cloud environments
3. How the monitoring and metrics capabilities on DC/OS improve operational analytics and help you get the most from your cluster
4. How cloud bursting extends your on-prem environment with resources from the cloud to handle spikes in your workload
(BDT303) Running Spark and Presto on the Netflix Big Data Platform | Amazon Web Services
In this session, we discuss how Spark and Presto complement the Netflix big data platform stack that started with Hadoop, and the use cases that Spark and Presto address. Also, we discuss how we run Spark and Presto on top of the Amazon EMR infrastructure; specifically, how we use Amazon S3 as our data warehouse and how we leverage Amazon EMR as a generic framework for data-processing cluster management.
Changes in how business is done combined with multiple technology drivers make geo-distributed data increasingly important for enterprises. These changes are causing serious disruption across a wide range of industries, including healthcare, manufacturing, automotive, telecommunications, and entertainment. Technical challenges arise with these disruptions, but the good news is there are now innovative solutions to address these problems. http://info.mapr.com/WB_Geo-distributed-Big-Data-and-Analytics_Global_DG_17.05.16_RegistrationPage.html
First in Class: Optimizing the Data Lake for Tighter Integration | Inside Analysis
The Briefing Room with Dr. Robin Bloor and Teradata RainStor
Live Webcast October 13, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=012bb2c290097165911872b1f241531d
Hadoop data lakes are emerging as peers to corporate data warehouses. However, successful data management solutions require a fusion of all relevant data, new and old, which has proven challenging for many companies. With a data lake that’s been optimized for fast queries, solid governance and lifecycle management, users can take data management to a whole new level.
Register for this episode of The Briefing Room to learn from veteran analyst Dr. Robin Bloor as he discusses the relevance of data lakes in today’s information landscape. He’ll be briefed by Mark Cusack of Teradata, who will explain how his company’s archiving solution has developed into a storage point for raw data. He’ll show how the proven compression, scalability and governance of Teradata RainStor combined with Hadoop can enable an optimized data lake that serves as both a reservoir for historical data and a “system of record” for the enterprise.
Visit InsideAnalysis.com for more information.
How do you create an enterprise data lake for enterprise-wide information storage and sharing? This deck covers the data lake concept, architecture principles, support for data science, and a review of some use cases.
Typesafe & William Hill: Cassandra, Spark, and Kafka - The New Streaming Data... | DataStax Academy
Typesafe did a survey of Spark usage last year and found that a large percentage of Spark users combine it with Cassandra and Kafka. This talk focuses on streaming data scenarios that demonstrate how these three tools complement each other for building robust, scalable, and flexible data applications. Cassandra provides resilient and scalable storage, with flexible data format and query options. Kafka provides durable, scalable collection of streaming data with message-queue semantics. Spark provides very flexible analytics, everything from classic SQL queries to machine learning and graph algorithms, running in a streaming model based on "mini-batches", offline batch jobs, or interactive queries. We'll consider best practices and areas where improvements are needed.
Modernizing Upstream Workflows with AWS Storage - John Mallory | Amazon Web Services
Modernizing Upstream Workflows with AWS Storage
Accelerating seismic data retrieval, getting better data protection and reliability, and providing a common AWS data platform for compute and graphic intensive processing, simulation and visualization workloads.
Modernizing and transforming exploration and production workflows with AWS Storage services
Capturing and processing streaming sensor data from remote oil rigs with Snowball Edge
Providing a Data Lake foundation for a next generation Digital Oilfield IoT analytics platform with Amazon S3
Speaker: John Mallory - AWS Storage Business Development Manager
Navigating the Metaverse: A Journey into Virtual Evolution | Donna Lenk
Join us for an exploration of the Metaverse's evolution, where innovation meets imagination. Discover new dimensions of virtual events, engage with thought-provoking discussions, and witness the transformative power of digital realms.
Accelerate Enterprise Software Engineering with Platformless | WSO2
Key takeaways:
• Challenges of building platforms and the benefits of platformless.
• Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
• How Choreo enables the platformless experience.
• How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
• Demo of an end-to-end app built and deployed on Choreo.
Enhancing Project Management Efficiency: Leveraging AI Tools like ChatGPT | Jay Das
With the advent of artificial intelligence (AI) tools, project management processes are undergoing a transformative shift. By using tools like ChatGPT and Bard, organizations can empower their leaders and managers to plan, execute, and monitor projects more effectively.
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam | takuyayamamoto1800
These slides show a simulation example and how to compile the solver.
helmholtzFoam solves the Helmholtz equation, and helmholtzBubbleFoam simulates the Helmholtz equation with uniformly dispersed bubbles.
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ... | Juraj Vysvader
In 2015, I wrote extensions for Joomla, WordPress, phpBB3, etc. I didn't get rich from it, but the extensions did get 63K downloads (possibly powering tens of thousands of websites).
How Recreation Management Software Can Streamline Your Operations | wottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Unleash Unlimited Potential with One-Time Purchase
BoxLang is more than just a language; it's a community. By choosing a Visionary License, you're not just investing in your success, you're actively contributing to the ongoing development and support of BoxLang.
May Marketo Masterclass, London MUG May 22 2024 | Adele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Quarkus Hidden and Forbidden Extensions | Max Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
Experience our free, in-depth three-part Tendenci Platform Corporate Membership Management workshop series! In Session 1 on May 14th, 2024, we began with an Introduction and Setup, mastering the configuration of your Corporate Membership Module settings to establish membership types, applications, and more. Then, on May 16th, 2024, in Session 2, we focused on binding individual members to a Corporate Membership and Corporate Reps, teaching you how to add individual members and assign Corporate Representatives to manage dues, renewals, and associated members. Finally, on May 28th, 2024, in Session 3, we covered questions and concerns, addressing any queries or issues you may have.
For more Tendenci AMS events, check out www.tendenci.com/events
top nidhi software solution freedownload | vrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
Into the Box Keynote Day 2: Unveiling amazing updates and announcements for modern CFML developers! Get ready for exciting releases and updates on Ortus tools and products. Stay tuned for cutting-edge innovations designed to boost your productivity.
Large Language Models and the End of Programming | Matt Welsh
Talk by Matt Welsh at Craft Conference 2024 on the impact that Large Language Models will have on the future of software development. In this talk, I discuss the ways in which LLMs will impact the software industry, from replacing human software developers with AI, to replacing conventional software with models that perform reasoning, computation, and problem-solving.
Prosigns: Transforming Business with Tailored Technology Solutions | Prosigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Understanding Globus Data Transfers with NetSage | Globus
NetSage is an open privacy-aware network measurement, analysis, and visualization service designed to help end-users visualize and reason about large data transfers. NetSage traditionally has used a combination of passive measurements, including SNMP and flow data, as well as active measurements, mainly perfSONAR, to provide longitudinal network performance data visualization. It has been deployed by dozens of networks worldwide, and is supported domestically by the Engagement and Performance Operations Center (EPOC), NSF #2328479. We have recently expanded the NetSage data sources to include logs for Globus data transfers, following the same privacy-preserving approach as for flow data. Using the logs for the Texas Advanced Computing Center (TACC) as an example, this talk will walk through several different example use cases that NetSage can answer, including: Who is using Globus to share data with my institution, and what kind of performance are they able to achieve? How many transfers has Globus supported for us? Which sites are we sharing the most data with, and how is that changing over time? How is my site using Globus to move data internally, and what kind of performance do we see for those transfers? What percentage of data transfers at my institution used Globus, and how did the overall data transfer performance compare to the Globus users?
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR | Tier1 app
Although ‘java.lang.OutOfMemoryError’ appears on the surface to be a single error, there are actually 9 types of OutOfMemoryError underneath. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G... | Globus
The U.S. Geological Survey (USGS) has made substantial investments in meeting evolving scientific, technical, and policy driven demands on storing, managing, and delivering data. As these demands continue to grow in complexity and scale, the USGS must continue to explore innovative solutions to improve its management, curation, sharing, delivering, and preservation approaches for large-scale research data. Supporting these needs, the USGS has partnered with the University of Chicago-Globus to research and develop advanced repository components and workflows leveraging its current investment in Globus. The primary outcome of this partnership includes the development of a prototype enterprise repository, driven by USGS Data Release requirements, through exploration and implementation of the entire suite of the Globus platform offerings, including Globus Flow, Globus Auth, Globus Transfer, and Globus Search. This presentation will provide insights into this research partnership, introduce the unique requirements and challenges being addressed and provide relevant project progress.