Recent Upgrades to ARM Data Transfer and Delivery Using Globus

This presentation was given at the 2019 GlobusWorld Conference in Chicago, IL by Giri Prakash from the ARM Data Center at Oak Ridge National Laboratory.


  1. Improving Data Transfer and Delivery Using Globus: Recent Upgrades to the Atmospheric Radiation Measurement (ARM) Facility Data Center Architecture. Giri Prakash, Zach Price, Joseph Olatt, and Jitu Kumar, ARM Data Center, Oak Ridge National Laboratory. GlobusWorld, May 1, 2019 (slides dated May 6, 2019). palanisamyg@ornl.gov
  2. ARM’s Vision: To provide a detailed and accurate description of the Earth’s atmosphere in diverse climate regimes to resolve the uncertainties in climate and Earth system models toward the development of sustainable solutions for the Nation’s energy and environmental challenges.
  3. ARM Data Flow – The Big Picture [Figure: data flow diagram; data growth label: 1 PB]
  4. ARM Data – Disaster Recovery: Offsite Data Backup. ARM data files that are copied into the HPSS system at ORNL are also copied to the HPSS system at ANL. The globus-url-copy program and the ESnet network between the two labs are used for this purpose.
     ■ Date copying to ANL started: 03-26-2018
     ■ Total size transferred: 188.03 TB
     ■ Total number of files transferred: 3,938,465
  5. Data Transfer and Staging to Facilitate Data Science
  6. Data Transfer and Staging to Facilitate Data Science
  7. Data Pipeline and Software Architecture (Dr. Bhargavi Krishna, Yuping Lu, and Dr. Jitu Kumar)
     [Figure: Data Pipeline (Data Processing, Storage & Data Model, Querying, Analytics, Scientific Users) and Software Architecture (Frontend, Analytic Server, Backend). Other labels: Interface, Visualization, Analytics, Output, Spark, ARM HPC Computing Clusters, JupyterLab, Relational Database, NoSQL Database]
     • Supports fast analysis of voluminous data
     • Hides architectural complexities
     • Stage data in HPC
     • Metadata
     • Order History
     • Data from multiple instruments
  8. Data Retrieval, Packaging, and Delivery
     § Merging
     § DQR filtering
     § Conversion
     [Figure labels: Retrieval, Future capability, Datastreams, HPSS, Online copy, Link to data access, Data quality, Access to plots, DOI based citation guidance, Publication request, Discovery UI & Web services, NetCDF data extractions, Data staging order, Live Data WS]
  9. Data Discovery Tool
  10. Data Discovery for LASSO
     § Based on big data analysis platform (NoSQL)
     § ARM HPC Clusters for data processing
     § Provides an interactive web interface for users to find simulations of interest through examination of the LES performance relative to select ARM observations
     § Allows users to visualize LASSO data bundle diagnostics and skill scores on the fly using plots and tables
     § Globus as a delivery option
     [Figure labels: Cassandra, D3 & NodeJS, Spark]
  11. Questions? ARM Data Services, Giri Prakash, palanisamyg@ornl.gov
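
The offsite backup figures quoted on slide 4 (188.03 TB transferred across 3,938,465 files) imply an average file size, which gives a rough sense of the workload that globus-url-copy handles between ORNL and ANL. The short Python sketch below works through that arithmetic. The function and variable names are illustrative and do not come from the slides, and the slides do not say whether "TB" means decimal or binary units, so both readings are computed.

```python
# Figures quoted on slide 4 (ORNL -> ANL offsite backup).
TOTAL_TB = 188.03
TOTAL_FILES = 3_938_465


def average_file_size(total_tb: float, n_files: int, base: int = 1000) -> float:
    """Return the mean file size in MB (base=1000) or MiB (base=1024).

    Assumption: total_tb is expressed in the same unit family as `base`
    (TB for 1000, TiB for 1024). The slides do not state which one applies.
    """
    total_bytes = total_tb * base ** 4   # tera -> bytes
    return total_bytes / n_files / base ** 2  # bytes -> mega


decimal_mb = average_file_size(TOTAL_TB, TOTAL_FILES)
binary_mib = average_file_size(TOTAL_TB, TOTAL_FILES, base=1024)

print(f"{decimal_mb:.1f} MB per file (decimal units)")
print(f"{binary_mib:.1f} MiB per file (binary units)")
```

Under either reading the mean comes out at roughly 48 to 50 megabytes per file. This is only an average; the slides give no information about how file sizes are distributed.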
