
Ocean/Atmosphere Sciences: A Data Driven Science



In this deck from the DDN User Group at SC19, Dr. Suryachandra Rao from MoES presents: Ocean/Atmosphere Sciences: A Data Driven Science. The Ministry of Earth Sciences (MoES) is mandated to provide services for weather, climate, ocean and coastal state, hydrology, seismology, and natural hazards; to explore and harness marine living and non-living resources in a sustainable way and to explore the three poles (Arctic, Antarctic and Himalayas).

"MoES recently inaugurated a new supercomputer at the Indian Institute of Tropical Meteorology (IITM) in Pune, dedicated to improving weather and climate forecasts across the country. The high-performance computing (HPC) facility will provide improved weather forecasts at block level over India; higher resolution forecasts during the monsoon; high-resolution coupled models for cyclone prediction with more accuracy and lead time; improved ocean state forecasts including marine water quality forecasts at high resolution; tsunami forecasts with greater lead time; air quality forecasts for different smart cities; and high-resolution climate projections. The HPC facility will also be utilized by other MoES institutes for research activities to improve their respective weather and climate services. MoES’s new supercomputer in Pune will boost the organization’s overall HPC infrastructure to 6.8 petaflops of computing power, making it one of the most powerful HPC facilities in the world."


Published in: Technology


  2. Outline of the Presentation
     • Data growth in ocean/atmosphere sciences
     • Past to present setups of HPC and data storage systems of MoES
     • Major challenges in management of data
     • Future expectations
  3. Mandate of MoES
     The primary mandate of the Ministry of Earth Sciences (MoES) is to provide the nation with the best possible services in forecasting the monsoons and other weather/climate parameters, ocean state, earthquakes, tsunamis and other phenomena related to earth systems.
  4. Ekman Currents in the Ocean
  5. MoES Weather, Climate and Ocean State Forecasts
     • Short Range (next 2-3 days)
     • Medium Range (up to 7-10 days)
     • Extended Range (beyond 2 weeks, up to one month)
     • Long Range (seasonal mean)
     • Climate change projections (contributing to IPCC, CMIP6)
     • Ocean State Forecast (next 3-5 days)
     • Potential Fishing Zone advisories
     • Air quality forecast (next 2-3 days)
     • Agricultural, Forest Fire, Hydrology advisories
     • Tsunami Warnings
  6. Exponential Data Growth in Earth Sciences in the Last 2 Decades
  7. Data required to generate Forecasts
     [Figure: daily incoming data volume by year, 1997 to 2016. Series: FTP (SAT + RADAR) in GB/day, and IMD (GTS) surface/upper air observations in GB/day. Horizontal axis: Year.]
  8. Why so much data?
     To make a single day's forecast:
     • Initial data from various sensors/satellites: 75 GB/day
     • Analysis (combining the data with model forecasts): 75 GB/day
     • Model forecasts: 250 GB/day (as the model runs at much higher resolution)
     • To reduce uncertainty in forecasts, 50 of those model forecasts are re-run with slight perturbations to the initial data/analysis: 50 TB/day
     • In a day, several forecasts are made for medium, extended and long ranges, in addition to climate change projections and R&D experiments to improve models
  9. HPC and Data Storage Capacities @ MoES

     Year   HPC Capacity @ IITM (@ MoES)   Data Storage Capacity @ MoES   Tape Capacity
     2008   7 TF (50 TF)                   300 TB                         1 PB
     2009   70 TF (115 TF)                 3 PB                           2 PB
     2014   790 TF (1150 TF)               18 PB                          30 PB
     2018   4000 TF (8000 TF)              45 PB                          40 PB
  10. Challenges and Changes in Data Flow
      Challenges
      • Programming of efficient workflows
      • Efficient analysis of data
      • Organizing data sets
      • Ensuring reproducibility of workflows/provenance of data
      • Meeting the compute/storage needs in the future complex hardware landscape
      Expected Data Characteristics in 2020+
      • Velocity: Input 5 TB/day (for NWP; reduced data from instruments)
      • Volume: Data output of ensembles in PBs of data
      • Data products are used by 3rd parties
      • Various file formats
      Source: Julian M. Kunkel, University of Reading
  11. Major Concerns of Data Handling
      • Data collection and its preservation
          • Data is growing rapidly (multi-platform, multi-model, multi-resolution)
          • Make it readily available, or archive it for long-term storage?
          • Moving it from one configuration to another as systems get upgraded
      • Accessing the preserved data
          • Retrieval of archived data at faster rates than at present
      • Utilizing the preserved data
          • Combining the data from different experiments, forecasts, etc.
          • Transferring data from one computer to another (bandwidth limitations)
          • Analyzing the big datasets themselves
  12. MoES Association with DDN

      Year   System                Performance
      2013   1PB DDN S2A9900       5 GB/s
      2016   10PB DDN 7700X        200 GB/s
      2019   27PB DDN SFA18K       81 GB/s

      Slide footer: DDN Confidential. DDN Storage | ©2018 DataDirect Networks, Inc.
  13. Innovative Use of 300 m IB Cables for 10PB Storage
      • IITM has two data centers that are 300 m apart. MoES wanted storage connectivity over RDMA-capable InfiniBand.
      • First use of 300 m Mellanox LinkX modules and InfiniBand cables in APAC. Storage delivered 200 GB/s performance.
      • Data migration from DDN S2A9900 connected to POWER6 compute over DDR InfiniBand.
      [Diagram labels: 10PB DDN Storage in DC1; 6x 648 Port Chassis Switches in DC2; Leaf IB Switches in DC1; Spine IB Switches in DC1; Protective Duct for Multi-Mode Fiber; 300 m SR4 optical modules; Patch panel with MPO connector (two, one per data center); Aaditya 790+TF HPC System; Data Movers; Qlogic/Silverstorm DDR IB Switch; DDN S2A9900; DC1; DC2]
  14. Adoption of Disk Archive & HPSS for Long Term Archive
      • In 2019, MoES decided to procure a 27PB disk-based archive along with HPSS for long-term archive at two of its sites.
      • Through a competitive tendering process, ATOS with DDN were selected to provide this technology.
      • Factors that governed the winning bid:
          • Price/Performance (highest performance/$)
          • Total Cost of Ownership including data center footprint (smallest DC footprint)
          • HPSS integration experience (existing HPSS references)
      • Currently being installed at both sites.
      [Diagram labels: Core Switch 1; Core Switch 2; TOR (two); Home File system; Scratch File system; Existing CRAY Compute & Storage Environment; Data Movers (two groups); EDR InfiniBand N/w; DDN Storage, 17PB @ IITM and 10PB @ NCMRWF; NAS Gateways; CRAY Ethernet Switch]
  15. Challenges and Needs of MoES
      Challenges
      • Data migration from one generation of system to another, including disk to disk, tape to tape and disk to tape.
      • Evaluation of new technology that improves I/O of weather/climate simulations.
      • Data center footprint and electricity consumption.
      • Filesystem reliability. All computation is time sensitive and must not stop because of storage or filesystem issues.
      Needs
      • Reliable storage solution and system integration capability, with experience in data migration.
      • Vendors to come forward with innovative proposals and a willingness to work with MoES to reduce the simulation time of relevant applications.
      • Focus on Total Cost of Ownership.
      • Tight integration between storage and filesystem. Vendors' filesystem support capabilities are very important.
  16. Next Steps for MoES on Data Storage
      • Evaluate new technologies
      • Use of ESDM for heterogeneous storage infrastructures/HPSS for incremental scalability
      • Benefits of 3D XPoint storage
      • Usage of flash on a wider scale. Burst buffer based on MoES application performance
      • Single storage system to support diverse computing architectures
  17. THANK YOU
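
The per-forecast volumes on slide 8 ("Why so much data?") can be checked with a little arithmetic. The sketch below is not from the deck: the function names are hypothetical, and decimal units (1 TB = 1000 GB) are assumed. It adds up the three itemized volumes, multiplies the 250 GB forecast output by the 50 perturbed runs, and works out what per-run volume the quoted 50 TB/day would imply. The straight multiplication comes to less than the quoted figure, so the slide's ensemble number presumably counts output that it does not itemize; that reading is an assumption.

```python
"""Back-of-the-envelope check of the daily data volumes quoted on slide 8."""

# Figures quoted on the slide. Decimal units (1 TB = 1000 GB) are an assumption.
INITIAL_DATA_GB = 75      # observations from sensors/satellites
ANALYSIS_GB = 75          # observations combined with model forecasts
FORECAST_GB = 250         # output of one high-resolution model forecast
ENSEMBLE_MEMBERS = 50     # perturbed re-runs of the forecast
QUOTED_ENSEMBLE_TB = 50   # ensemble volume as quoted on the slide


def single_forecast_total_gb() -> int:
    """Sum of the three itemized volumes for one forecast, in GB."""
    return INITIAL_DATA_GB + ANALYSIS_GB + FORECAST_GB


def naive_ensemble_tb() -> float:
    """Ensemble volume if every member wrote exactly FORECAST_GB, in TB."""
    return ENSEMBLE_MEMBERS * FORECAST_GB / 1000


def implied_per_member_gb() -> float:
    """Per-member volume implied by the quoted ensemble total, in GB."""
    return QUOTED_ENSEMBLE_TB * 1000 / ENSEMBLE_MEMBERS


if __name__ == "__main__":
    print(f"Single forecast, itemized total: {single_forecast_total_gb()} GB/day")
    print(f"50 members x 250 GB:             {naive_ensemble_tb()} TB/day")
    print(f"Per-member volume implied by 50 TB/day: {implied_per_member_gb()} GB")
```

Keeping the quoted figures as named constants makes it easy to substitute other resolutions or ensemble sizes.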
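
The capacity table on slide 9 can also be turned into growth factors. The following is a minimal sketch, not part of the presentation: it uses the MoES-wide HPC figures (the values in parentheses on the slide), converts disk capacity to TB assuming 1 PB = 1000 TB, and assumes smooth compound growth between 2008 and 2018, which the real, stepwise procurements did not follow. The dictionary layout and function names are hypothetical.

```python
"""Growth factors derived from the capacity table on slide 9."""

# Values copied from the slide; disk figures converted assuming 1 PB = 1000 TB.
CAPACITY = {
    2008: {"hpc_tf": 50, "disk_tb": 300, "tape_pb": 1},
    2009: {"hpc_tf": 115, "disk_tb": 3000, "tape_pb": 2},
    2014: {"hpc_tf": 1150, "disk_tb": 18000, "tape_pb": 30},
    2018: {"hpc_tf": 8000, "disk_tb": 45000, "tape_pb": 40},
}


def growth_factor(metric: str, start: int = 2008, end: int = 2018) -> float:
    """Ratio of the end-year value to the start-year value for one metric."""
    return CAPACITY[end][metric] / CAPACITY[start][metric]


def annual_growth_rate(metric: str, start: int = 2008, end: int = 2018) -> float:
    """Compound annual growth rate, assuming smooth growth (a simplification)."""
    years = end - start
    return growth_factor(metric, start, end) ** (1 / years) - 1


if __name__ == "__main__":
    for metric in ("hpc_tf", "disk_tb", "tape_pb"):
        factor = growth_factor(metric)
        rate = annual_growth_rate(metric)
        print(f"{metric}: x{factor:.0f} over 10 years, about {rate:.0%} per year")
```

On these figures, MoES-wide compute and disk capacity grew by similar factors over the decade, while tape capacity grew more slowly.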