High Performance
Cyberinfrastructure Discovery Tools
for Data Intensive Research




Larry Smarr
Prof. Computer Science and Engineering
Director, Calit2 (UC San Diego/UC Irvine)
Abstract
High performance cyberinfrastructure (10Gbps dedicated
optical channels end-to-end) enables new levels of discovery
for data-intensive research projects such as cosmological
simulations, ocean observing, and microbial metagenomics.
In addition to the national optical fiber infrastructure provided
by National LambdaRail, we need local campus high
performance research cyberinfrastructure to provide
"on-ramps," as well as compute and storage clouds, to augment
the emerging remote commercial clouds.
Dedicated 10,000Mbps (10Gbps) Supernetworks
Enable Remote Visual Analysis of Big Data

National LambdaRail Interconnects Two Dozen State and Regional Optical Networks
NLR: 80 x 10Gb Wavelengths
A Dynamic Circuit Network Is Also Now Available
NSF’s OptIPuter Project: Using Supernetworks
to Meet the Needs of Data-Intensive Researchers

OptIPortal: Termination Device for the OptIPuter 10Gbps Backplane
Exploring Cosmology With Supercomputers,
Supernetworks, and Supervisualization

Intergalactic Medium on 2 Billion Light Year Scale

• Supercomputer Output
   – 148 TB Movie Output (0.25 TB/file)
   – 80 TB Diagnostic Dumps (8 TB/file)
• Connected at 10Gbps
   – Oak Ridge to ANL to SDSC

Science: Norman, Harkness, Paschos, SDSC
Visualization: Insley, ANL; Wagner, SDSC

ANL * Calit2 * LBNL * NICS * ORNL * SDSC
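As a back-of-the-envelope check on why a dedicated 10Gbps path matters for this workload, the sketch below (illustrative Python, assuming an ideal uncontended line rate with no protocol overhead) estimates how long the 148 TB movie output takes to move end-to-end:

```python
# Illustrative estimate: time to move 148 TB over a dedicated 10 Gbps path.
# Assumes an ideal, uncontended line rate with no protocol overhead.

TB = 1e12            # bytes per terabyte (decimal)
GBPS = 1e9           # bits per second per gigabit

data_bits = 148 * TB * 8          # movie output, in bits
line_rate = 10 * GBPS             # dedicated 10 Gbps lambda

seconds = data_bits / line_rate
hours = seconds / 3600
print(f"{hours:.1f} hours")       # roughly 33 hours at full line rate
```

On a shared 1 Gbps campus uplink the same transfer would take more than ten times as long, which is the practical argument for dedicated end-to-end optical channels.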
End-to-End 10Gbps Cyberinfrastructure for
Petascale High Performance Computing End Users

Mike Norman, SDSC, using OptIPortal to analyze petascale cosmological big data.
Panels show log of gas temperature and log of gas density.
Calit2 Microbial Metagenomics Cluster: Users Can
Connect by Shared Internet or 10Gbps Optical Paths


512 Processors
~5 Teraflops
~200 Terabytes Storage
Nearly 4000 Users from Over 75 Countries

Source: Phil Papadopoulos, SDSC, Calit2
Using 10 Gbps Big Data Access and Analysis:
Collaboration Between Calit2 and U Washington
Ginger Armbrust's Diatom Chromosomes
iHDTV: 1500 Mbits/sec, Calit2 to UW Over NLR
UW Research Channel

Photo Credit: Alan Decker, Feb. 29, 2008
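For scale, a quick illustrative calculation (ignoring protocol overhead) shows how many 1500 Mbits/sec iHDTV streams a single 10 Gbps wavelength can carry:

```python
# Illustrative: how many 1500 Mbits/sec iHDTV streams fit on one
# 10 Gbps wavelength, ignoring protocol overhead.
stream_mbps = 1500        # one iHDTV stream
lambda_mbps = 10_000      # one 10 Gbps lambda
streams = lambda_mbps // stream_mbps
print(streams)            # 6
```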
MIT Using OptIPortal to Analyze 10km
Coupled Ocean Microbial Simulation




MIT's Ed DeLong & Darwin Project Team
The NSF-Funded Ocean Observatory Initiative:
a Complex System of Systems Cyberinfrastructure

Source: Matthew Arrott, Calit2 Program Manager for OOI CI
Taking Sensornets to the Ocean Floor:
Remote Interactive HD Imaging of Deep Sea Vent



Scale bar: 1 cm.

Source: John Delaney and Research Channel, University of Washington
NSF OOI is a $400M Program;
OOI CI is a $34M Part of OOI


Science Program: 25 to 30 Years
Construction Program: 5 Years
30 Software Engineers Housed at Calit2@UCSD

Source: Matthew Arrott, Calit2 Program Manager for OOI Cyberinfrastructure
OOI CI is Built on National LambdaRail’s
and Internet2’s DCN Optical Infrastructure


Source: John Orcutt, Matthew Arrott, SIO/Calit2
High Definition Video Connected OptIPortals:
Virtual Working Spaces for Data Intensive Research

Source: Falko Kuester, Kai Doerr, Calit2; Michael Sims, NASA
Analyzing Big Data in 3D Stereo:
The NexCAVE OptIPortal


Array of JVC HDTV 3D LCD Screens
Calit2's KAUST NexCAVE = 22.5 MPixels

Source: Tom DeFanti, Calit2@UCSD
"Blueprint for the Digital University": Report of the
UCSD Research Cyberinfrastructure Design Team (April 24, 2009)
research.ucsd.edu/documents/rcidt/RCIDTReportFinal2009.pdf

Campus connections: N x 10GbE to CENIC, NLR, and the I2 DCN
Components: Gordon HPC System; Triton Petadata Analysis; Cluster Condo;
DataOasis (Central) Storage; Scientific Instruments; Digital Data Collections;
Campus Lab Cluster; OptIPortal

Source: Philip Papadopoulos, SDSC, Calit2, UCSD
California and Washington Universities Are Testing
a 10Gbps Connected Commercial Data Cloud


      • Amazon Experiment for Big Data
         – Only Available Through CENIC and
           Pacific NW GigaPOP
            • Private 10Gbps Peering Paths
         – Includes Amazon EC2 Computing and
           S3 Storage Services
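The value of the private 10Gbps peering paths can be sketched with a simple comparison. The Python below is illustrative only: the 500 Mbps shared-internet figure is an assumed effective throughput, not a measured one, and the ~200 TB dataset size is borrowed from the metagenomics cluster slide as an example workload.

```python
# Illustrative comparison (assumed throughputs): moving a ~200 TB dataset
# into a commercial storage cloud over a private 10 Gbps peering path
# versus a congested shared-internet path.

TB = 1e12  # bytes per terabyte (decimal)

def transfer_days(terabytes: float, gbps: float) -> float:
    """Days needed to move `terabytes` at a sustained `gbps` throughput."""
    bits = terabytes * TB * 8
    return bits / (gbps * 1e9) / 86400

private = transfer_days(200, 10)    # dedicated 10 Gbps peering
shared = transfer_days(200, 0.5)    # assumed 500 Mbps effective shared path

print(f"private peering: {private:.1f} days, shared internet: {shared:.1f} days")
```

At these assumed rates the dedicated path turns a month-scale upload into a roughly two-day one, which is why the experiment routes through CENIC and the Pacific NW GigaPOP rather than the commodity internet.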
You Can Download This Presentation
at lsmarr.calit2.net


Editor's Notes

  • #8: This is a production cluster with its own Force10 E1200 switch. It is connected to Quartzite and is labeled the "CAMERA Force10 E1200". We built CAMERA this way because of the technology deployed successfully in Quartzite.