Grid Projects In The US July 2008

A talk given at the HPC 2008 meeting in Cetraro, Italy

Transcript

  • 1. Grid Projects in the US (an inevitably incomplete view). Ian Foster, Computation Institute, Argonne National Lab & University of Chicago
  • 2. Grid Projects in the US [diagram labels: Resources; Resource Providers]
  • 3. Grid Projects in the US [diagram labels: Service Providers; Services; Resources; Resource Provider]
  • 4. Grid Projects in the US [diagram labels: Communities; Service Provider; Content; Services; Resources; Resource Provider; Software Providers]
  • 5. Grid Projects in the US [diagram labels: Community; Service Provider; Content; Services; Resources; Software Providers; Resource Provider]
  • 6. Resource Providers
    • Campus and regional grids
      • Purdue, Wisc, UCLA, …, …
      • TIGRE, UC system, …
    • Open Science Grid
      • 43,000 CPUs, 6 PB disk, 15,000 CPU days/day
      • Allocations on basis of MOUs
    • TeraGrid
      • ~ 1.2 Pflop/s
      • National Allocation Committee
    • Amazon, Microsoft, IBM, etc.
      • ?? CPUs, ?? storage
      • Fee for service
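As a rough reading of the Open Science Grid figures on this slide, the delivered CPU days per day can be compared with the CPU count. This is only a sketch: it assumes both numbers describe the same pool of CPUs over the same period, which the slide does not state.

```python
# Figures quoted on the slide for Open Science Grid.
osg_cpus = 43_000            # CPUs listed
cpu_days_per_day = 15_000    # CPU days delivered per calendar day

# Assumption (not stated on the slide): both figures cover the same
# CPUs and the same period, so their ratio is an average utilization.
utilization = cpu_days_per_day / osg_cpus

print(f"Implied average utilization: {utilization:.0%}")
```

Under that assumption, roughly a third of the listed CPUs are busy with grid work on an average day.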
  • 7. Open Science Grid Sites (5/4/08) +3 in Brazil; 2 in Mexico; 2 in Taiwan; 1 in the UK. Grows by 10-20 per year.
  • 8. Use by Community [chart; series: CMS, ATLAS, CDF, D0, Local Usage & bugs (unmapped to VO); axis marks: 1,000,000 a week, 2,000,000 a week]
  • 9. TeraGrid Participants
  • 10. Growing User Community (Source: TeraGrid Central Database)
  • 11. Growing Usage: 3.95B NUs delivered in CY2007 (Source: TeraGrid Central Database)
  • 12. CY2007 Usage by Discipline (3.95B NUs delivered in CY2007)
    • Molecular Biosciences: 31%
    • Chemistry: 17%
    • Physics: 17%
    • Astronomical Sciences: 12%
    • Materials Research: 6%
    • Chemical, Thermal Systems: 5%
    • All 19 Others: 4%
    • Earth Sciences: 3%
    • Atmospheric Sciences: 3%
    • Advanced Scientific Computing: 2%
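The discipline shares on this slide can be turned into approximate NU totals by multiplying each percentage by the 3.95B NUs delivered in CY2007. The sketch below does only that arithmetic; the shares are the rounded values from the slide, so the per-discipline totals are approximate.

```python
# Total NUs delivered in CY2007, as quoted on the slide.
total_nus = 3.95e9

# Rounded shares by discipline, in percent, as quoted on the slide.
share_percent = {
    "Molecular Biosciences": 31,
    "Chemistry": 17,
    "Physics": 17,
    "Astronomical Sciences": 12,
    "Materials Research": 6,
    "Chemical, Thermal Systems": 5,
    "All 19 Others": 4,
    "Earth Sciences": 3,
    "Atmospheric Sciences": 3,
    "Advanced Scientific Computing": 2,
}

# Approximate NUs per discipline (limited by the rounding of the shares).
nus_by_discipline = {
    name: total_nus * pct / 100 for name, pct in share_percent.items()
}

for name, nus in sorted(nus_by_discipline.items(), key=lambda kv: -kv[1]):
    print(f"{name:32s} {nus / 1e6:8.1f} M NUs")
```

The quoted shares sum to 100%, so the per-discipline totals add back up to the 3.95B figure.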
  • 13. Grid Projects in the US
    • For example:
    • Build and test service (Wisc)
    • Certificate Authorities
    • Cancer Biology Informatics Grid
    • LIGO Data Grid
    [Diagram labels: Community, Service Provider, Content, Services, Resources, Software Providers, Resource Provider, Service Provider]
  • 14. caBIG: sharing of infrastructure, applications, and data. Data Integration! Services & Cancer Biology Globus
  • 15. caBIG Under the Covers [architecture diagram; labels: NCICB; Research Centers; Grid-Enabled Client; Grid Portal; Tools 1-4; Grid Data Service; Analytical Service; Microarray; Gene Database; caArray; Protein Database; Image; Grid Services Infrastructure (Metadata, Registry, Query, Invocation, Security, etc.)] Globus
  • 16. LIGO Data Grid (LIGO Gravitational Wave Observatory)
    • Replicating >1 Terabyte/day to 8 sites
    • 770 TB replicated to date: >120 million replicas
    • MTBF = 1 month
    • Map labels: Birmingham, Cardiff, AEI/Golm
    • Ann Chervenak et al., ISI; Scott Koranda et al., LIGO
    Globus
  • 17. Grid Projects in the US
    • For example:
    • Earth System Grid
    • Children’s Oncology Grid
    • Southern California Earthquake Center (SCEC)
    • Science gateways
    [Diagram labels: Community, Service Provider, Content, Services, Resources, Software Providers, Resource Provider, Community]
  • 18. Earth System Grid (8,000 registered users; 1,900 registered projects)
    • Main ESG Portal
      • 198 TB of data at four locations
      • 1,150 datasets
      • 1,032,000 files
      • Includes the past 6 years of joint DOE/NSF climate modeling experiments
      • Downloads to date: 49 TB, 176,000 files
    • CMIP3 (IPCC AR4) ESG Portal
      • 35 TB of data at one location
      • 74,700 files
      • Generated by a modeling campaign coordinated by the Intergovernmental Panel on Climate Change
      • Data from 13 countries, representing 25 models
      • Downloads to date: 387 TB, 1,300,000 files, 500 GB/day (average)
    • 400 scientific papers published to date based on analysis of CMIP3 (IPCC AR4) data
    • ESG usage: over 500 sites worldwide
    • [Chart: ESG monthly download volumes]
    Globus
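The two "Downloads to date" entries on the Earth System Grid slide give both a volume and a file count, so a mean downloaded file size follows directly. The sketch below assumes decimal units (1 TB = 10^12 bytes), which the slide does not specify, and also shows how many days of downloading at the quoted 500 GB/day average would add up to 387 TB.

```python
# Assumption: decimal units, since the slide does not say which it uses.
MB = 10**6
GB = 10**9
TB = 10**12

# (volume in bytes, number of files) for the two "Downloads to date" entries.
downloads = [
    (49 * TB, 176_000),
    (387 * TB, 1_300_000),
]

# Mean size of a downloaded file, in MB, for each entry.
mean_file_mb = [volume / files / MB for volume, files in downloads]

# Days of downloading at the quoted 500 GB/day average that equal 387 TB.
days_at_average_rate = 387 * TB / (500 * GB)

for (volume, files), size in zip(downloads, mean_file_mb):
    print(f"{volume / TB:.0f} TB over {files:,} files: about {size:.0f} MB per file")
print(f"387 TB at 500 GB/day: {days_at_average_rate:.0f} days")
```

Both entries work out to a few hundred megabytes per downloaded file, which gives a sense of the granularity at which climate model output was being fetched.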
  • 19. SCEC Community Modeling Environment: a collaboratory for system-level earthquake science [architecture diagram]
    • KNOWLEDGE ACQUISITION
      • Acquisition Interfaces: dialog planning, pathway construction strategies
      • Pathway Assembly: template instantiation, resource selection, constraint checking
    • KNOWLEDGE REPRESENTATION & REASONING
      • Knowledge Server: knowledge base access, inference
      • Translation Services: syntactic & semantic translation
    • DIGITAL LIBRARIES
      • Navigation & Queries: versioning, replication
      • Mediated Collections: federated access
    • GRID
      • Pathway Execution: policy, data ingest, repository access
      • Grid Services: compute & storage management, security
    • Knowledge Base
      • Ontologies: curated taxonomies, relations & constraints
      • Pathway Models: pathway templates, models of simulation codes
    • Other labels: Pathway Instantiations; Code Repositories; Data & Simulation Products; Data Collections; FSM, RDM, AWM, SRM; Storage; Computing; Users
    Globus
  • 20. Seismic Hazard Analysis
    • Definition: maximum intensity of shaking expected at a site during a fixed time interval
    • Example: national seismic hazard maps
      • Intensity measure: peak ground acceleration
      • Interval: 50 years
      • Probability of exceedance: 2%
    (http://geohazards.cr.usgs.gov/eq/) Globus
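The slide's example (2% probability of exceedance in 50 years) is often restated as a return period. The sketch below does that conversion under the common assumption that exceedances follow a Poisson process with a constant annual rate; the slide itself does not state this model, so treat it as an illustration rather than the method behind the maps.

```python
import math

# Values from the slide's example.
p_exceed = 0.02          # probability of at least one exceedance
interval_years = 50.0    # exposure interval

# Assumption: exceedances form a Poisson process with constant rate, so
# P(at least one in T years) = 1 - exp(-rate * T).
annual_rate = -math.log(1.0 - p_exceed) / interval_years
return_period_years = 1.0 / annual_rate

print(f"Annual exceedance rate: {annual_rate:.6f} per year")
print(f"Return period: about {return_period_years:.0f} years")
```

Under this assumption, "2% in 50 years" corresponds to a return period of roughly 2,475 years.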
  • 21. SCEC Computations & Grid
    • Prepare input to Pathway2 wave propagation code
    • Pathway2PGV converts output into hazard map
    • Map is visualized
    [Diagram labels: SDSC, USC, SCEC, PSC, TeraGrid, ISI; 12 CPUs, 1,700 CPUs, 1,200 CPUs, 1 CPU, 4 CPUs] Globus
  • 22. Children’s Oncology Grid and MEDICUS Globus
  • 23. Grid Projects in the US [diagram labels: Community; Service Provider; Content; Services; Resources; Resource Provider; Software Providers]
  • 24. Software Providers
    • Globus [GT4.2 released July 2, 2008]
      • GRAM, GridFTP, MDS, RLS, DRS, …
      • GSI, GridShib, MyProxy, …
      • GridWay (Spain), OGSA-DAI (UK), Introduce, …
    • Condor
    • MPI-G, Swift, Pegasus, Taverna (UK), Kepler
    • caBIG: e.g., Introduce
    • Virtual Data Toolkit (includes VOMS [Italy], …)
    • SRB, iRODS, MyCluster, …
    Globus
  • 25. Virtual Data Toolkit (VDT)
    • Software release process: development & testing
    • VDT components over time [chart]
    • Built for 15 Linux versions
    Globus
  • 26. Creating Services: Introduce and gRAVI
    • Introduce
      • Define service
      • Create skeleton
      • Discover types
      • Add operations
      • Configure security
    • Grid Remote Application Virtualization Infrastructure (gRAVI)
      • Wrap executables
    [Diagram labels: Index service; Repository Service; Introduce; Container; Appln Service. Steps: Create; Store; Advertise; Discover; Invoke, get results; Transfer GAR; Deploy]
    Ohio State University and Argonne/U.Chicago
    Globus
  • 27. Composing Services Globus
  • 28. Service Discovery: Registries Globus
  • 29. Challenges
    • Conflicting missions
    • Sustainability
    • Discipline science pull
    [Diagram labels: Communities; Service Provider; Content; Services; Resources; Resource Provider; Software Providers]
  • 30. The Future
    • NSF eXtreme Digital (XD) solicitation
      • Aka “TeraGrid III”
    • DOE, NIH, etc.—what do they want?
    • International cooperation