DataGrid

REPORT ON CURRENT GRID TESTBEDS STATE OF THE ART

Document identifier: DataGrid-09-D9.2-0902-1_4
Date: 01/02/2002
Work package: WP09
Partner(s): ESA, KNMI, IPSL, RAL
Lead Partner: ESA
Document status: APPROVED
Deliverable identifier: DataGrid-D09.2

Abstract: EU-DataGrid Project Work Package 9 Deliverable Document D9.2: a survey of the current state of the art of national and international organisations, projects, software tools and testbeds for the development of global, Petabyte-scale High-Performance Metacomputing Grids, Collaborative Environments and e-Science infrastructures.

IST-2000-25182 PUBLIC 1 / 56
Delivery Slip

  From:        J. Linford (ESA), 23/01/2002
  Verified by: V. Breton (CNRS), 28/01/2002
  Approved by: PTB (CERN), 28/01/2002

Document Log

  Issue  Date        Comment                              Author
  1_0    21/08/2001  First draft                          J. Linford
  1_1    28/08/2001  Updated                              J. Linford
  1_2    05/12/2001  Internal review                      J. Linford; reviewed by Luigi Fusco; comments from IPSL and KNMI
  1_3    14/01/2002  Review by DataGrid WPs               Comments from V. Breton, S. Du, G. Romier (moderator: V. Breton)
  1_4    21/01/2002  Review by DataGrid WPs (2nd cycle)   Comments from V. Breton, S. Beco (Datamat)

Document Change Record

  Issue  Item              Reason for Change
  1_1    All sections      Structure reorganised, new material, other improvements
  1_2    Various sections  New material on EO/ES projects contributed by Joost van Bemmelen (ESA); updated after feedback to internal review
  1_3    Various sections  Updated according to comments and feedback from the reviewers; some new information added in Section 7
  1_4    Various sections  Second review cycle
Files

  Software Products: Word
  User files: DataGrid-09-D9.2-0902-1_4-TestbedsReport.doc
  [http://edmsoraweb.cern.ch:8001/cedar/doc.info?document_id=336289]
CONTENT

1. INTRODUCTION
   1.1. OBJECTIVES OF THIS DOCUMENT
   1.2. APPLICATION AREA
   1.3. APPLICABLE DOCUMENTS AND REFERENCE DOCUMENTS
   1.4. DOCUMENT AMENDMENT PROCEDURE
   1.5. TERMINOLOGY
2. EXECUTIVE SUMMARY
3. GRID APPLICATION DRIVERS
   3.1. THE NATURE OF GRID TECHNOLOGY
   3.2. HIGH ENERGY PHYSICS
        3.2.1. ATLAS
        3.2.2. ALICE
        3.2.3. CMS
        3.2.4. LHCb
   3.3. BIOLOGY
   3.4. EARTH OBSERVATION
   3.5. OTHER APPLICATION AREAS
4. GRID ORGANIZATIONS WORLDWIDE
   4.1. GLOBAL GRID FORUM
   4.2. USA NATIONAL PARTNERSHIPS
        4.2.1. EOT/PACI
        4.2.2. DOE / ASCI DisCom2
        4.2.3. CAVERN
   4.3. EUROPEAN PROJECTS AND FORUMS
        4.3.1. DataGrid Consortium
        4.3.2. EUROGRID
        4.3.3. European Grid Forum
        4.3.4. UNICORE FORUM
        4.3.5. GÉANT
        4.3.6. TERENA
        4.3.7. Scandinavia
        4.3.8. Ireland
        4.3.9. UK e-science programme
        4.3.10. GridPP
        4.3.11. CrossGrid
   4.4. ASIA
        4.4.1. Asia-Pacific Grid
        4.4.2. Grid forum Korea
   4.5. CROSS-CONTINENTAL PROJECTS
        4.5.1. DataTag
5. BASIC GRID BUILDING TOOLS
   5.1. GLOBUS
   5.2. CONDOR
   5.3. UNICORE
   5.4. GRID-ENGINE
   5.5. LEGION
        5.5.1. Architecture
   5.6. CACTUS COMPUTATIONAL TOOLKIT
   5.7. NETSOLVE
   5.8. NEOS
   5.9. COLLABORATIVE AUTOMATIC VIRTUAL ENVIRONMENTS
   5.10. VISUALIZATION TOOLKIT
   5.11. NLANR GRID PORTAL DEVELOPMENT KIT
   5.12. OTHER GRID MIDDLEWARE TOOLS
6. SURVEY OF CURRENT GRID TESTBEDS
   6.1. USA NATIONAL PROGRAMMES
        6.1.1. Terascale Computing System
        6.1.2. Distributed Terascale Facility
        6.1.3. Access Grid
        6.1.4. SC Grid
        6.1.5. DOE Science Grid
        6.1.6. DOE / ASCI DisCom2
        6.1.7. Information Power Grid
        6.1.8. NEESGrid
        6.1.9. IGRID
        6.1.10. Particle Physics Data Grid
        6.1.11. Grid Physics Network (GriPhyN)
        6.1.12. GRADS
        6.1.13. Bandwidth to the World
        6.1.14. CAVE / IMMERSADESK
   6.2. EUROPEAN PROGRAMMES
        6.2.1. EU DataGrid
        6.2.2. EuroGrid
        6.2.3. CROSSGRID
        6.2.4. INFN Grid
        6.2.5. EPSRC E-Science Testbed
        6.2.6. Grid-Ireland
        6.2.7. CENTURION
        6.2.8. GEMSviz
        6.2.9. Distributed ASCI Supercomputer (DAS)
   6.3. OTHER COUNTRIES
        6.3.1. Scandinavia
        6.3.2. Russia
        6.3.3. EcoGrid – EconomyGRID
7. EARTH SCIENCE / SPACE PROGRAMMES FOR GRID
   7.1. ESA SPACEGRID
   7.2. CDF – CONCURRENT DESIGN FACILITY
   7.3. ASTROVIRTEL
   7.4. EGSO
   7.5. CEOS
   7.6. ISS VIRTUAL CAMPUS
   7.7. ASTROPHYSICAL VIRTUAL OBSERVATORY
   7.8. ASTROGRID
   7.9. NATIONAL VIRTUAL OBSERVATORY
   7.10. THE VIRTUAL SOLAR OBSERVATORY
   7.11. GRIDLAB
   7.12. MM5
   7.13. TARDIS
   7.14. NEPH
   7.15. THE EARTH SYSTEM GRID
   7.16. AVANT GARDE
8. CONCLUSION
1. INTRODUCTION

1.1. OBJECTIVES OF THIS DOCUMENT

This deliverable is closely associated with WP9 Task 9.2, described as follows [A2]:

Survey and study of the different projects and testbeds connected with EO-related activities, to obtain a view of the state of the art in order to plan for improvements, development and reuse.

This survey will focus on the main topics involved in computational Grid activity:

• Application development, to gain a better understanding of the middleware already available for developing Grid-aware applications
• Workload management, to review middleware and testbeds already in use or under evaluation
• Data management, to review middleware and data access standards compliant with the EO environment
• Network management, to understand the use of the different protocols (QoS improvement, IPv6 features, etc.) that could match or impact EO application development

The main focus will be on the middleware components and all the features expected of them for EO Grid-aware applications, such as: resource, data and processing management, security and accounting, billing, communication, and monitoring. This will also address relations and intersections with US Grid projects and initiatives.

In addition, this document describes the progress and current status (at the time of writing) of the DataGrid testbeds and middleware.

1.2. APPLICATION AREA

The main application area is Computer Science: research and development of grid metacomputing testbeds and the building of computational grid infrastructures. Grid can be summarised as "flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources" [R20].

1.3. APPLICABLE DOCUMENTS AND REFERENCE DOCUMENTS

Applicable documents

[A1] DataGrid Project Quality Plan [DataGrid-12-D12.1-0101-2-0]
[A2] DataGrid Project Programme Annex 1 "Description of Work"

Reference documents

[R1] Report on the INFN-GRID Globus evaluation, DataGrid WP1 Report by INFN, March 26, 2001
[R2] DataGrid WP9 Task00 Activity Report, ESA-ESRIN, March 3, 2001
[R3] DataGrid Deliverable 8.1, Planning specification of Grid services
[R4] DataGrid Deliverable 9.1, EO application requirements specification for Grid
[R5] DataGrid Deliverable 10.1, Bio-Informatics DataGrid requirements
[R6] The GrADS Project: Software Support for High-Level Grid Application Development, F. Berman, A. Chien, K. Cooper, J. Dongarra, I. Foster, D. Gannon, L. Johnsson, K. Kennedy, C. Kesselman, J. Mellor-Crummey, D. Reed, L. Torczon, R. Wolski. International Journal of High Performance Computing, Winter 2001 (Volume 15, Number 4)
[R7] Performance Contracts: Predicting and Monitoring Grid Application Behavior, F. Vraalsen, R.A. Aydt, C.L. Mendes, and D.A. Reed. Grid Computing – GRID 2001: Proceedings of the 2nd International Workshop on Grid Computing, Denver, CO, November 12, 2001, Springer-Verlag Lecture Notes in Computer Science, to appear
[R8] Real-time Performance Monitoring, Adaptive Control, and Interactive Steering of Computational Grids, J.S. Vetter and D.A. Reed. International Journal of High Performance Computing Applications, Winter 2000 (Volume 14, No. 4), pp. 357-366
[R9] The Grid: Blueprint for a New Computing Infrastructure. Ed. I. Foster, C. Kesselman, Morgan Kaufmann, July 1998
[R10] NSF / EPSRC Grid workshop [http://www.isi.edu/us-uk.gridworkshop/]
[R11] P. Anderson, A. Scobie, Large Scale Linux Configuration with LCFG, Division of Informatics, University of Edinburgh [http://www.dcs.ed.ac.uk/home/paul/publications/ALS2000/index.html]
[R12] A. Samar, H. Stockinger, Grid Data Management Pilot (GDMP): A Tool for Wide Area Replication [http://home.cern.ch/a/asamar/www/grid/gdmp_ai2001.ps]
[R13] C. Anglano, S. Barale, L. Gaido, A. Guarise, S. Lusso, A. Werbrouck, An accounting system for the DataGrid project – Preliminary proposal [http://www.gridforum.org/Meetings/GGF3/Papers/ggf3_11.pdf]
[R14] DataGrid D1.2, Definition of architecture, technical plan and evaluation criteria for scheduling, resource management, security and job description [http://server11.infn.it/workload-grid/docs/DataGrid-01-D1.2-0112-0-3.doc]
[R15] EU DataGrid Project D1.1, Report on Current Technology [http://server11.infn.it/workload-grid/docs/DataGrid-01-TED-0102-1_0.doc]
[R16] EU DataGrid Project D2.1, Report on Current Technology [In preparation]
[R17] EU DataGrid Project D3.1, Current Technology Evaluation [In preparation]
[R18] EU DataGrid Project D4.1, Report on Current Technology [http://hep-proj-grid-fabric.web.cern.ch/hep-proj-grid-fabric/Tools/DataGrid-04-TED-0101-3_0.doc]
[R19] R-GMA: Relational Information Monitoring and Management System, User Guide Version 2.1.1, EU DataGrid Project WP3 relational team, October 10, 2001 [http://datagrid-wp8.web.cern.ch/DataGrid-WP8/Testbed1/Doc_Assessment/guide.pdf]
[R20] I. Foster, C. Kesselman, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, Intl J. Supercomputer Applications, 2001 [http://www.globus.org/research/papers/anatomy.pdf]
[R21] Jason S. King, Parallel FTP Performance in a High-Bandwidth, High-Latency WAN [http://www.llnl.gov/asci/discom/sc2000_king.html]
[R22] G. Cancio, S. Fisher, T. Folkes, F. Giacomini, W. Hoschek, D. Kelsey, B. Tierney, DataGrid Architecture Version 2, EU DataGrid project working document, July 2001 [http://press.web.cern.ch/grid-atf/doc/architecture-2001-07-02.pdf]
[R22] L. Childers, T. Disz, R. Olson, M. Papka, R. Stevens, T. Udeshi, Access Grid: Immersive Group-to-Group Collaborative Visualization, Mathematics and Computer Science Division, Argonne National Laboratory, and Computer Science Department, University of Chicago [http://www-fp.mcs.anl.gov/fl/publications-electronic-files/ag-immersive-821.pdf]
[R23] William Johnston, Horst D. Simon, DOE Science Grid: Enabling and Deploying the SciDAC Collaboratory Software Environment, proposal paper

1.4. DOCUMENT AMENDMENT PROCEDURE

This document is under the responsibility of ESA/ESRIN. Amendments, comments and suggestions should be sent to the person in charge of WP9 Task 9.2. The document control and change procedure is detailed in the DataGrid Project Quality Plan [A1].

1.5. TERMINOLOGY

Definitions

AFS     Andrew File System
ALICE   A Large Ion Collider Experiment
AMS     Archive Management System
AOD     Analysis Object Data
ASP     Application Service Provider
CORBA   Common Object Request Broker Architecture
DB      Database
DNA     Deoxyribonucleic Acid
EO      Earth Observation
EU      European Union
ESD     Event Summary Data
GOME    Global Ozone Monitoring Experiment
GBP     Great Britain Pound
HEP     High Energy Physics
HENP    High Energy Nuclear Physics
HPC     High Performance Computing
IETF    Internet Engineering Task Force
IPv6    Internet Protocol Version 6
JPL     Jet Propulsion Laboratory
IST     European Union Information Society Technologies programme
LDAP    Lightweight Directory Access Protocol
MPI-2   Message Passing Interface version 2
PKI     Public Key Infrastructure
PM9     Project Month 9
PNO     Public Network Operator
RFC     Request For Comment
RTD     Research and Technological Development
SSL     Secure Sockets Layer
STM     Synchronous Transfer Mode
UK      United Kingdom
UKP     United Kingdom Pound
VPN     Virtual Private Network
VR      Virtual Reality

Glossary

Catalogue   Container of product descriptors and product collections
GridFTP     Grid File Transfer Protocol
JINI        Java network framework for distributed computing
Metadata    Information about the actual data contained in a dataset, such as product type, date and time, file format, etc.
TAG         ALICE collision event information object data
X.509       Internet standard for security

Abbreviations

KB   1024 bytes
MB   Mega Bytes = 10^3 KB
GB   Giga Bytes = 10^6 KB
TB   Tera Bytes = 10^9 KB
PB   Peta Bytes = 10^12 KB
US   United States of America
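The mixed binary/decimal unit definitions in the Abbreviations table above can be expressed directly as constants (a minimal sketch; the constant names are ours, only the definitions come from the table):

```python
# Unit definitions exactly as given in the Abbreviations table:
# 1 KB = 1024 bytes; larger units are decimal multiples of KB.
KB = 1024            # bytes
MB = 10**3 * KB
GB = 10**6 * KB
TB = 10**9 * KB
PB = 10**12 * KB

# Under these definitions a Petabyte is slightly larger than 10^15 bytes.
print(PB)  # 1024000000000000
```

Note that under this convention each unit is exactly 1000 times the previous one, with only the KB carrying the binary factor of 1024.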
2. EXECUTIVE SUMMARY

This document satisfies the formal deliverable part of DataGrid Project Work Package 9 Task 9.2, "EO related middleware components state-of-the-art", in which the focus is on the current status of grid testbeds in general (i.e. not just those used for Earth Observation applications). It is a survey of the current state of the art of grid testbeds worldwide.

Recently, a large number of organisations in developed countries throughout the world have begun advanced research and development into emerging grid concepts and technology, a relatively new topic in distributed, high-performance and collaborative computing. The technology is generating a large amount of global interest and activity due to its innovative approach to providing ubiquitous, high-capacity computing to help solve today's increasingly demanding data- and processing-intensive scientific and RTD computing problems.
To present the reader with a clear picture of the state of global research in all areas and at different stages of development in this rapidly progressing field, this report is structured as follows:

Section 1: Introduction
Section 2: Executive summary (this section)
Section 3: The nature of grid concepts and technology and the need for grids, focusing on the applications
Section 4: Worldwide survey of organisations involved in grid research, focusing on the organisations and projects currently underway
Section 5: Survey of existing software tools used as a basis for building grids, focusing on the collections of software and grid middleware tools being used
Section 6: Survey and current state of the art of the technology, focusing on the grid testbeds currently operational or being constructed
Section 7: Specific Earth Observation, Space and Astronomy projects where the application of grid technology may offer significant benefits
Section 8: Conclusion

Note: in some cases the distinction between grid organisations and their testbeds is not easily recognizable; we have aimed to describe first the organisations (Section 4) and then the testbeds (Section 6), and in some cases there is inevitably some overlap between the two.
3. GRID APPLICATION DRIVERS

Here we describe the major scientific research and application areas which are calling for the development of computational grids.

3.1. THE NATURE OF GRID TECHNOLOGY

The drive since the mid-eighties to develop HPC computing techniques has resulted in an increasing number of high-performance supercomputing centres being created, regionally distributed all over the world. At that time, the goal was to master the art of using a large number of co-operating processors to solve a computational problem. The need has subsequently arisen for a uniform way to link up HPC and data centres in separate organisations to provide an even more powerful, large-scale, network-distributed "virtual" computing machine. Today, the goal is to master the art of using co-operating heterogeneous systems and resources. Hence both the paradigm and the terminology are borrowed from the national (and in many cases international) electrical power grids.

This need appears to be fuelled primarily by research in climatology, where large amounts of data and computation are required, and in high-energy physics, where particle detectors generate vast quantities of data which need to be analysed within certain time limits. New experiments and detectors soon to come online will generate even larger quantities of data, on a scale several orders of magnitude greater than was previously possible. Although the need in Biology and Earth Science applications is only slightly less urgent, these areas of scientific research and the capability of existing services will gain substantial advances by making use of the techniques and facilities developed over the next decade.
According to [R9] there are five classes of applications for which grids are relevant:

• Distributed supercomputing, to solve very large problems in multi-physics and multi-systems
• High-throughput, to solve large numbers of small tasks
• On-demand, to meet peak needs for computational resources
• Collaborative, to connect people
• Data-intensive, for coupling distributed data resources

3.2. HIGH ENERGY PHYSICS

Four major HEP experiments of the CERN LHC (Large Hadron Collider) are currently demanding Grid computing facilities. The major requirements of the experiments have been identified and are collected in [R3].

3.2.1. ATLAS

There are some 1800 ATLAS collaborators distributed across more than 100 institutes all over the world. Roughly 1000-1500 physicists are expected to use grid computing resources to analyse event data at the different processing levels, and roughly 150 users will need to run jobs simultaneously. Total resources of ~2000 kSI95 of CPU, ~20 PB of tape storage and ~2600 TB of disk storage are foreseen in order to store the data and perform the several different levels of data processing needed to reconstruct the particle collision events.
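As a rough illustration of what the ATLAS figures above imply per active user, the quoted totals can be divided out in a short back-of-envelope sketch (all variable names are ours; only the numbers come from the text, and the even division is of course a simplification):

```python
# Back-of-envelope division of the ATLAS resource figures quoted above.
# Names are illustrative; the numbers are from the text.
simultaneous_jobs = 150      # users expected to run jobs at the same time
tape_pb = 20                 # ~20 PB of tape storage foreseen
disk_tb = 2600               # ~2600 TB of disk storage foreseen

disk_per_job_tb = disk_tb / simultaneous_jobs
tape_per_job_tb = tape_pb * 1000 / simultaneous_jobs  # 1 PB = 1000 TB here

print(f"~{disk_per_job_tb:.1f} TB disk, ~{tape_per_job_tb:.0f} TB tape per simultaneous job")
```

Even shared evenly, each simultaneous job corresponds to well over ten terabytes of disk, which gives a sense of why dedicated grid middleware for data management is required.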
3.2.2. ALICE

The ALICE (A Large Ion Collider Experiment) experiment at the CERN (European Organisation for Nuclear Research) Large Hadron Collider (LHC) will produce data in unprecedentedly large quantities, of the order of several PB (2×10^15 bytes) per year, starting in 2007.

3.2.3. CMS

The CMS experiment is expected to begin producing data by 2006. The CMS detector is one of the two general-purpose detectors at the LHC accelerator. The worldwide CMS collaboration currently consists of some 1800 scientists in 144 institutes, divided over 31 countries. The 15 million detector elements in the CMS detector will produce roughly 100 Mb/sec of raw data, to be refined in subsequent processing levels using a multi-tiered data distribution and processing chain.

3.2.4. LHCb

Grid activities within LHCb focus on using Monte Carlo techniques for event reconstruction and analysis, and will take place within a multi-tiered worldwide processing scheme consisting of Tier-1, Tier-2 and selected Tier-3 centres, including analyses done on personal workstations (Tier-4). In the processing chain, raw data is used to reconstruct the event physical quantities, such as tracks, energy clusters and particle ID, which form the basis of the ESD data. This is then processed to determine momentum 4-vectors, locate vertices, reconstruct invariant masses, etc., to produce AOD and TAG files. Using the multi-tiered approach, over one year it is expected that 400 TB of raw data, 200 TB of ESD data, 70 TB of AOD and 2 TB of TAG data will be processed.

3.3. BIOLOGY

The human genome project is well known for the very demanding information-processing task needed to decode and map out the vast human DNA gene sequence. Recent developments have only been possible through the coordinated collaboration of many organisations and institutions on a global scale.
In future, a processing task of such proportions is expected to be carried out easily and quickly using the grid middleware, infrastructures and protocols expected to emerge as a result of ongoing international grid research projects. Biomedical research is set to become increasingly reliant on high-performance computing capabilities and on the availability of a large number and variety of high-volume datasets. Current molecular modelling programs can take up to 2 months of dedicated supercomputer time to perform an end-to-end simulation [R4] of a complex protein reaction which takes just 20 µs to occur in real life. There are many other areas of bio-science which can similarly benefit from these developments.

3.4. EARTH OBSERVATION

There are currently vast amounts of data collected and archived from satellite and airborne instruments available for analysis in a wide variety of Earth Science application areas. Scientific satellite data have been collected for the past thirty years, and national and international space agencies have organized the means for the scientific community to use these data. Data volumes are increasing drastically, making their duplication difficult. Ongoing research programmes and advances in science will continue to deploy new, increasingly specialized instruments to add to the current data collections. Complex modelling processes, processing, analysis and data-merging algorithms continue to make demands on computing capability and performance. Many research and information programmes are being carried out on a global scale. For instance, monitoring of ozone in the atmosphere needs to be carried out by analysing and modelling changes in ozone gas concentrations in time-lapse sequences, globally, in three dimensions, over a continuous five-year period. Earth Science programmes for environmental disaster monitoring, control and prevention require guaranteed
availability of processing power on demand to produce results in a timely manner. Furthermore, the Earth Science community is both multi-disciplinary and distributed worldwide, making it a prime candidate for the uptake of grid infrastructures and fabrics to enable collaboration between researchers and operational centres.

3.5. OTHER APPLICATION AREAS

There are a large number of scientific and commercial endeavours which, either now or in the future, need to analyse very large datasets with a widely distributed user base. One example is the field of numerical optimization, described in Section 5.8, in which grid computing principles have been successfully applied. Examples of other applications include:

- The Human Brain Project
- Automated astronomical scans
- Geophysical data
- Climatology
- Meteorology: nowcasting and short-term forecasting
- Crystallography data
- Banking and financial analysis
- Investigations into consumer trends and marketing
- Defence
- Industrial mold-filling simulation using an internationally distributed software component architecture
- Investigating Einstein's theory of space-time: colliding black holes and neutron stars
- Metacomputing and collaborative visualization for molecular modelling
- Maximum-likelihood analysis of phylogenetic data
- Construction of a numerical wind tunnel based on a design procedure: from aircraft geometric definition to aircraft flow solutions
- Parallel computation of high-speed train aerodynamics
- Remote visualization of electron microscopy data
- Telemanufacturing via international high-speed networks
- Architectural walk-throughs coupled with parallel lighting simulation
- Optimisation and design search for engineering
- Electromagnetic compatibility in advanced automotive design
4. GRID ORGANIZATIONS WORLDWIDE

4.1. GLOBAL GRID FORUM

http://www.globalgridforum.org/

A result of merging the Grid Forum, the European Grid Forum (eGrid) and the Grid community in Asia-Pacific, the Global Grid Forum (GGF) is a community-based forum of individual researchers and practitioners working on distributed computing "grid" technologies. It focuses on the promotion and development of grid technologies and applications via the development and documentation of "best practices", implementation guidelines and standards, with an emphasis on rough consensus and running code. The stated goals of the GGF are to:

- Facilitate and support the creation and development of regional and global computational grids
- Address architecture, infrastructure, standards and other technical requirements
- Educate and bring grid awareness to the scientific community, industry, government and the public
- Facilitate the application of grid technologies within educational, research, governmental, healthcare and other industries
- Provide a forum for exploration of computational grid technologies, applications and opportunities
- Exercise influence on US non-profit corporations to accomplish its charitable, scientific and educational purposes

Participants come from over 200 organizations in over 30 countries, with financial and in-kind support coming from companies and institutions. The forum organises three annual meetings in locations around the world, and is organized into several working groups.
The draft documents produced will form the basis of the future grid equivalent of RFCs (Requests For Comments, the documents which have become the formal method of defining Internet standards). The working groups are organised into the following areas:

- Information Services: grid object specification, grid notification framework, grid directory services, relational DB information service
- Security: infrastructure and certificate policy
- Scheduling and Management: resource management, scheduling dictionary, scheduling attributes
- Performance: fabric monitoring
- Architecture: grid protocols, JINI, accounting models
- Data: replication, GridFTP
- Applications & Models: applications & testbeds, user services, computing environments, advanced programming models, advanced collaborative environments

GGF is positioned for developing an integrated architecture from the published results and findings of different research testbeds. As a result of its central role, and a documentation process modelled after the IETF's RFC series, GGF or a closely related body is a likely candidate source of emerging formal grid standards.

4.2. USA NATIONAL PARTNERSHIPS

The National Science Foundation's Directorate for Computer and Information Science and Engineering created the Partnership for Advanced Computational Infrastructure (PACI) in 1997. PACI aims to provide the national scientific user community with access to high-end computing infrastructure and research. PACI is building the 21st century's information infrastructure to meet the increasing need for high-end computation and information technologies in the academic research community. The aim is to provide a "national grid" of interconnected high-performance computing systems. The PACI program provides support to two national partnerships:

- National Computational Science Alliance (Alliance): the University of Illinois' National Center for Supercomputing Applications (NCSA) is the lead site for this partnership of more than 50 academic, government and industry research partners from across the United States
- National Partnership for Advanced Computational Infrastructure (NPACI): the San Diego Supercomputer Center (SDSC) is the lead site for this partnership with 46 institutional members from 20 states across the country and four international affiliates

Section 6.1 gives more details about the testbeds being run by the programme.

4.2.1. EOT/PACI

http://www.eot.org

A joint programme of the Alliance and the PACI program for Educational Outreach and Training, the mission of this group is to develop human resources through innovative use of emerging technologies to understand and solve problems. EOT/PACI testbeds are described in Section 6.1.

4.2.2. DOE / ASCI DisCom2

http://www.cs.sandia.gov/discom/

The Department of Energy, National Nuclear Security Administration runs a programme for Advanced Simulation & Computing (ASCI).
The programme has worked for several years with the High-Performance Computing and Communications community to develop terascale computing technologies and systems to help ensure the safety and reliability of the USA nuclear weapons stockpile without full-scale testing. The Distance and Distributed Computing and Communication (DisCom2) project is part of the ASCI program (Section 6.1.6). The DisCom2 programme objectives are to:

- Develop technologies and infrastructure that will enable the efficient use of high-end computing platforms at a distance
- Increase the rate of adoption of computational modeling and simulation
- Create flexible distributed systems that can be configured to provide both capacity and capability computing

4.2.3. CAVERN
http://www.evl.uic.edu/cavern

CAVERN, the CAVE Research Network, is an alliance of industrial and research institutions equipped with CAVEs, ImmersaDesks and high-performance computing resources, interconnected by high-speed networks to support collaboration in design, training, education, scientific visualization and computational steering, in virtual reality. Supported by advanced networking at both the national and international level, the CAVE research network is focusing on Tele-Immersion: the union of networked virtual reality and video in the context of significant computing and data mining. These pioneering tools are described in Section 5.9 and the testbed sites in Section 6.1.14.

4.3. EUROPEAN PROJECTS AND FORUMS

4.3.1. DataGrid Consortium

http://www.eu-datagrid.org/

The European DataGrid Consortium consists of 21 partners, mostly research institutes, covering France, Italy, Finland, Norway, Sweden, Germany, Spain, the Czech Republic, the Netherlands, the United Kingdom and Hungary. The IST-funded project has the objective to "Start creating the ambient intelligence landscape for seamless delivery of services and applications in Europe relying also upon test-beds and open source software, develop user-friendliness, and develop and converge the networking infrastructure in Europe to world-class" [A2]. A very large membership makes this one of the largest EU scientific collaborations, building on the expertise gained in several successful large-scale projects, extending from the central hub of the CERN organization and spreading the technology beyond High Energy Physics to involve related application areas such as Earth Sciences and Biomedical Research. The work plan of the DataGrid project is given in the proposal technical annex [A2]. Work is divided among the 12 work packages, and periodic deliverables are scheduled.
The EU DataGrid testbeds are described in Section 6.2.1.

4.3.2. EUROGRID

http://www.eurogrid.org

EuroGrid is a grid application testbed funded by the IST programme. The project will establish a European domain-specific GRID infrastructure based on the UNICORE system (see also Sections 4.3.4 and 5.3). The project started in November 2000 with a duration of 3 years. The main partners are:

- CSCS (Swiss Center for Scientific Computing)
- DWD (Deutscher Wetterdienst, the German Meteorological Service)
- FZ Jülich (Research Centre Jülich, Germany)
- ICM (Warsaw University)
- IDRIS (Institute for Development and Resources in Intensive Scientific Computing, CNRS's national supercomputing centre)
- Parallab (Computational Science & High Performance Computing Laboratory of Norway)
- Manchester Computing
- EADS (European Aeronautic Defence and Space Company)
- Systemhaus (now T-Systems, Europe's No. 2 computer systems company)
- Fecit (Fujitsu European Centre for Information Technology)

EADS and Systemhaus represent industry, while Pallas and Fecit are in charge of developing the UNICORE-based grid middleware. The project will investigate the development of grids in four main application areas:

- Bio GRID (lead partner Warsaw University, http://biogrid.icm.edu.pl) will develop interfaces to enable chemists and biologists to submit work to HPC facilities via a uniform interface from their workstations, without having to worry about the details of how to run particular packages on different architectures
- Meteo GRID will adapt an existing weather-prediction code for on-demand localised weather prediction, as a first step towards a weather prediction portal
- Computer-aided Engineering (CAE) GRID will adapt two CAE applications to use grid services
- HPC Research GRID will make the computing resources of the HPC centres in EUROGRID available to other partners for use in the general HPC field, and interfaces to a number of important HPC applications will be developed

The project will focus on five important areas of development:

- efficient data transfer
- resource brokerage
- ASP services
- application coupling
- interactive access

The results will be integrated into operational software and evaluated by the application work packages. Some of them will use CORBA or MPI-2 for interfacing software modules, and Java for the development of simple and portable graphical interfaces. The EUROGRID testbeds are described in Section 6.2.2.

4.3.3. European Grid Forum

http://www.egrid.org

The European Grid Forum, EGrid, has merged with the Grid Forum and Asia-Pacific Grid efforts into one organisation called the Global Grid Forum. EGrid will continue its efforts on the European level, mainly in the area of pan-European testbeds.
It will no longer be called a "Forum": all the Forum-related parts of EGrid (i.e. the Working Groups) have migrated to the Global Grid Forum.

4.3.4. UNICORE FORUM

http://www.unicore.org

The UNICORE (UNified Interface to COmputer REsources) project was formulated and funded by the BMFT (Bundesministerium für Forschung und Technologie) as a 30-month activity by the main partners Deutscher Wetterdienst (DWD), FZ Jülich, Genias, Pallas and RZ Universität Stuttgart, starting on 1 July 1997 (see also Section 4.3.2). Additional partners were ZIB Berlin, RZ Universität Karlsruhe, HPC Paderborn and LRZ München. At the end of the project, a de-facto standardisation body for UNICORE was founded in 1999 with 21 members. The project aims to provide a unified
access to a set of computing resources located at various regional, industrial or national supercomputing centres. It aims to foster the use and development of UNICORE standards for accessing computers via the Internet for batch processing. UNICORE software and testbeds are described in Sections 5.3 and 6.2.2 respectively.

4.3.5. GÉANT

http://www.dante.net/geant/

Beginning with the X.25 network in 1999, collective management of the European research networks has evolved through EuropaNet, TEN-34 and TEN-155 to today's GÉANT, the Pan-European Gigabit Research and Education Network. This has only been possible through inter-governmental and industrial collaboration towards liberalisation of the European telecommunications market. The GÉANT architecture, which links 27 European National Research and Education Networks (NRENs), has a topology designed for high-speed sharing of network resources while satisfying increasing demands for service quality. Current network bandwidth utilization grows at a rate of 200% per year as a result of new, developing applications coming online to serve the needs of many different high-technology communities in Europe (for example tele-teaching, online collaboration and video conferencing). GÉANT is a European success story which lays down the procedures to allow rapid, coordinated growth of the European telecommunications research networks. GÉANT officially became fully operational on 1st December 2001.

4.3.6. TERENA

http://www.terena.nl/

TERENA, the Trans-European Research and Education Networking Association, was formed in October 1994 by the merger of RARE (Réseaux Associés pour la Recherche Européenne) and EARN (European Academic and Research Network) to promote and participate in the development of a high-quality international information and telecommunications infrastructure for the benefit of research and education.
It aims to ensure that the developing infrastructure is based on open standards and uses the most advanced technology available. TERENA carries out technical activities and provides a platform for discussion to encourage the development of a high-quality computer networking infrastructure for the European research community. An extensive and varied membership in more than 30 countries works to advance the combined interests of the following types of member organisations:

- governments and the Commission of the European Union: to promote networking development, to advise on policy issues relating to networking development and on regulatory matters, and to stimulate investment in research and education networking
- standards bodies: to ensure the development of open standards for services and the development of standards for new services
- telecommunications operators: to encourage the participation of PNOs in the research and education communities' pilot projects leading to new advanced services, to press for the provision of adequate international services at appropriate tariffs, and to encourage competition in the area of international services
- industry: to encourage participation in research networking in general and in pilot research networking projects, to provide feedback to industry on emerging developments in networking, and to stimulate European suppliers of products and services
TERENA is currently helping to coordinate, and to facilitate cross-fertilisation in, the ongoing developments of the European national research networks, such as:

- Gigabit-Wissenschaftsnetz (G-WiN), DFN, Germany
- SURFnet, the Netherlands
- SuperJANET4, UKERNA, the United Kingdom
- NORDUnet2
- European Education and Research Interconnects, DANTE
- The Amsterdam Internet Exchange

4.3.7. Scandinavia

http://www.nordu.net/

NORDUnet is the Nordic Internet highway to research and education networks in Denmark, Finland, Iceland, Norway and Sweden, and provides the Nordic backbone to the Global Information Society. NORDUnet has a 622 Mbit/s STM-4 connection to New York that carries traffic related to advanced research applications between NORDUnet member networks and the Abilene network (Internet2). The connection between NORDUnet's router in New York and Abilene also has STM-4 capacity, while the connection between New York and the StarTap in Chicago is 155 Mbit/s. The NORDUnet grid testbed is described in Section 6.3.1. See also GEMSviz, Section 6.2.8.

4.3.8. Ireland

http://www.grid-ireland.org/

In Ireland, the universities and other institutions of advanced education and research are represented in Government by the Higher Education Authority (HEA), and the computing systems of the institutions are interconnected by the HEAnet. The grid infrastructure in Ireland is organised and managed by Grid-Ireland, a management layer above the HEAnet. Grid-Ireland aims to provide a common grid infrastructure for hosting virtual organizations targeted at user communities of, for example, astrophysicists, geneticists or linguists. A pilot project began in October 1999 between Trinity College Dublin, University College Cork and the IT Centre, NUI Galway, and was extended a year later with a follow-on project, with the aim of establishing an Irish grid and carrying on research in grid systems.
Section 6.2.6 describes the Grid-Ireland testbed.

4.3.9. UK e-Science programme

http://www.escience-grid.org.uk
http://www.nesc.ac.uk/esi/

The UK e-Science programme has been founded to enhance global collaboration in key areas of science and to develop the next-generation infrastructure to enable it. The term "e-science" refers to the large-scale science that will increasingly be carried out through distributed global collaborations enabled by the Internet. The term is also meant to include:

- Science done using computer facilities: simulation and analysis (HPC)
- Science enabled by linking computation with measurement facilities
- Remote control of measurement and observation, e.g. robots
- Data management and knowledge discovery applied to scientific results
- Visualisation, animation and VR
- Science facilitated by using Internet technology
- Rapid electronic publishing, e.g. via the Web

Funding of £98M for a new UK e-Science programme was announced in November 2000, to be allocated to the ESRC, NERC, BBSRC, MRC, EPSRC, PPARC and CLRC research councils. CLRC will develop a new Teraflop-scale HPC system, and a core, cross-Council activity will develop and broker generic technology solutions and middleware which will also form the basis for new commercial e-business software. The e-Science Programme is overseen by a Steering Committee chaired by Professor David Wallace, Vice-Chancellor of Loughborough University; Professor Tony Hey, previously Dean of Engineering at the University of Southampton, is the Director of the e-Science Core Programme.

4.3.10. GridPP

http://www.gridpp.ac.uk/

GridPP is a collaboration of particle physicists and computing scientists from the UK and CERN which aims to build the infrastructure for integrated, collaborative use of high-end computers, networks, databases and scientific instruments owned and managed by multiple organizations. GridPP aims to test a prototype of the Grid for the Large Hadron Collider (LHC) project at CERN. The project will also work closely with the existing UK Particle Physics and e-Science programmes to help with the early uptake of the technology and the deployment of testbeds, and to disseminate the results of the research in the UK particle physics community.

4.3.11. CrossGrid

http://www.crossgrid.org

The CrossGrid project will develop, implement and exploit new Grid components for interactive compute- and data-intensive applications. The elaborated methodology, generic application architecture, programming environment and new Grid services will be validated and tested thoroughly on the CrossGrid testbed, with an emphasis on a user-friendly environment. The work will be done in close collaboration with the Grid Forum and the DataGrid project, to profit from their results and experience and to obtain full interoperability. This will result in the further extension of the Grid across eleven European countries. The new Grid components will target applications such as:

- Simulation and visualisation for surgical procedures
- Flooding crisis team decision support systems
- Distributed data analysis in high-energy physics
- Air pollution modelling combined with weather forecasting

More on the CrossGrid testbeds in Section 6.2.3.
4.4. ASIA

4.4.1. Asia-Pacific Grid

http://www.apgrid.org/

The Asia-Pacific Grid (ApGrid) provides Grid environments around the Asia-Pacific region. ApGrid is a meeting point for all Asia-Pacific HPCN researchers, and acts as a communication channel to the GGF and other Grid communities. Any Grid researcher can find international project partners through ApGrid.

4.4.2. Grid Forum Korea

http://www.gridforumkorea.org/

The Ministry of Information and Communication and KISTI (Korea Institute of Science and Technology Information) have established a national Grid baseline plan and have formed the Grid Forum Korea to lead domestic research and to participate in international Grid developments. The Forum provides an advisory committee backed up by a research institute, and provides support for working groups and interdisciplinary exchange among those engaged in related research fields. Working groups include applied engineering, basic sciences, industrial technology and information technology. There is an interest in:

- HPC and work related to network and middleware technology
- research in new fields
- improvement of the research environment and scalability for production

4.5. CROSS-CONTINENTAL PROJECTS

No grid initiative on a global scale would be complete without a transatlantic collaboration. A major aspect of international grid metacomputing is the linking of substantial HPC facilities on both sides of the Atlantic, in America and Europe, spearheaded by the need to distribute and analyse particle physics collider experiment data. Further into the future we may perhaps see such links extended to include the participation of Russian, Asian and Australian organisations and resources. Interest has already been raised within the larger national grid projects, and initiatives in the US are making moves to open up the grid network to include European nodes and vice-versa.
However, such initiatives have not yet reached full swing; they are either not yet well developed or not well publicized, so there is little information available on their current state. One collaboration is between the National Science Foundation's Distributed Terascale Facility and the UK e-Science programme, as a result of the Memorandum of Understanding (MOU) in place between the UK Engineering and Physical Sciences Research Council (EPSRC) and the NSF. A recent workshop [R10] discussed the creation of a high-speed transatlantic testbed. Access Grid has participating sites in Europe, and the NSF International Virtual Data Grid Laboratory (iVDGL) project, based on Globus and Condor, also has a global programme covering the Americas, Europe, Japan and Australia, to be used by GriPhyN, PPDG, DataGrid and other physics projects. The EU DataTAG project (2 years, starting in January 2002; partners CERN, PPARC, INFN, UvA) aims to provide high-bandwidth transatlantic network links on which to implement fast and reliable bulk data transfers between the EU and US Grids. There is also the International HENP Grid Coordination and Joint Development team (INTERGRID), initially involving DataGrid, GriPhyN and PPDG, as well as the national European Grid projects in
the UK, Italy, the Netherlands and France, who have agreed to foster collaborative efforts to manage and coordinate their activities to obtain a consistent, standards-based global Grid infrastructure.

4.5.1. DataTAG

http://www.datatag.org

The DataTAG project will create a large-scale intercontinental Grid testbed focusing on advanced networking issues and on interoperability between intercontinental Grid domains, hence extending the capabilities of each and enhancing the worldwide programme of Grid development. DataTAG will involve the DataGrid project, several national projects in Europe, and related Grid projects in the USA. The project will address the issues which arise in the sector of high-performance inter-Grid networking, including sustained and reliable high-performance data replication, end-to-end advanced network services, and novel monitoring techniques. It will also directly address the issues of interoperability between Grid middleware layers, such as information and security services. The advances made will be disseminated into each of the associated Grid projects. The project formally starts on 1st January 2002.
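To give a feel for the scale of such bulk transfers, the following sketch computes the idealised transfer time for one terabyte over a long-haul link. The link capacity used here is a hypothetical round figure of our own choosing, not a DataTAG specification, and protocol overhead and congestion are ignored:

```python
# Illustrative only: idealised bulk-replication time over a transatlantic
# link. The 2.5 Gbit/s figure is an assumption, not a DataTAG parameter.
size_tb = 1.0        # data to replicate, terabytes (10^12 bytes)
link_gbit = 2.5      # assumed usable link capacity, Gbit/s

seconds = size_tb * 1e12 * 8 / (link_gbit * 1e9)
print(f"{seconds:.0f} s (~{seconds / 60:.0f} min) per TB at {link_gbit} Gbit/s")
```

In practice, sustained throughput over such distances falls well short of the raw link rate, which is why DataTAG singles out sustained and reliable high-performance data replication as a research issue in its own right.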
5. BASIC GRID BUILDING TOOLS

In order to report on the testbeds, we first introduce the grid tools currently being used as basic building blocks: the tools from which the future grid platform protocols and middleware will emerge. Development of grid concepts, research into metacomputing and middleware development were under way for several years before the grid concept emerged as a definitive movement in both academic and commercial-sector research centres of excellence. As a result, today we find several software bases which have been developed and are currently available for use as the building blocks of tomorrow's grids. Here we describe a rather wide range of different software and tools which are currently being used and considered as the basis for developing testbeds.

Within this frame there are potentially many different notions, concepts and definitions of exactly what constitutes a grid. To fully examine this question is the subject of another document. Here we take the term to be as broad as the scope of the current definitive report on the collection of tools, methods and research brought together in the publication by Foster, Kesselman et al. [R9]. Some of these are basic grid middleware or toolkits (collections of individual tools), such as Globus, Condor, UNICORE, GridEngine and Legion; others, such as Cactus, NetSolve and NEOS, are grid environments and frameworks; and then there are more specific tools, such as CAVE and VTK, used in collaborative (visual) environments. We also briefly mention a few other interesting tools for building grids, in a list which can be expected to grow in the future. In addition to those tools which have been developed specifically with grids in mind (e.g. Globus), we have included other interesting stand-alone tools (e.g.
VTK), which are frequently used to bring additional functionality into the grid environment. Here we cannot hope to provide a fully comprehensive survey of all available tools; instead we are bound to focus on the most popular ones. In the last section we describe a number of miscellaneous individual tools for performing specific grid functions, although this list is by no means exhaustive.
5.1. GLOBUS http://www.globus.org/
Globus grid middleware is developed mainly by Argonne National Laboratory, the University of Chicago and the University of Southern California. Globus has developed a comprehensive set of tools for building infrastructure to support high-performance computation over wide-area networks. Emphasis is placed on using existing and emerging standards and protocols. Globus provides a comprehensive set of tools covering the main basic grid functional requirements, and a major factor in its popularity is that the code is platform-independent and made freely available as open source. These features combined make it suitable for deployment and dissemination to a large range of user communities worldwide. Globus aims to further develop grid standards and protocols while continuing to develop the software tools. Globus is being used in many projects and testbeds; in the case of DataGrid, new, more sophisticated tools and services will be added on to the existing open-source code base by European middleware developers, and useful, proven techniques, protocols, scripts and codes will be re-absorbed into the Globus base. We can expect to see new ideas and developments resulting from these collaborations implemented in Globus tools over the next few years.
5.2. CONDOR http://www.cs.wisc.edu/condor/
Condor has been developed to fill the need for a job submission, control and monitoring system for use in distributed parallel metacomputing architectures.
Condor forms an important part of the Globus toolkit, almost to the point of the two being interdependent. Globus uses Condor to schedule and
control parallel jobs. Condor layered over Globus provides strong fault tolerance, including checkpointing and migration, and job scheduling across multiple resources, serving as a "personal batch system" for the Grid. Condor on its own lacks an information service, without which it cannot locate resources on distant networks. In this way the two middleware tools are highly complementary, even though independently owned and developed.
5.3. UNICORE http://www.unicore.de/
Originally developed for the leading German supercomputer centers, UNICORE (UNiform Interface to COmputing REsources) provides a science and engineering GRID combining resources of supercomputer centers and making them available through the Internet. Strong authentication is performed in a consistent and transparent manner, and the differences between platforms are hidden from the user, thus creating a seamless HPC portal for accessing supercomputers, compiling and running applications, and transferring input/output data. The UNICORE GRID system consists of three distinct software tiers:
• Clients interacting with the user and providing functions to construct, submit and control the execution of computational jobs
• Gateways acting as points of entry into the protected domains of the HPC centers
• Servers that schedule and run the jobs on the HPC platforms that they control.
All components are written in Java, and the protocols between the components are also defined using Java mechanisms. The system emphasizes strong authentication and data security, relying on X.509 certificates and SSL. The server uses platform-specific incarnation databases to adapt to the multitude of current HPC systems, with the other components being designed for portability. A client interface enables the user to create, submit and control jobs from any workstation or PC on the Internet.
The client connects to a UNICORE gateway which authenticates both client and user before contacting the UNICORE job submission servers. The UNICORE client runs on MS Windows and Unix platforms. The desktop installation process includes the procedure to generate security certificates. The client is started by clicking on the icon (Windows) or from the Unix command line. The installation remains valid for 30 days. Although the client does not depend on the type of web browser, the process to automatically generate a certificate and the associated user environment has only been tested with Netscape 4.6 and 4.7. Extensions of the UNICORE software are planned in the areas of:
• Application-specific clients and servers
• Fast and reliable data transfer
• Resource brokering
• Application services
• Support for interactivity
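The three-tier flow described above (client → gateway → server) can be sketched as follows. This is an illustrative Python sketch, not UNICORE code: every class and method name here is hypothetical, and the trusted-certificate check merely stands in for the real X.509/SSL authentication.

```python
# Illustrative sketch of the UNICORE three-tier model (client -> gateway -> server).
# All names are hypothetical; the real components are Java-based and authenticate
# with X.509 certificates over SSL.

class Server:
    """Schedules and runs jobs on the HPC platform it controls."""
    def __init__(self, platform):
        self.platform = platform
        self.queue = []

    def submit(self, job):
        self.queue.append(job)
        return f"{job['name']} queued on {self.platform}"

class Gateway:
    """Point of entry into a protected HPC domain; authenticates callers."""
    def __init__(self, server, trusted_certs):
        self.server = server
        self.trusted_certs = trusted_certs

    def forward(self, cert, job):
        if cert not in self.trusted_certs:     # stand-in for X.509/SSL checks
            raise PermissionError("certificate not trusted")
        return self.server.submit(job)

class Client:
    """Constructs jobs and submits them through a gateway."""
    def __init__(self, cert):
        self.cert = cert

    def run(self, gateway, name):
        return gateway.forward(self.cert, {"name": name})

server = Server("T3E")
gateway = Gateway(server, trusted_certs={"alice-cert"})
print(Client("alice-cert").run(gateway, "cfd-run"))   # prints: cfd-run queued on T3E
```

A client presenting an unknown certificate is rejected at the gateway, never reaching the server, which mirrors the role of the gateway as the sole point of entry into the protected domain.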
5.4. GRID-ENGINE http://gridengine.sunsource.net/
The Sun Grid-Engine has been derived from the Gridware company headed by Wolfgang Gentzsch (formerly Genias Software). Gridware had three major assets which were interesting for Sun: a long-standing participation in grid projects since 1995, including Unicore, Autobench, Medusa, Julius, and Eroppa (mostly funded by the German Government or the CEC in Brussels); a specialized team of personnel with eight years' experience developing grid software and grid concepts; and the Grid Engine technology, formerly known as Codine, a distributed resource management software which enables transparent management, administration and usability of networked computers, as one single virtual computer, to deliver compute power as a network service. Grid is one of Sun's major investments for the next IT era, besides Massive Scale and Continuous Real-Time. Sun intends to fully support and collaborate with all other open grid solutions, and to integrate Grid Engine with the major Grid technologies currently developed and deployed in the Grid community, such as Globus, Legion, Punch and Cactus. There are many grid tools and services available or being developed in which Sun is interested: the HPC Cluster Tools, the iPlanet Grid Access Portal, the Solaris Resource and Bandwidth Manager, the Sun Management Center, and the software from the newly acquired LSC, Inc., for distributed storage management. The EPCC (Edinburgh Parallel Computing Center) is a Sun Center of Excellence in Grid Computing. It is the location of the UK National e-Science Center, which aims to build a UK e-Science Grid incorporating eight regional distributed computing centers.
Edinburgh uses Sun hardware and will evaluate software building blocks like Sun Grid Engine, the iPlanet Portal Server, Sun Management Center, HPC Cluster Tools, and Forte Workshop to build the next-generation Grid infrastructure. Through this infrastructure, Edinburgh will deliver compute power to, and exchange expertise with, its partners in research and industry all over the UK. Another example is the OSC (Ohio Supercomputer Center), which became a Sun Center of Excellence for High Performance Computing earlier this year. Together with Sun, OSC is building the Grid infrastructure which enables distributed computing, collaboration, and communication with other partners, e.g. Ohio State University, the Universities of Akron and Cincinnati, Nationwide Insurance, and Exodus. A further example is the Technical University of Aachen, Germany, which is a Sun Center of Excellence for Computational Fluid Dynamics. Among other objectives, the Center will be providing remote access to its large Sun system (which will grow to over 2 Teraflops) for researchers on the university campus, much like an ASP. One of their contributions is the enhancement of Grid Engine by adding a Grid Broker, using the software code available in the Grid Engine open source project. Similarly, Sun is involved in about a hundred ongoing or upcoming Grid activities and partnerships worldwide.
5.5. LEGION http://www.cs.virginia.edu/~legion
Legion is an object-based metasystems software project developed at the University of Virginia. The system is designed to interconnect millions of hosts and trillions of objects using high-speed network links, to give users the illusion of a single virtual computer with which to access distributed data and physical resources, such as digital libraries, physical simulations, cameras, linear accelerators, and video streams. The system will allow groups of users to construct shared virtual work spaces, to collaborate on research and exchange information.
Legion services support transparent scheduling, data management, fault tolerance, site autonomy, and a wide range of security options.
5.5.1. Architecture
Legion is designed as an adaptable, open and extensible object-based system which will allow applications to develop and interface their own components. Legion objects define the message format and high-level protocol for object interaction, but not the programming language or the communications protocol. Legion object classes are both object managers and policy makers, not just definers of instances. Objects are given system-level responsibility; classes create new instances, schedule them for execution, activate and deactivate them, and provide information about their current location to client objects that wish to communicate with them. Different classes of object are designed to support the functions of the distributed grid system. Core objects implement common services, defining the interface and basic functionality to support basic system services, such as object creation, naming and binding, activation, deactivation, and deletion. Core Legion objects provide the mechanisms to implement policies appropriate for their instances. Users may define and build their own class objects out of existing objects, and may modify and replace the default implementations of several useful functions. Legion objects are independent, active objects that reside in separate address spaces and communicate via non-blocking method calls that may be accepted in any order by the called object. Object class interfaces can be described in an interface description language (IDL). The state of a Legion object can be either active or inert. An inert object is stored in a persistent state together with information that enables the object to move to an active state. In the active state the object is a running process that responds to invocations of its member functions.
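The active/inert life cycle just described can be sketched as follows. This is a hypothetical Python illustration, not Legion's API: the dictionary here plays the role of the vault object (Legion's persistent store for inert state, described below), and all names are invented for the example.

```python
# Hypothetical sketch of the Legion active/inert object life cycle.
# A "vault" holds the persistent state of inert objects; activation turns
# the saved state back into a running object. Names are illustrative only.

vault = {}   # stand-in for a Legion vault object (persistent storage)

class LegionObject:
    def __init__(self, oid, state=None):
        self.oid = oid
        self.state = state or {}
        self.active = True

    def deactivate(self):
        """Save state to the vault; the object becomes inert."""
        vault[self.oid] = dict(self.state)
        self.active = False

    @classmethod
    def activate(cls, oid):
        """Recreate a running object from its vaulted state."""
        return cls(oid, state=vault[oid])

    def invoke(self, key, value):
        """Stand-in for a member-function invocation on an active object."""
        if not self.active:
            raise RuntimeError("inert objects must be activated first")
        self.state[key] = value

obj = LegionObject("obj-42")
obj.invoke("steps", 10)
obj.deactivate()                         # now inert: state lives in the vault
restored = LegionObject.activate("obj-42")
print(restored.state["steps"])           # prints: 10
```

The point of the sketch is the round trip: an inert object is pure saved state, and only an activated (running) object accepts invocations.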
Host objects running on the Legion computing resources create and manage processes for active Legion objects, allowing abstraction of the heterogeneity of different operating systems. Host objects also allow resource owners to manage and control their resources. Vault objects represent persistent storage for the purpose of maintaining the state of the inert objects. Context objects perform name mapping, allowing users to assign arbitrary high-level names which can coexist in multiple disjoint name spaces within Legion. Binding agents are objects that keep track of the binding between object instances and their executing processes. Binding agents can cache bindings and organize themselves in hierarchies and software combining trees in a scalable and efficient manner. Implementation objects allow other Legion objects to run as processes in the system. An implementation object typically contains machine code that is executed when a request to create or activate an object is made. An implementation object (or the name of an implementation object) is transferred from a class object to a host object to enable the host to create processes with the appropriate characteristics. The Legion testbeds are described in Section 6.2.7.
5.6. CACTUS COMPUTATIONAL TOOLKIT http://www.cactuscode.org/
The Cactus system, developed by a collaboration between AEI, WashU, NCSA and Palma, was originally designed as a modular, portable, easy-to-use, high-performance 3D tool for numerical analysis. It has been used successfully during the last three years for:
• Porting existing uniform grid codes to get parallelism, using MPI, parallel I/O and other functionality (e.g. mesh refinement) available through Cactus
• Developing portable parallel code, to run on machines as diverse as an Origin 2000, SP-2, T3E, or NT cluster, without having to write MPI message-passing code.
• Writing small, parallel applications easily (e.g. a wave equation), making use of things like the Cactus parameter parser and I/O libraries
• Starting collaborative projects
• Reusing existing codes and routines more easily
• Astrophysicists and Relativists wishing to use Cactus in their research
The version 4.0 public release has been largely rewritten and provides a general collaborative framework on a wide variety of platforms. Modules to solve specific problems can be written in Fortran or C and inserted into the Cactus infrastructure, thus providing tools for a variety of projects in computational science, including:
• MPI-based parallelism for uniform finite difference grids
• Fixed and adaptive mesh refinement (e.g. provided by DAGH)
• Access to a variety of architectures (SGI Origin, IBM SP, T3E, NT and Linux clusters, Exemplar, etc.)
• Different parallel I/O layers (e.g. HDF5 and others)
• Elliptic solvers (e.g. PETSc and others)
• Input/Output (FlexIO, HDF5, checkpointing)
• Metacomputing and remote computing and visualization tools (e.g. Globus)
• Visualization tools (e.g. Amira, IDL, and AVS)
5.7. NETSOLVE http://icl.cs.utk.edu/netsolve/
NetSolve provides software for building grid computing infrastructures using clients, agents and servers. When a NetSolve job is submitted by the end user, the client first contacts the agent for a list of capable servers; the client then contacts a server and sends the input parameters, and the server runs the appropriate service and returns the output parameters or an error status to the client. Users interact with the NetSolve grid infrastructure using a simplified client interface, with support for a wide variety of languages, including C, Fortran, Matlab, Mathematica, Java and Excel. The client interface handles the details of networking and distributed resource sharing and hides them from the end user.
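The submission sequence just described, in which the client asks an agent for capable servers and then calls one of them, can be sketched in miniature. This is not NetSolve code: the class and function names are hypothetical, and the real system adds load balancing and fault handling on top of this basic flow.

```python
# Sketch of the NetSolve submission flow: client -> agent (server lookup)
# -> server (service execution). All names are hypothetical.

class NsServer:
    def __init__(self, name, services):
        self.name, self.services = name, services

    def run(self, service, *args):
        """Run the requested service and return its output."""
        return self.services[service](*args)

class NsAgent:
    def __init__(self, servers):
        self.servers = servers

    def capable_servers(self, service):
        """Match the incoming request against registered server resources."""
        return [s for s in self.servers if service in s.services]

def ns_call(agent, service, *args):
    """Client-side call: query the agent, then contact a capable server."""
    servers = agent.capable_servers(service)
    if not servers:
        raise LookupError(f"no server offers {service!r}")
    return servers[0].run(service, *args)   # a real agent would balance load

agent = NsAgent([NsServer("ws1", {
    "dot": lambda a, b: sum(x * y for x, y in zip(a, b)),
})])
print(ns_call(agent, "dot", [1, 2, 3], [4, 5, 6]))   # prints: 32
```

Keeping the lookup in the agent and the execution in the server is what lets the simplified client interface hide resource selection from the end user.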
NetSolve agents maintain static and dynamic information about NetSolve servers, and match server resources against incoming client requests. The agent optimises and balances the load amongst its servers and keeps track of error conditions. The NetSolve server daemons can serve batch queues of single workstations, clusters of workstations, symmetric multi-processors or machines with massively parallel processors. A key component of the system is a problem description file which is used to create new modules and incorporate new functionality. NetSolve uses both RPC (Remote Procedure Call) and DCOM (Distributed Component Object Model) technologies, designed to allow clients to communicate transparently with components regardless of their location, providing a single programming model as well as built-in support for security.
5.8. NEOS http://www-neos.mcs.anl.gov/
The ANL NEOS servers provide a tool for solving numerical optimization problems automatically with minimal input from the user. Users need only a definition of the optimization problem; all additional information required by the optimization solver is determined automatically. Users submit
jobs by email, via the web, or using an interface tool, and the NEOS system finds the best server to run the job, using Globus and Condor tools.
5.9. COLLABORATIVE AUTOMATIC VIRTUAL ENVIRONMENTS http://www.evl.uic.edu/pape/CAVE/
The CAVE (CAVE Automatic Virtual Environment) is a projection-based virtual reality system developed at the Electronic Visualization Laboratory (EVL) of the University of Illinois at Chicago, created by Carolina Cruz-Neira, Dan Sandin, and Tom DeFanti, along with other students and staff of EVL. More recent VR systems based on the CAVE are the ImmersaDesk and the IWall. The ImmersaDesk is a one-screen, drafting-table-style device. The IWall is a large, single-screen display using four tiled graphics pipes for increased resolution. The CAVE was first demonstrated at the SIGGRAPH '92 conference. Since then, more than two dozen CAVEs and ImmersaDesks have been installed at various sites. The CAVE and ImmersaDesk are now commercial products sold in the US by Fakespace Systems (formerly Pyramid Systems Inc); the CAVE Library software is sold and distributed by VRCO. The CAVE is a projection-based VR system that surrounds the viewer with four screens, arranged in a cube with three rear projectors for the walls and an overhead (down-pointing) projector for the floor. The virtual environment is viewed using stereo-viewing headgear fitted with a head-movement tracking device. As the viewer moves inside the CAVE, the correct stereoscopic perspective projections are calculated for each wall. A second sensor and buttons in a wand held by the viewer provide interaction with the virtual environment. The projected images are controlled by an SGI Onyx with two Infinite Reality graphics pipelines, each split into two channels. An alternative to CAVE-type 3D projections is the more traditional approach of HMDs or BOOMs, which use small video displays coupled to the user's head.
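The per-wall recomputation mentioned above is a standard off-axis (asymmetric) projection: as the tracked eye moves off the wall's centre, the view frustum for that wall becomes asymmetric. The sketch below is an illustration of the geometry only, not CAVE Library code; the function name, coordinate convention (wall in the plane z = 0, eye at distance `dist` in front of it) and parameters are our own assumptions for the example.

```python
# Minimal sketch of the off-axis projection step: given a tracked eye
# position, compute the asymmetric frustum extents for one wall so the
# image stays perspectively correct as the viewer moves. Illustrative only.

def wall_frustum(eye_x, eye_y, dist, wall, near):
    """Return (left, right, bottom, top) frustum extents at the near plane.

    wall -- (left, right, bottom, top) edges of the wall in tracker
            coordinates, with the wall in the plane z = 0 and the eye
            at perpendicular distance `dist` from it.
    """
    wl, wr, wb, wt = wall
    scale = near / dist            # similar triangles: wall edge -> near plane
    return ((wl - eye_x) * scale, (wr - eye_x) * scale,
            (wb - eye_y) * scale, (wt - eye_y) * scale)

# Eye centred on a 3 m x 3 m wall: the frustum is symmetric.
print(wall_frustum(0.0, 0.0, 1.5, (-1.5, 1.5, -1.5, 1.5), 0.1))
# Eye shifted 0.5 m to the right: the frustum becomes asymmetric,
# which is exactly the per-wall correction the CAVE performs.
print(wall_frustum(0.5, 0.0, 1.5, (-1.5, 1.5, -1.5, 1.5), 0.1))
```

For stereo, the same computation is simply done twice, once per eye position, which is how the head-tracked stereoscopic pair for each wall is obtained.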
Other similar or related collaborative environment systems include:
• Worldesign's Virtual Environment Theater
• GMD VMSD's Responsive Workbench (German National Research Center for Information Technology) - also Stanford and NRL (Naval Research Laboratory)
• Fakespace's Immersive Workbench
• GMD VMSD's CyberStage and TelePort
• Fraunhofer VISLab's CAVEEE
• Sun's Virtual Portal & Holosketch
• VREX's VR-COVE
• SGI's Transportable Reality Center (Silicon Graphics)
• University of Minnesota LCSE's PowerWall (IWall predecessor)
• CieMed's Virtual Workbench
• Alternate Reality Corp.'s Vision Dome
• SICS's Cave (Swedish Institute of Computer Science)
• University of Wisconsin I-CARVE Lab's Virtual Reality Studio for Design
• IST's Mirage (Institute for Simulation & Training at the University of Florida)
See also CAVERNSOFT (Section 4.2.3).
5.10. VISUALIZATION TOOLKIT http://public.kitware.com/VTK/
The Visualization ToolKit (VTK) is a freely available open-source code for 3D computer graphics, image processing, and visualization, used by a large number of researchers and developers the world over in applications such as advanced electromechanical automotive design, acoustic field visualization, exploration of astronomical data in virtual reality, the Visible Human project, and Virtual Creatures for education. VTK consists of a C++ class library and several interpreted interface layers including Tcl/Tk, Java, and Python. Professional support and products for VTK are provided by Kitware, Inc. VTK supports a wide variety of visualization algorithms including scalar, vector, tensor, texture, and volumetric methods, and advanced modeling techniques such as implicit modelling, polygon reduction, mesh smoothing, cutting, contouring, and Delaunay triangulation. In addition, dozens of imaging algorithms have been directly integrated to allow the user to mix 2D imaging and 3D graphics algorithms and data. The design and implementation of the library have been strongly influenced by object-oriented principles. VTK has been installed and tested on nearly every Unix-based platform and on PCs (Windows 98/ME/NT/2000).
5.11. NLANR GRID PORTAL DEVELOPMENT KIT http://dast.nlanr.net/Features/GridPortal/
This project develops common components that can be used by portal developers to build a website that can securely authenticate users to remote resources and help them make better job-scheduling decisions by allowing them to view pertinent resource information obtained and stored in a remote database. In addition, profiles are created and stored for portal users, allowing them to monitor submitted jobs and view results.
5.12.
OTHER GRID MIDDLEWARE TOOLS
Besides the major tools already considered, an increasing number of tools already developed or being developed for managing and operating grid-like aggregate computing resources are being evaluated for their suitability as candidates for solving the myriad required functionalities of a working grid environment. A hand-picked selection of some of the available tools is listed below, mainly to give a flavor of what is available. The EU DataGrid Project has also produced a number of reports on the currently available tools [R15, R16, R17, R18].
• LCFG, a tool developed at Edinburgh University for grid software installation and configuration management
• GDMP (Grid Data Management Pilot), a tool for performing data replication
• NETSAINT, for network monitoring
• NWS (Network Weather Service), for forecasting network performance
• MRTG (Multi Router Traffic Grapher), for network monitoring
6. SURVEY OF CURRENT GRID TESTBEDS
6.1. USA NATIONAL PROGRAMMES
6.1.1. Terascale Computing System
The Pittsburgh Supercomputing Center (PSC) was granted an award by the NSF in early 2000 to construct a terascale computing capability (TCS) for USA researchers in all science and engineering disciplines. A 750-node, 3000-processor, 6 Tflop system should be ready for 2002. An initial 64-node, 256-processor offering came online in April 2001. The major criteria to be considered for such a system are stated as follows in the programme announcement:
• The ability to provide the most effective computational capability to the broadest range of research communities requiring leading-edge, terascale computational capabilities.
• The availability of system software and tools to effectively use the computational capabilities of the hardware.
• The ability to call on a wide range of expertise essential for addressing the challenging system integration problems expected to arise.
• Links to a variety of applications.
• The basis for the amount of on-line and archival storage being proposed.
The initial TCS configuration has 64 interconnected Compaq ES40 Alphaservers, each of which features four EV67 microprocessors. Later these will be replaced by more than 682 faster Alphaservers, each with four of Compaq's new EV68 chips, resulting in a major advance compared to the largest currently installed Alphaserver systems, which have a maximum of 512 processors. Reports indicated the testbed to be well ahead of schedule and exceeding performance expectations; the initial 256-processor configuration was on-line and running research software weeks ahead of the expected target. During a two-day trial period the system consistently surpassed speed expectations and operated virtually without interruption.
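The headline figures above are consistent with simple arithmetic: 750 nodes of four processors each gives 3000 processors. The per-processor peak rate used below (~2 Gflops, i.e. a roughly 1 GHz Alpha retiring two floating-point results per cycle) is an illustrative assumption, not a quoted specification.

```python
# Back-of-the-envelope check of the TCS figures quoted above.
# gflops_per_proc is an assumed round number for illustration only.

nodes = 750
procs_per_node = 4
gflops_per_proc = 2.0              # assumed peak per EV68 processor

processors = nodes * procs_per_node
peak_tflops = processors * gflops_per_proc / 1000.0
print(processors, peak_tflops)     # 3000 processors, ~6 Tflops
```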
The high-performance computing system will eventually exceed six trillion operations per second (6 teraflops), apparently making it the world's fastest for civilian research. PSC is preparing the system to allow researchers nationwide to use TCS for projects with potential benefits such as more accurate storm, climate and earthquake predictions; more efficient combustion engines; better understanding of chemical and molecular factors in biology; and research into physical, chemical and electrical properties of materials.
6.1.2. Distributed Terascale Facility http://www.interact.nsf.gov/cise/descriptions.nsf/pd/DTF/
The DTF claims to be the largest grid testbed deployed for scientific research, with more than 13.6 teraflops of computing power and more than 450 TB of data storage. The project is driven by four main research institutions: the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the San Diego Supercomputer Center at the University of California at San Diego, Argonne National Laboratory, and the California Institute of Technology. The DTF will consist of both 32- and 64-bit machines and will be linked via a dedicated optical network that will initially operate at 40 Gbps and later be upgraded to 50-80 Gbps.
Building and deploying the DTF will take place over three years.
6.1.3. Access Grid http://www.accessgrid.org
The Access Grid (AG) aims to support human interaction in the grid environment. It connects multimedia display, presentation and human interaction environments, and provides interfaces to grid middleware and visualization environments. The project aims to support large-scale distributed meetings, collaborative work sessions, seminars, lectures, tutorials, etc. The Access Grid aims to complement the computational grid and to support group interactions. The main focus is on group-to-group rather than individual communication, enabling both formal and informal interactions. Grid access may be for remote visualization or interactive applications, or for utilizing the high-bandwidth environment for virtual meetings and events. Grid nodes will provide meeting rooms which are "designed spaces" containing the high-end audio and visual technology needed to provide an interactive, "high-quality user experience". The activity is aimed at prototyping a number of nodes to conduct remote meetings, site visits, training sessions and educational events. The project also aims to carry out research into collaborative work in distributed environments and to develop distributed data visualization corridors. In an effort to expand the number of Access Grid nodes, the Alliance has also introduced an "Access Grid-in-a-Box" software package that facilitates new sites participating in the programme. An interesting Access Grid paper investigates the current and future direction of immersive visualization environment development [R22].
6.1.4.
SC Grid http://www-fp.mcs.anl.gov/scglobal/ http://www.gridcomputing.org/grid2001/
At the yearly US supercomputer conference in Denver in November 2001, at least 35 computing centres were expected to participate in the SC Global conference on Grid technologies, linking groups of people worldwide. The participating sites spanned six continents, from the Arctic Region Supercomputing Center in Fairbanks, Alaska, to the NSF Polar Research Center in Antarctica, and included Australia, Brazil, Canada, China, Germany, Italy, Japan, The Netherlands, South Korea and the United Kingdom. There were to be 21 sites in the US; sites in Europe included the Universities of Stuttgart, Heidelberg and Manchester, and the research centre in Juelich. SC Global used the ANL Access Grid to link participants over high-speed networks for virtual meetings and other collaborative sessions. In all, over 40 sites in ten countries around the globe were expected to participate.
6.1.5. DOE Science Grid http://www-itg.lbl.gov/Grid/
Part of the program for "Scientific Discovery Through Advanced Computing". The grid system, based on Globus tools, is being developed at LBNL using a variety of diverse computing resources. The following software is being used on various machines:
• Globus 1.1.3
• GSI WuFTPD 0.4b4
• GSI NcFTP v0.3
• GSI SSH v1
• Portable Batch System (PBSPro) v5.0.2
• MOM, scheduler and server
• Maui Scheduler, working with PBS on both machines
• Netscape Certificate Management System 4.2
• Interactive Data Language (IDL).
The first to test the grid is the Supernova Cosmology Project at LBNL, which generates a huge amount of data every day that needs to be processed using IDL image-processing algorithms. Tests are ongoing to submit ~700 jobs to the system through Globus.
6.1.6. DOE / ASCI DisCom2 http://www.cs.sandia.gov/discom/
Within the Accelerated Strategic Computing Initiative (ASCI) program (Section 4.2.2), the Distance and Distributed Computing program, or DisCom2, is charged with ensuring that Laboratory scientists have the best possible access to computing resources no matter where those resources may be located. Network file transfer has been identified as an important part of this effort. The three weapons labs, Lawrence Livermore, Sandia, and Los Alamos, have been working for some time on plans for a secure, high-speed, low-latency Wide Area Network (WAN) spanning the sites in Livermore, Albuquerque, and Los Alamos. The proposed file transfer tool for this new network is the parallel file transfer protocol (FTP) client distributed with the High Performance Storage System (HPSS). However, HPSS has been shown to adversely affect transfer performance; a detailed report on the test results is available [R21].
6.1.7. Information Power Grid http://www.ipg.nasa.gov
The IPG is a collaborative effort between NASA Ames, NASA Glenn, and NASA Langley Research Centers, and the NSF PACI programs at SDSC and NCSA, and is funded by the Computing, Information and Communications Technology (CICT) program at NASA Ames Research Center.
The current IPG testbed has the following characteristics:
• Computing resources: six 100-node SGI Origin 2000s and several workstation clusters at Ames, Glenn, and Langley, with plans for incorporating Goddard and JPL, and approx. 270 workstations in a Condor pool
• Communications: a high-speed, wide-area network testbed among the participating Centers
• Storage resources: 30-100 Terabytes of archival information/data storage uniformly and securely accessible from all IPG systems
• Software: the Globus system providing Grid common services, and Grid programming and program execution support through Grid MPI (via the Globus communications library), CORBA integrated with Globus, a high-throughput job manager, and the Condor job management system for workstation cycle scavenging
• Human resources: stable and supported operational and user support environments
• Applications: several "benchmark" applications operating across IPG (parameter studies, multi-component simulations, a multi-grid CFD code)
• Multi-Grid operation (e.g. example applications operating across IPG and the NCSA Grid)
• Active participation in the Global Grid Forum for standardization and technology transfer
However, the following still have to be provided:
• Integration of CORBA in the grid environment
• Mechanisms for CPU resource reservation, and a global queuing and user-level queue management capability on top of Globus
• Network bandwidth reservation
• General grid data handling
6.1.8. NEESGrid http://www.neesgrid.org/
NEESgrid is an integrated NEES (Network for Earthquake Engineering Simulation) network, funded by the National Science Foundation, that will link earthquake engineering research sites across the country, provide data storage facilities and repositories, and offer remote access to the latest research tools.
6.1.9. IGRID http://www.startap.net/igrid
The Electronic Visualization Laboratory (EVL) at the University of Illinois at Chicago and Indiana University (Alliance partners) created the International Technology Grid (iGrid) testbed at the IEEE/ACM Supercomputing '98 (SC'98) conference in Orlando, Florida, 7-13 November 1998, to demonstrate global collaborations. The centerpiece of iGrid is the NSF-sponsored initiative called STAR TAP (Science, Technology And Research Transit Access Point), a persistent infrastructure to facilitate the long-term interconnection and interoperability of advanced networking in support of applications, performance measuring, and technology evaluations. It is managed by EVL, Argonne National Laboratory, and Chicago's Ameritech Advanced Data Services (AADS).
STAR TAP provides the international component of the NSF very high speed Backbone Network Service (vBNS), installed at the following sites:
 USA Department of Energy's ESnet
 USA Department of Defense's DREN
 NASA's NISN and NREN
 Internet2/Abilene network
 Canada's CA*net II
 Singapore's SingAREN
 Taiwan's TANet2
 Russia's MirNET
 The Asian Pacific Advanced Network
(APAN) consortium, which includes Korea, Japan, Australia, and Singapore
 Nordic countries' NORDUnet
 France's Renater2
 The Netherlands' SURFnet
 Israel's QMED networking initiative
 CERN [http://www.cern.ch] (connection imminent)

StarLight is an advanced optical network infrastructure and proving ground for network services optimized for high-performance applications. StarLight is being developed by the Electronic Visualization Laboratory (EVL) at the University of Illinois at Chicago (UIC), the International Center for Advanced Internet Research (iCAIR) at Northwestern University, and the Mathematics and Computer Science Division of Argonne National Laboratory, in partnership with CANARIE and SURFnet.

6.1.10. Particle Physics Data Grid
http://www.ppdg.net/
http://www.cacr.caltech.edu/ppdg/
The Particle Physics Data Grid is a collaboration between ANL, BNL, Caltech, FNAL, JLAB, LBNL, SDSC, SLAC, Wisconsin and UCSD that aims to provide an infrastructure for very widely distributed analysis of multi-petabytes of particle physics data, accessible to hundreds or thousands of physicists. The focus is on the development of advanced network and middleware infrastructure for data-intensive collaborative science. A wider collaboration pilot project is multinational, with more than 500 collaborators from 65 institutions worldwide. The PPDG grid is being developed for six HEP application experiments: Atlas, BaBar, CMS, D0, STAR and JLAB. The main goals of PPDG are:
 End-to-end integration and deployment in experiment production systems.
 After one experiment uses software, adapt/reuse/extend it for other experiments.
 Extend and interface existing distributed data access code bases with Grid middleware.
 Develop missing components and new technical solutions.
 Milestones for deliverables of the order of 1 year for each Project Activity.
These goals are to be reached pursuant to the major milestones of the project:
 Generic Grid Services
 Application-Grid interfaces
 PPDG End-to-End
 Experiment-Specific Fabric & Hardware
 Fabric-Grid interfaces
Since PPDG members are also collaborating to develop the current leading-edge grid tools such as Condor, Globus and SRB (Storage Resource Broker), as well as GDMP, PPDG will use the existing grid
tools base and develop it further, contributing their findings and developments back to the grid technology pool as part of the core grid components. Work is also carried out in close but loosely coupled collaboration with DataGrid, for instance by mirroring the DataGrid work package structure.

The PPDG D0 experiment will need to process total data sets of 0.3 PB/year in the first phase and approximately 1 PB/year in the second phase. The physics analysis capabilities depend strongly on the computing and data handling tools that can be provided for this large data set. The D0 grid testbed offers a fully distributed data access system, SAM (Sequential Access to data via Metadata), which has been extensively exercised on Monte Carlo and cosmic test data. SAM was originally developed prior to the grid, in 1998, as a distributed system based on CORBA, and has been adapted by D0 to work in a grid infrastructure. The testbed features include:
 data replication
 disk cache management
 resource management
 metadata cataloguing and querying
 dataset definition and processing history
SAM has also been deployed at NIKHEF and IN2P3. Data files stored at these sites are transferred transparently over the network to Fermilab mass storage. PPDG will concentrate on extending the facilities for intelligent global data, job and resource management, and integrating these into the production SAM services for the benefit of the D0 physics community. Deliverables are:
 Formal Job Control and Resource Management Language - Year 1
To develop a unified approach to the allocation and scheduling of all resources, pertaining both to data delivery and to processing, and to tackle complex issues related to data availability and load balancing among the grid resources.
 Global Monitoring of Resource and System Performance and Utilization - Year 1 and 2
To enhance the existing SAM design to be scalable, flexible and reusable.
 Enhanced Integrated Production Experiment-wide Analysis Job and Data Placement - Year 1, 2 and 3
To deploy the global distributed data access and analysis system as a production service, concentrating on job dispatch, data placement, distributed cache and resource management, and the support of fully transparent user control and response.
 Enhanced Data Reclustering and Restreaming Services - Year 3
To enhance existing data delivery and job dispatch optimization techniques to handle the increasing data volumes.

6.1.11. Grid Physics Network (GriPhyN)
http://www.griphyn.org/
A US National Science Foundation funded team of experimental physicists and information technology (IT) researchers plans to implement the first Petabyte-scale computational environments for data-intensive science in the 21st century. The project will focus on several new concepts necessary for the creation of Petascale Virtual Data Grids (PVDGs), including policy-driven request planning and scheduling of networked data and computational resources, and the concept of Virtual Data: to define and deliver to a large community a virtual space of data products, of potentially unlimited size, derived from experimental data. Within this data space, requests can be satisfied via direct access and/or computation, with local and global resource management, policy, and security constraints determining the strategy used. The testbed is needed for four main physics and astronomical experiments about to enter a new era of exploration of the fundamental forces of nature and the structure of the universe. The CMS and ATLAS experiments using the CERN Large Hadron Collider will search for the origins of mass and probe matter at the smallest length scales; LIGO (Laser Interferometer Gravitational-wave Observatory) will detect the gravitational waves of pulsars, supernovae and in-spiraling binary stars; and SDSS (Sloan Digital Sky Survey) will carry out an automated sky survey enabling systematic studies of stars, galaxies, nebulae, and large-scale structure.
The following are key features of the PVDGs being developed:
 Large extent (national or worldwide) and scale, incorporating large numbers of resources on multiple distance scales
 Sophisticated new services layered on top of local policies, mechanisms, and interfaces, to manage geographically remote resources in a coordinated way
 Transparency in how data-handling and processing capabilities are integrated to deliver data products to end-user applications, so that requests for such products are easily mapped into computation and/or data access at multiple locations. (This transparency is needed to enable optimization across diverse, distributed resources, and to keep application development manageable.)
GriPhyN was awarded five years' funding in September 2000.
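The Virtual Data idea described above can be sketched in a few lines (this is our own illustration, not GriPhyN code): a catalog records, for each derived product, the transformation and inputs that produce it, so a request is satisfied either from materialized storage or by (re)computation.

```python
# Illustrative sketch of the Virtual Data concept (not GriPhyN software):
# derived data products are described by the transformation that makes
# them, and a request triggers either a lookup or a recomputation.

materialized = {"raw": [3, 1, 2]}   # products already stored somewhere
recipes = {                         # product -> (transformation, input products)
    "sorted": (sorted, ["raw"]),
    "maximum": (max, ["sorted"]),
}

def request(product):
    """Return a data product, deriving and caching it if necessary."""
    if product in materialized:          # direct access
        return materialized[product]
    func, inputs = recipes[product]      # otherwise derive it
    result = func(*(request(p) for p in inputs))
    materialized[product] = result       # cache for later requests
    return result

print(request("maximum"))   # derives "sorted" first, then prints 3
```

In a real PVDG the "cache" is distributed storage and the "transformations" are large computations, with policy and resource constraints deciding whether to fetch or recompute.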
6.1.12. GRADS
http://www.isi.edu/grads/
The Grid Application Development Software (GrADS) Project [R6] aims to develop tools and procedures that allow scientists and engineers to construct Grid-deployable applications. The aim is to simplify distributed heterogeneous computing in the same way that the World Wide Web simplified information sharing over the Internet. The GrADS project will explore the scientific and technical problems that must be solved to make grid application development and performance tuning for real applications an everyday practice. The following software packages are used by the testbed:
 Globus 1.1.3 - the basic grid software toolkit
 mpich-g 1.1.2 - enables an MPI program to run in a grid environment without change
 Autopilot - an infrastructure for real-time performance measurement and adaptive control, which makes use of the Pablo SDDF and Fuzzy Library components
 BLACS / ATLAS / ScaLAPACK - MPI-based linear-algebra routines, implemented efficiently and uniformly across a large range of distributed-memory platforms
 NWS - for monitoring and dynamically forecasting the performance of network and computational resources
 RIB - support for online software repositories based on metadata
 Cactus - modular framework and services for parallel computation across different architectures and collaborative code development between different groups
Besides constructing integrated testbeds within the GrADS institutions, project investigators are also working with the NSF Alliance and NPACI, NASA, and DOE to build large-scale prototype grids. The GrADSoft architecture (Fig. 1) is being developed to provide an environment for application creation, compilation, execution, and results analysis which is capable of adapting different applications to a dynamically changing Grid infrastructure.
Two key concepts in the architecture are the encapsulation of the application as a configurable object which can be optimised rapidly for execution on a specific collection of Grid resources, and a system of performance contracts that describe the desired performance as a function of available resources [R7][R8].

6.1.13. Bandwidth to the World
http://www-iepm.slac.stanford.edu/monitoring/bulk/sc2001/
The Bandwidth to the World project is designed to demonstrate current data transfer capabilities to several sites with high-performance links, involving around 25 sites in 6 countries. A demonstration testbed is planned for the SC2001 conference, in which the site will simulate a HENP tier 0 or tier 1 site (an accelerator or major computation site) distributing copies of the raw data to multiple replica sites. The demonstration will be over real, live production networks with no efforts to manually limit other traffic. The objective will be to saturate the local link to SciNet and to control the local router at one end of the congested link, to evaluate the QBone Scavenger Service (QBSS) as a tool for managing competing traffic flows, and to observe the effect on the response time of lower-volume interactive traffic on high-performance links. The demonstration will use the Internet2, ESnet, JAnet, GARR and Renater WANs and the CERN-STARTAP link. ESnet will have an OC48 from Denver to Sunnyvale. SLAC will have an OC12 to Sunnyvale. The SC2001 SLAC/FNAL booth will have 2*Gbps connections to SciNet. Details of the tests and the results are published on the project web site.

6.1.14. CAVE / IMMERSADESK
CAVEs (see Section 5.9) have been installed at:
 EVL (Electronic Visualization Laboratory)
 NCSA Virtual Reality Lab (National Center for Supercomputing Applications)
 Argonne National Laboratory Mathematics and Computer Science Division
 [D]ARPA (Defense Advanced Research Projects Agency)
 AEC (Ars Electronica Center, Linz, Austria)
 Iowa State University VRAC
 NASA JSC / University of Houston Virtual Environment Technology Lab
 CTC Cornell Theory Center
 General Motors Research
 EDS Detroit Virtual Reality Center
 Concurrent Technologies Corp.
 FMC Corporation, Chicago
 Wright State University / Wright-Patterson Air Force Base
 NTT Corporation InterCommunication Center
 Virginia Tech Laboratory for Scientific Visual Analysis
 Virginia Polytechnic Institute and State University
 SARA (Academic Computing Services Amsterdam)
 Indiana University, Bloomington
ImmersaDesks have been installed at:
 EVL (Electronic Visualization Laboratory)
 NCSA Virtual Reality Lab
 Argonne National Laboratory Mathematics and Computer Science Division
 UIC (University of Illinois at Chicago) Mechanical Engineering Department Industrial Virtual Reality Institute
 UIC Biomedical Visualization Lab
 UIUC (University of Illinois at Urbana-Champaign) Robotics and Computer Vision Laboratory
 NIST (National Institute of Standards and Technology)
 Silicon Graphics
 [D]ARPA (Defense Advanced Research Projects Agency)
 Old Dominion University
 Army Corps of Engineers Waterways Experiment Station
 Swedish Royal Institute of Technology's Center for Parallel Computers
 Nichols Research Corporation
 Virginia Tech Laboratory for Scientific Visual Analysis
 Pyramid Systems
 Nissho Electronics
 Boston University Scientific Computing and Visualization Group / Center for Computational Science
 University of Chicago
 ISI (Information Sciences Institute, University of Southern California)
 Indiana University - Purdue University Indianapolis
 Ars Electronica Center, Linz, Austria
6.2. EUROPEAN PROGRAMMES

6.2.1. EU DataGrid
http://www.eu-datagrid.org/
There will be three main phases of the testbed, to be delivered and tested at the end of each year of the project lifetime. An initial startup phase (testbed0) was also introduced at the beginning of the project. An architecture task force was set up to design the basic DataGrid middleware services; there have been two iterations of the architecture design, resulting in [R21]. The organisation of the project is described in Section 4.3.1.

6.2.1.1. DataGrid Testbed 0
The DataGrid Testbed0 was an initiative to get an early start on evaluating the Globus software as a basis for the development of the first official testbed version. Initially it was thought the DataGrid middleware would be based on the Globus software, which represented the most advanced set of grid tools available at the start of the project, and this choice has been confirmed by the results of Testbed0. A major reason for this choice is that the software has the advantage of being largely open-source based, even though there still remained a few licensing issues to be solved due to the inclusion of software from several different organizations. The Testbed0 efforts have evaluated the Globus software tools with particular regard to:
 deployment and installation
 security
 information services
 resource management
 data access and migration
 other services
The results are available in the report produced by INFN, involving over 30 participants of DataGrid WP1 [R1]. A major result of the evaluation has been the identification of the parts of the system which need to be developed in order to enhance the grid functionality to be provided by DataGrid testbed1. A further evaluation of Globus tools was carried out by DataGrid Work Package 9 and reported on in [R2].

6.2.1.2. DataGrid Testbed 1
http://marianne.in2p3.fr/datagrid/
DataGrid Testbed1 is the first official release of the grid middleware. Delivery of middleware components was due for project month 9 (September 2001), to be followed by an integration phase and then validation by the three application groups. The release of testbed1 has coincided with the rollout of Globus 2, and several of the DataGrid middleware tools dependent on Globus have had to be modified to work with the new version. As a result the testbed will not be finally released until mid-November; however, several components have already been delivered to the integration team. Work is currently underway to set up the VO infrastructure to support the different experiments and validation teams, and to install, configure and test the delivered middleware. On the applications side, detailed plans have been drawn up for testing and validating the testbed by High Energy Physics, Earth Observation and Biology applications.
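Jobs enter the testbed as JDL descriptions submitted through the User Interface, with requirements and ranking expressed in Condor ClassAd style. A hypothetical job description of this general shape (file names and attribute names here are illustrative, not drawn from the testbed1 release) might look like:

```
Executable     = "simulate.sh";
Arguments      = "input.dat";
StdOutput      = "sim.out";
StdError       = "sim.err";
InputSandbox   = {"simulate.sh", "input.dat"};
OutputSandbox  = {"sim.out", "sim.err"};
Requirements   = other.OpSys == "Linux";
Rank           = other.FreeCPUs;
```

The Resource Broker evaluates such Requirements and Rank expressions against the resource attributes published in the Information Index to choose a Computing Element for the job.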
Table 1 summarises the main tools, components and services of DataGrid testbed1 and the work packages responsible for providing them.

Table 1. DataGrid Testbed1: Middleware Tools, Components and Services

Service | Acronym | Description | Provider
Testbed Administration | n/a | DataGrid certificate authorities, user registration, Acceptable Use Policy, middleware integration | WP6
VO Server | n/a | LDAP directory for mapping grid users (along with their certificates) to one of six DataGrid Virtual Organisations | NIKHEF
User Interface | UI | Used for submitting JDL files (job-specification scripts) for execution on the grid, monitoring job status, retrieving results | WP1
Job Submission Service | JSS | Used by the UI for managing the submission of user jobs to the resource broker | WP1
Information Index | II | Provides information about all available grid resources, using the GIIS/GRIS information hierarchy | WP1
Resource Broker | RB | Uses the Information Index to discover and select available resources based on the JDL requirements | WP1
Logging & Bookkeeping | L&B | Used for the collection of resource usage, job submission and lifecycle status information | WP1
Computing Element | CE | Gatekeeper to a grid computing resource, i.e. a computing cluster | WP1
Storage Element | SE | Grid-aware, high-capacity data storage area, situated close to one or more CEs | WP5
Replica Manager | RM | For data replication to one or more grid SEs, based on an existing tool, GDMP | WP2
Replica Catalog | RC | For keeping track of the location of multiple data files "replicated" in one or more SEs | WP2, WP5
Information and Monitoring | n/a | For providing information about Grid resource utilisation and performance | WP3, WP7
Grid Fabric Management | n/a | Configuration and automated installation and maintenance of grid software packages and environments | WP4
Network Performance, Security & Monitoring | n/a | To provide efficient network transport services, network security and bandwidth performance monitoring | WP7
MetaData Management | n/a | To provide access to grid-aware relational databases and metadata storage, known as "Spitfire" | WP2

In the following sub-sections we briefly describe the current state of the delivered testbed in terms of the major services present in a grid system:
 Resource management
 Data management
 Job management
 Security
 Accounting and billing
 Network performance
 Information system
 Monitoring

6.2.1.2.1. Resource management
One aspect is interfacing grid batch queues to the Local Resource Management System (LRMS) and network file system, and providing information about the resources to the Grid Information Index hierarchy. This service is largely provided by the Globus tools. In addition, local resource management tools are needed as a standard way to install operating system, middleware and application software simultaneously across a large number of processing nodes. Particularly important is the capability to disseminate and install new software and updates quickly and easily on all grid resource nodes, using a standard procedure easily accessible to application developers, system administrators and grid system engineers in different organisational and application environments. This need is addressed by WP4 middleware tools:
 Interim installation system based on LCFG (Large Scale Linux Configuration) [R11] for software updates and maintenance
 Configuration Management Prototype for ubiquitous software installation from a mirror or clone of some predefined reference system

6.2.1.2.2. Data management
The principal need for data management involves the replication and dissemination of multiple copies of single master data files within the grid infrastructure, specifically to areas where computing power and high-capacity, high-performance storage are concentrated. For this, WP2 initially proposes the GDMP (Grid Data Management Pilot) [R12], being developed in collaboration with PPDG, as a basis for initial evaluation.

6.2.1.2.3. Job management
Several tools and services are provided by WP1, described in the architecture document [R14]:
 JDL - Job Description Language, based on Condor ClassAds
 UI - User Interface for job submission
 JSS - Job Submission Service
 RB - Resource Broker
 LB - Logging and Bookkeeping
 II - Information Index

6.2.1.2.4. Security
In the current testbed, security services are limited to those provided by the LDAP / Globus based MDS (Metadata Directory Services). A security working group has had a late start and is currently in
the process of defining and developing the security infrastructure. Interestingly, a major finding of the group is summed up in the following verbatim quotation: "once and for all, VPNs are *not* the solution! We do *not* need VPNs. Mostly, when people talk about VPNs, what they actually want is "managed bandwidth" or "encryption", or everything *but* a VPN." [source: DataGrid security working group coordination meeting, CERN, 2001.06.06].

6.2.1.2.5. Accounting and billing
Work is at the proposal and design stage, and currently there are no services in the PM9 rollout of the testbed. An INFN-based working group is investigating and has made initial proposals centered around a 'Computational Economy' approach [R13], where a price-setting policy is needed in order to achieve a self-regulating equilibrium between resource availability and demand in the grid workload distribution. One interesting aspect of the accounting system is its potential role, in tandem with the workload management system, as regulator of grid resource usage. The accounting model proposed also aims to develop the recommendations of the IETF working group on Authentication, Authorization and Accounting for network access.

6.2.1.2.6. Network performance
An interim network performance monitoring service has been provided by WP7. A graphic map in a web page has been developed which users can click on to discover the grid testbed resources. There is considerable interest in Netsaint, as it may offer the best match of features suitable for grid use.

6.2.1.2.7. Information system
An information system is a rendez-vous point for both producers and consumers of all kinds of grid information, whether resources, services, users or virtual organisations.
Therefore there are two aspects to be considered:
 Information System - a central registry or rendez-vous where producers can publish their facts and where consumers can find them. Preferably it should use a standard protocol so that its services can be accessed by many different software tools and services, and it should also be distributed to allow for the wide geographical distribution of grid resources and services. It should be optimised for searching and, due to the dynamic nature of grid information, also for frequent updates.
 Information Providers - every producer of grid information has to supply information to be published in the information system.
The LDAP protocol is well suited to support this model, with an underlying relational database capable of handling frequent updates in a dynamic environment. Two solutions are being evaluated: Globus MDS 2.0 and OpenLDAP FTREE, the latter as an alternative due to concerns over the performance and scalability of the former, which has never been tested in such a large-scale, i.e. trans-European, deployment.

6.2.1.2.8. Monitoring
WP3 is in the process of developing the Grid Monitoring Architecture software [R19], a monitoring and information management service for distributed resources, based on the Grid Monitoring Architecture (GMA) of the Global Grid Forum. It is being developed using a collection of freely available, "state-of-the-art" software tools, services and protocols:
 tomcat - Java servlet engine
 ant - software development environment
 MySQL JDBC Driver - Java database interface
 JavaCC - compiler generator
 log4j - flexible logging system
 junit - unit testing framework
 Xerces - XML parser
 TexDoclet - plugin for javadoc to generate LaTeX
 w3c-libwww-devel - HTTP library in C, standard RPM on the RH6.2 distribution
 Doxygen - to generate documentation from C++
 Graphviz - needed by Doxygen to draw graphs; it can be built without the latest Tk/Tcl to get the components needed by Doxygen
R-GMA demonstrates the use of an information service which might be used for monitoring, or for any other grid information purposes. In this way, local, dynamic information about the state of grid resources, such as CPU nodes available, available service types and batch queue status, can be published to the information service and made available to the specific information consumers who need it. Another set of tools provided by WP3 is GRM/PROVE, which combined provide support for tracing events while running a job. Also being evaluated is NetLogger, a tool for monitoring and analysis of distributed systems, which will be used for performance monitoring.

6.2.2. EuroGrid
http://www.eurogrid.org
Three releases of EUROGRID will be performed that incorporate the horizontal and domain-specific extensions and adaptations developed in the project. Two public workshops will be organised. The EUROGRID project relies on a proven GRID infrastructure originally developed for the leading German supercomputer centers in the UNICORE project.
During the 1st Egrid Workshop in Poznan (12-13 April 2000), several application testbeds were proposed:

Testbed | Contact | Middleware | System
Cactus | eseidel@aei-potsdam.mpg.de | globus, MPI, HDF5 | Linux, T3E, SGI, NT, SP2
DLR - Combustion Engine | michael.faden@dlr.de | globus, condor, MPI | SGI, SUN
EOS (Earth Observation Systems) | giovanni.aloisio@unile.it | globus, MPI | COW, SGI, Compaq SC, HP Exemplar
Conformational Analysis | ludek@ics.muni.cz | PVM, MPI | SGI, Dec, COW, Unix
6.2.3. CROSSGRID
http://www.crossgrid.org/
The CrossGrid testbed (Section 4.3.11) will collaborate closely with the Global Grid Forum and the DataGrid project in order to build on the results and experience of current grid projects and achieve full interoperability with the emergent grid systems and protocols. A CrossGrid testbed will be constructed and used to validate the methodology, generic application architecture, programming environment and new Grid services being developed. The emphasis will be on providing a user-friendly grid environment. The work aims to extend the Grid across eleven European countries. The kick-off meeting is planned to take place on 11-13 March 2002 in Cracow.

6.2.4. INFN Grid
http://server11.infn.it/grid/
The INFN GRID project aims to develop and deploy a prototype grid for clusters and supercomputers distributed in the INFN nodes of the Italian research network Garr-b, to provide a coherent high-throughput computing facility transparently accessible to all INFN users. The INFN national grid will also be integrated with similar infrastructures being developed in parallel grid projects ongoing in all major European countries, in the US and in Japan. The project aims to develop new grid technology components, whenever possible in collaboration with international partners via European or international projects. The INFN-GRID will fully integrate services and tools developed by the EU DataGrid project (Section 4.3.1) and has plans to develop specific areas of the technology:
 Validating and adapting basic grid services (e.g.
Globus) and developing application middleware services on top of them
 Addressing specific INFN computing needs not addressed elsewhere, to develop intermediate releases of grid services optimized for the needs of specific applications
 Deployment of an Italian national grid testbed characterized by the INFN collaboration requirements for computing resources, data and network capacity, and computing model, connecting with other international computing power grids
Testbed research includes workload management, data management, grid resource monitoring, and standardization of existing INFN distributed computing resources and mass storage. The network connectivity is to be provided by GARR.

6.2.5. EPSRC E-Science Testbed
http://www.soton.ac.uk/~escience
The testbed aims to develop and demonstrate Grid Enabled Optimisation and Design Search for Engineering (GEODISE), which will initially focus on the use of Computational Fluid Dynamics (CFD) techniques. The project aims to provide grid-based seamless access to an intelligent knowledge repository, a state-of-the-art collection of optimisation and search tools, industrial-strength analysis codes, and distributed computing and data resources.
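The kind of design search GEODISE targets can be caricatured in a few lines (our own sketch, not GEODISE software): candidate designs are evaluated against an expensive analysis code, and the best design is kept. The quadratic objective below merely stands in for a CFD run.

```python
# Toy design-search loop of the general kind a grid-enabled optimisation
# service might run. drag() stands in for an expensive CFD analysis code;
# on a real grid each evaluation would be dispatched to a remote resource.

def drag(design):
    """Stand-in analysis code: a quadratic bowl with its minimum at 2.0."""
    return (design - 2.0) ** 2 + 0.5

def design_search(candidates):
    """Evaluate every candidate design and return the best one found."""
    results = {d: drag(d) for d in candidates}   # embarrassingly parallel step
    best = min(results, key=results.get)
    return best, results[best]

best, value = design_search([0.0, 1.0, 2.0, 3.0, 4.0])
print(best, value)   # -> 2.0 0.5
```

Because each evaluation is independent, the inner loop maps naturally onto distributed grid resources, which is precisely the opportunity GEODISE aims to exploit.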
6.2.6. Grid-Ireland
www.grid-ireland.org/
The Grid-Ireland (see Section 4.3.8) testbed currently provides a grid software kit for installing Globus and the grid-enabled MPICH-G2 library. It is supported on Intel i686 platforms running RedHat Linux 6.x and 7.x, and Sun platforms running Solaris 2.6 and 2.7, with support for Compaq Alpha platforms running OSF 4.1 to follow in the very near future. Support for Condor-G and Cactus is also planned, and a GUI installer is being developed. Grid-Ireland currently has the following sites:
 Department of Computer Science, Trinity College Dublin
 Department of Computer Science, University College Cork
 IT Centre, National University of Ireland, Galway
 Department of Computer Science, Queens University Belfast
The number of participants is expected to increase substantially by January 2002.

6.2.7. CENTURION
http://legion.virginia.edu/centurion/Centurion.html
The CENTURION testbed uses the Legion grid middleware developed at the University of Virginia (Section 5.5). The main testbed consists of 128 533-MHz DEC Alphas with 32 GB RAM and 768 GB disk, plus 128 dual 400-MHz Pentium II processors with 32 GB RAM, giving a total capacity of 64 GB RAM and more than 1.7 TB disk storage, and providing more than 240 GFlops peak processing power.
Applications using the CENTURION testbed include:
 Hawley MHD astronomical code for simulating gas accretion disks
 MM5, a mesoscale weather modelling code used for research into weather prediction
 CHARMM, a program for modelling macromolecular dynamics and mechanics
 Axially Symmetric Direct Simulation Monte Carlo code for Directed Vapor Deposition research
 Shallow Water Ocean Simulator code provided by Northrop-Grumman
 Complib, comparison of protein and DNA sequences
 Amsterdam Density Functional (ADF) calculations on polyatomic systems for molecular spectroscopy, organic and inorganic chemistry, crystallography and pharmacochemistry
 Assisted Model Building with Energy Refinement (Amber), a suite of Fortran and C programs used for biomolecular simulation
 Gaussian98, a connected system of programs for performing a variety of semi-empirical and ab-initio molecular orbital (MO) calculations
 Neural-network applications
6.2.8. GEMSviz
http://www.pdc.kth.se/projects/GEMSviz
GEMSviz is a collaborative development of two units of the Royal Institute of Technology in Stockholm, Sweden: Parallelldatorcentrum (PDC) and the Parallel Scientific Computing Institute (PSCI). A demonstration at INET 2000 in Yokohama, Japan, used IBM SP equipment at PDC and at the Texas Center for Computation and Information Sciences (TCCIS) at the University of Houston; an ImmersaDesk and CAVE at the INET 2000 site; and the NORDUnet, Abilene, and APAN networks connected through STAR TAP.
The goal of the GEMSviz project is to produce a framework for a distributed computational steering code. GEMSviz builds on the General ElectroMagnetic Solver program package, the Visualization ToolKit (VTK), SGI IRIS Performer and pfCAVE, CAVERNsoft, and the Globus toolkit for communication, data handling, and resource allocation. The project's collaborators are Erik Engquist, Per Oster and Bjorn Engquist at PDC and PSCI, and Lennart Johnsson at TCCIS.

6.2.9. Distributed ASCI Supercomputer (DAS)
http://www.cs.vu.nl/das/
DAS is a wide-area distributed computer of 200 Pentium Pro nodes, spread over four Dutch universities and designed by the Advanced School for Computing and Imaging. Participating universities are:
• Vrije Universiteit, Amsterdam
• University of Amsterdam
• Delft University of Technology
• Leiden University
• University of Utrecht
Four clusters, one of 128 nodes and three of 24 nodes, are located at four of the universities. The individual clusters are connected by the wide-area SurfNet4 ATM network, which allows the entire 200-node wide-area distributed cluster to be viewed as a single machine. The system was built by Parsytec (Aachen, Germany), a company with many years of experience in building large-scale parallel systems.
Each node consists of:
• 200 MHz Pentium Pro (Siemens-Nixdorf D983 motherboard)
• 64 MB EDO-RAM in DIMM modules (128 MB for the clusters in Leiden and at the VU)
• A 2.5 GB local disk
• A Myrinet interface card
• A Fast Ethernet interface card
Local cluster nodes use Myrinet for the high-speed interconnect, while the network file system uses Fast Ethernet. The topology is a 3D mesh, in which dimension-order routing is used. A dedicated gateway machine at each site routes the Fast Ethernet packets over the wide-area ATM network between clusters. The links use Constant Bitrate (CBR) virtual circuits, offering guaranteed, reliable
and predictable performance, currently operating at 6 Mbit/sec with a round-trip latency between 1.5 and 3.7 msec.
Although not a computing grid as such, this system is one example of many "grid-like" installations which have implemented proprietary or non-standard software for wide-area high-performance distributed computing, but which do not scale outside the organisations implementing them. This illustrates a prime goal of metacomputing grids, namely the need to provide interoperability across a wide range of different organisations and underlying HPC architectures, transparent to international and regional boundaries.

6.3. OTHER COUNTRIES

6.3.1. Scandinavia
www.quark.lu.se/grid/
The Nordic Testbed for Wide Area Computing project (NorduGrid) is part of the Nordunet2 programme (Section 4.3.7), which aims to develop networked applications making extensive use of modern utilities and tools in the Nordic countries. Nordunet2 is a research programme financed by the Nordic Council of Ministers and by the Nordic governments. The overriding aim of this programme is to help secure the position of the Nordic countries at the forefront of Internet development. Its focus is on network utilisation and network-based applications in four main areas:
• Distance education and lifelong learning
• Telemedicine
• Digital libraries
• Infraservices (similar to the middleware services concept)
The aim of the Nordic Testbed is to establish an inter-Nordic testbed facility for the implementation of wide-area computing and data handling. The facility will provide the infrastructure for interdisciplinary feasibility studies of grid-like computer structures. The results of the project will be used to guide the future strategy for providing computing infrastructure for scientific research requiring high-volume distributed data storage and high-performance processing.
The testbed will initially focus on the developments and results of the DataGrid project (Section 6.2.1).

6.3.2. Russia
The Russian grid testbed consists of the following sites:
• Skobeltsyn Institute of Nuclear Physics, Moscow State University (SINP MSU)
• Institute of Theoretical and Experimental Physics (ITEP)
• Joint Institute of Nuclear Research, Dubna (JINR)
• Institute of High Energy Physics, Protvino (IHEP)
Two of these sites, SINP MSU and ITEP, will participate in DataGrid Testbed1 (Section 6.2.1).
6.3.3. EcoGrid – EconomyGRID
http://www.csse.monash.edu.au/~rajkumar/ecogrid/
This research, carried out by the School of Computer Science and Software Engineering at Monash University, Melbourne, aims to develop economic or market-based management and scheduling systems for global grid computing. The focus is on developing economic principles to drive the use of grids. Two major investigations are the GRid Architecture for Computational Economy (GRACE), a generic framework/infrastructure for a grid computational economy, and the Grid Resource Broker (GRB, also known as Super Schedulers/MetaSchedulers).
The key components of the GRACE infrastructure include:
• Grid Resource Broker (e.g. Nimrod/G)
• Grid Resource and Market Information Server (provided by Globus MDS and extended services for resource access price)
• Grid Open Trading Protocols and API
• Trade Manager (part of the broker, involved in establishing the service price)
• Grid Trade Server (works for Grid Service Providers)
These components work closely with existing grid middleware and fabrics such as Globus and Legion. The Nimrod-G Resource Broker carries out resource discovery and selection, and schedules computations over the selected resources. The scheduling model can be sequential, embarrassingly ("pleasantly") parallel (task farming), or simply parallel. Ongoing work investigates a Parameter Sweep Specification (PSP) language for SPMD (single program, multiple data) computations, which can be used to parameterise applications (e.g. shop simulation, drug design, network and fuzzy logic simulation, high energy physics event processing). The latest version supports both deadline (soft real-time) and budget (computational economy) constraints in scheduling, optimising for execution time or expense.
The GRACE infrastructure will use Nimrod/G to dynamically trade grid resources on the open market, allowing users to select resources that meet their deadline and cost requirements. The focus is on the design and development of smart constraint-based scheduling algorithms that embody economic principles. These algorithms can handle a dynamically changing set of heterogeneous and unreliable grid resources with varying performance, cost, price and access policies, and match them to user constraints such as a deadline for completion of the assigned work and budget limitations.
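The deadline- and budget-constrained selection described above can be illustrated with a minimal sketch. Everything here (resource names, prices, the greedy cheapest-first strategy and the linear cost model) is invented for illustration; Nimrod/G's actual algorithms are considerably more sophisticated:

```python
# Toy sketch of deadline- and budget-constrained resource selection for a
# task-farming workload, in the spirit of Nimrod/G (illustrative only).

def select_resources(resources, jobs, deadline, budget):
    """Greedily assign jobs to the cheapest resources that can still
    finish within the deadline; fail if deadline or budget cannot be met.

    resources: list of dicts with 'name', 'jobs_per_hour', 'price_per_job'
    jobs:      number of independent jobs to run
    deadline:  hours available
    budget:    currency units available
    Returns a list of (resource name, jobs assigned), or None if infeasible.
    """
    plan, remaining, cost = [], jobs, 0.0
    # Cheapest first: minimise expense subject to the deadline.
    for r in sorted(resources, key=lambda r: r["price_per_job"]):
        if remaining == 0:
            break
        capacity = int(r["jobs_per_hour"] * deadline)  # jobs finishable in time
        assigned = min(capacity, remaining)
        if assigned > 0:
            plan.append((r["name"], assigned))
            remaining -= assigned
            cost += assigned * r["price_per_job"]
    if remaining > 0 or cost > budget:
        return None  # deadline or budget constraint violated
    return plan

resources = [
    {"name": "cluster-a", "jobs_per_hour": 10, "price_per_job": 1.0},
    {"name": "cluster-b", "jobs_per_hour": 40, "price_per_job": 3.0},
]
print(select_resources(resources, jobs=100, deadline=4.0, budget=250.0))
# → [('cluster-a', 40), ('cluster-b', 60)]
```

Tightening either constraint (e.g. budget=100.0 or deadline=1.0 above) makes the request infeasible, and the broker would have to renegotiate with the user or the trade server.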
7. EARTH SCIENCE / SPACE PROGRAMMES FOR GRID
To highlight the research areas in Earth Science, Space and Astronomy where grid technology will make an impact, this section presents a few selected ongoing projects which are set to benefit from grid technology.

7.1. ESA SPACEGRID
http://esagrid.esa.int/
http://esagrid.esa.int/esagrid/activities/spacegridhome.htm
The European Space Agency, in collaboration with its international and national partners and the European Commission, is fully committed to supporting the deployment of Grid technology and the development of Grid applications. ESA believes that these initiatives contribute effectively to the building of the European Research Area, open to the entire world and enabling steady scientific progress.
ESA SpaceGrid is a study which aims to assess how Grid technology can serve requirements across a large variety of space disciplines, sketch the design of an ESA-wide Grid infrastructure, foster collaboration and enable shared efforts across space applications. It will analyse the complex technical aspects of managing, accessing, exploiting and distributing large amounts of data, and set up test projects to see how well the Grid performs at specific tasks in Earth observation, space weather, space science and spacecraft engineering. The SpaceGrid study started in September 2001 and is financed by the Agency's General Studies Programme. It is run by an international consortium of industry and research centres led by Datamat (Italy).

7.2. CDF - CONCURRENT DESIGN FACILITY
Funded by ESA General Studies, this ESTEC facility has shown the practical utilisation of advanced IT for supporting daily work in space activities. It is a good starting point for developing e-collaboration in space technology.

7.3.
ASTROVIRTEL
http://www.stecf.org/astrovirtel/
The ASTROVIRTEL project (Accessing Astronomical Archives as Virtual Telescopes) is supported by the European Commission and managed by the Space Telescope - European Coordinating Facility (ST-ECF) on behalf of ESA and the European Southern Observatory (ESO). It aims to enhance the scientific return of the ST-ECF/ESO Archive, offering European users the possibility to exploit the archive as a virtual telescope, retrieving and analysing large quantities of data with the assistance of the Archive operators and personnel. The first call for proposals was issued in April 2000 and the second in April 2001, with a deadline for submission of June 15, 2001.

7.4. EGSO
http://www.mssl.ucl.ac.uk/grid/egso
The European Grid of Solar Observations (EGSO) will provide the tools and infrastructure needed to create the data grid that will form the fabric of a virtual solar observatory. The EGSO proposal has been positively evaluated and the project has been selected for negotiation. If successful, the project will start in early 2002. The main objectives of the project are:
• Federate solar data archives across Europe (and beyond)
• Create tools to select, process and retrieve distributed and heterogeneous solar data
• Provide mechanisms to produce standardised observing catalogues for space and ground-based observations
• Provide the tools to create a solar feature catalogue

7.5. CEOS
http://www.ceos.org/
CEOS (Committee on Earth Observation Satellites) is the international forum, with membership drawn from the world's civil Earth Observation programmes, for coordinating international EO missions and for developing and maintaining international standards for Earth Observation. ESRIN plans to host the first CEOS workshop on Grid and EO at Frascati in May 2002; topics of interest will include:
• Web Map
• Use of a dedicated high-speed link across the Atlantic
• Participation in the GGF application working group

7.6. ISS VIRTUAL CAMPUS
http://www.esa.it/export/esaCP/GGGO5IG3AEC_index_2.html
In September 2000 ESA inaugurated a Virtual Campus for the International Space Station (ISS). This development will allow present and future users of the ISS in Europe to be kept informed of all new developments taking place, share knowledge and find new research partners. Both the pressurised and non-pressurised laboratories of the station will be open to European scientific researchers, development engineers and commercial service providers. The Virtual Campus will attract new users to the ISS and ensure that they are able to benefit fully from the many unique services that the ISS will provide.

7.7. ASTROPHYSICAL VIRTUAL OBSERVATORY
http://www.eso.org/projects/avo/
The Astrophysical Virtual Observatory (AVO) Project is a three-year study for the design and implementation of a virtual observatory for European astronomy: observing the digital sky.
A virtual observatory (VO) is a collection of interoperating data archives and software tools which use the Internet to form a scientific research environment in which astronomical research programmes can be conducted. In much the same way as a real observatory consists of telescopes, each with a collection of unique astronomical instruments, the VO consists of a collection of data centres, each with unique collections of astronomical data, software systems and processing capabilities.
The need for the development of a VO is driven by two key factors. Firstly, there is an explosion in the size of astronomical data sets delivered by new large facilities such as the ESO VLT, the VLT Survey Telescope (VST), and VISTA. The processing and storage capabilities necessary for astronomers to analyse and explore these data sets will greatly exceed those of the desktop systems astronomers currently have available. Secondly, a great scientific gold mine is going unexplored and underexploited because large data sets in astronomy are unconnected. If large surveys and catalogues could be joined into a uniform and interoperating "digital universe", entire new areas of astronomical research would become feasible.
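At its simplest, "joining" such catalogues amounts to cross-matching objects by sky position. A toy sketch follows; the catalogue contents, the matching tolerance and the flat-sky approximation are all invented for illustration, whereas real VO cross-matching must handle full spherical geometry and vastly larger data volumes:

```python
# Toy cross-match of two small source catalogues by sky position
# (flat-sky approximation; illustrative only).
import math

def cross_match(cat_a, cat_b, tol_deg):
    """Pair each object in cat_a with the nearest cat_b object within
    tol_deg. Entries are (name, ra, dec) in degrees; the RA offset is
    scaled by cos(dec), adequate only for small separations."""
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best, best_d = None, tol_deg
        for name_b, ra_b, dec_b in cat_b:
            d = math.hypot((ra_a - ra_b) * math.cos(math.radians(dec_a)),
                           dec_a - dec_b)
            if d <= best_d:
                best, best_d = name_b, d
        if best is not None:
            matches.append((name_a, best))
    return matches

# Invented optical and X-ray source lists around RA 150, Dec 2 degrees.
optical = [("opt-1", 150.000, 2.000), ("opt-2", 150.100, 2.050)]
xray    = [("x-1", 150.0002, 2.0001), ("x-2", 151.000, 3.000)]
print(cross_match(optical, xray, tol_deg=0.001))
# → [('opt-1', 'x-1')]
```

The quadratic pairwise loop is the point at which real systems diverge: joining surveys of hundreds of millions of objects requires spatial indexing and the kind of distributed data-grid infrastructure these VO projects aim to provide.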
7.8. ASTROGRID
http://www.astrogrid.ac.uk/
A project funded at £5M, aimed at building a UK astronomy grid linking the UK with a global Virtual Observatory. The consortium sites are:
• Institute for Astronomy, University of Edinburgh
• Institute of Astronomy, University of Cambridge
• Dept. of Physics & Astronomy, University of Leicester
• Space Data Division, Rutherford Appleton Laboratory
• School of Computer Science, Queens University Belfast
• Mullard Space Science Laboratory, UCL
• Jodrell Bank Observatory, University of Manchester

7.9. NATIONAL VIRTUAL OBSERVATORY
http://us-vo.org/
The American National Virtual Observatory (NVO), operated by the Association of Universities for Research in Astronomy under a cooperative agreement with the National Science Foundation, will link the archival data sets of space- and ground-based observatories, the catalogues of multi-wavelength surveys, and the computational resources necessary to support comparison and cross-correlation among these resources. The NVO will benefit the entire astronomical community. It will democratise astronomical research: the same data and tools will be available to students and researchers, irrespective of geographical location or institutional affiliation.
The NVO will also have far-reaching educational potential. Astronomy occupies a very special place in the public eye: new discoveries fascinate the large number of amateur astronomers and the general public alike. The NVO will be an enormous asset for teaching astronomy, information technology, and the method of scientific discovery. Outreach and education will be key elements: the NVO will deliver rich content via the Internet to a wide range of educational projects, from K-12 through college and to the public.

7.10.
THE VIRTUAL SOLAR OBSERVATORY
http://www.nso.noao.edu/vso/
The Virtual Solar Observatory (VSO) is a new tool for investigating the physics of the Sun and its impact on the Earth environment. The VSO will address one of the central challenges of solar research: the need to locate, correlate, absorb, and analyse data from a wide array of scientific instruments that measure the Sun on spatial and temporal scales ranging over seven orders of magnitude. Currently this process is extraordinarily time-consuming and labour-intensive, yet it is pursued because the coupled Sun-Earth system demands it. The VSO will greatly increase the power and pace of the correlative studies needed to address fundamental challenges such as:
• Predicting geomagnetic storms;
• Understanding solar irradiance variations and their effect on the Earth environment;
• Detecting active regions below the solar surface before they cause solar activity; and
• Understanding the generation of the solar wind.
7.11. GRIDLAB
http://www.gridlab.org
The GridLab project is currently under negotiation with the European Commission for funding under the Fifth Call of the Information Society Technologies (IST) Programme. GridLab will develop an easy-to-use, flexible, generic and modular Grid Application Toolkit (GAT), enabling today's applications to make innovative use of global computing resources. The project is grounded in two principles: (i) the co-development of infrastructure with real applications and user communities, leading to working scenarios; and (ii) dynamic use of grids, with self-aware simulations adapting to their changing environment.

7.12. MM5
http://www.mmm.ucar.edu/mm5/mm5-home.html
MM5 is an integrated forecasting system that uses Globus services to apply networked computing resources to operational weather forecasting problems. The forecast model used is the Penn State/NCAR Mesoscale Model (MM5), adapted to run on a multitude of platforms including scalable distributed-memory computers. This model is integrated into a multistage pre- and post-processing pipeline for generating, analysing, and displaying 24- to 36-hour weather forecasts over desired regions of the globe.

7.13. TARDIS
http://esc.dl.ac.uk/StarterKit/Examples/tardis.html
The Astro3D program is a three-dimensional simulation of astrophysical thermonuclear flashes. A run of Astro3D produces a time series of output data sets, each containing a three-dimensional data set for several model parameters. Series of isosurfaces can be generated on the output data, either across time steps or across values of a parameter within a time step. Such a series of isosurfaces can then be displayed in a CAVE or IDesk using Tardis. This demo uses Globus resource management (GRAM) and communication (Nexus) services to do on-the-fly generation of isosurfaces of Astro3D data for display in Tardis.
Through the Tardis interface, the user can specify a series of isosurfaces to be generated and displayed. Tardis uses GRAM to acquire computational resources to compute the isosurfaces, and Nexus to communicate with those isosurface generators.

7.14. NEPH
http://esc.dl.ac.uk/StarterKit/Examples/neph.html
The focus of Neph is on real-time visualisation of meteorological data. The Neph Cloud Detection application was originally developed as a proof-of-concept parallelisation for the Defense Meteorological Satellite Program. Its algorithms take visible-light and infrared satellite imagery and determine cloudy pixels based on background geography and brightness, time-of-day and time-of-year calibrations, and historical information.
Neph has a natural, distributed "streaming" architecture and was demonstrated on the I-WAY at SC95. That demonstration, however, emphasised the need for dynamic and incremental resource management. Globus is used in the testbed for incremental and dynamic resource management, including:
• resource discovery (determine what hosts and network interfaces are available)
• resource negotiation (interact with local schedulers for access to the resource)
• resource acquisition (start processes on the resource in a secure manner)
All of this is accomplished both at start-up and on the fly, in the event of host or network failures.

7.15. THE EARTH SYSTEM GRID
http://www.earthsystemgrid.org/
The Earth System Grid II (ESG) is a new research project sponsored by the U.S. DOE Office of Science under the auspices of the Scientific Discovery through Advanced Computing (SciDAC) programme. The primary goal of ESG is to address the formidable challenges associated with enabling analysis of, and knowledge development from, global Earth System models. Through a combination of Grid technologies and emerging community technology, distributed federations of supercomputers and large-scale data and analysis servers will provide a seamless and powerful environment that enables the next generation of climate research.

7.16. AVANT GARDE
http://www-fp.mcs.anl.gov/acpi/
This project is one of two experimental projects for the Accelerated Climate Prediction Initiative (ACPI). Its focus is on global climate modelling, while the other project (led by Tim Barnett of SCRIPPS) is focused on regional climate modelling. The project goal is the creation of a performance-portable parallel coupled climate system model.
8. CONCLUSION
It is clear that grid technology is currently going through a start-up and prototyping stage. There is tremendous interest and, as a result, a substantial amount of government funding is being made available for development, and a large number of organisations, projects and testbeds have been formed, often out of pre-existing initiatives. Although a clear, common vision is shared, the problems to be solved are unprecedented, large-scale and highly complex. Given their magnitude and detail, it could take twenty years to implement the technology fully. Such a large-scale problem can only be solved by taking small steps at a time. With the large number of participating countries, organisations and research groups, many different solutions will be proposed, which will need to be designed, developed, tested and evaluated.
Currently the tools, largely based on the existing Globus, Condor, X.509 (PKI), LDAP (X.500), TCP/IP and Linux standards, provide very basic low-level services which require dedicated specialists to install, configure and operate. To be successful, another complete layer will be needed to manage the collections of individual tools and services with their complex interactions. Simpler user interfaces, which hide the underlying complexity, will be needed in order to interface grid services to a large, diverse and dynamically changing user and application base. Grid programming models will be needed to implement abstractions which simplify the thinking and hide the low-level complexities. High-level programming languages will need to provide powerful built-in commands and structures for accessing and controlling interfaces to grid services (e.g. the Java CoG kit). In the future, end-systems will need to be shipped by the manufacturer with built-in support for grid integration.
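The kind of high-level abstraction argued for above might, for example, hide resource discovery, authentication and job submission behind a single session-and-job interface. The following is a purely hypothetical sketch: no real grid middleware API is implied, every class and method name is invented, and the grid services are stubbed out locally.

```python
# Hypothetical sketch of a high-level grid programming abstraction.
# All names are invented; the low-level services are stubbed locally.

class GridJob:
    """The user states only what to run and what it needs; the layers
    below would handle discovery, security, scheduling and staging."""
    def __init__(self, executable, args=(), requirements=None):
        self.executable = executable
        self.args = tuple(args)
        self.requirements = requirements or {}
        self.state = "PENDING"

class GridSession:
    """Facade over low-level grid services (authentication, information
    service, resource broker) - stubbed here for illustration."""
    def __init__(self, credentials):
        self.credentials = credentials  # in reality: an X.509 proxy, etc.
        self._jobs = []

    def submit(self, job):
        # A real implementation would query an information service,
        # negotiate with a broker, stage data, then launch the job.
        job.state = "RUNNING"
        self._jobs.append(job)
        return len(self._jobs) - 1  # opaque job handle

    def wait(self, handle):
        job = self._jobs[handle]
        job.state = "DONE"  # stub: pretend the job ran to completion
        return job.state

session = GridSession(credentials="user-proxy")
handle = session.submit(GridJob("/bin/simulate", args=["--steps", "1000"]))
print(session.wait(handle))
# → DONE
```

The point of the sketch is the shape of the interface, not its (trivial) internals: all of the complexity described in this report would live beneath the submit/wait boundary, invisible to the application programmer.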
In the meantime, intense design, development and testing activity is ongoing on a large scale, yet it will take some time before truly effective global standards begin to emerge. Here it is worth drawing the reader's attention to a particularly succinct description which summarises a broad view of the task in hand, given in the DOE report [R23]; it reads as follows:
The complex and evolutionary nature of the scientific environment requires general services that can be combined in many different ways to support different types of collaboratory applications and support the changes in those applications so that the collaboratory can evolve along with the scientific understanding of the problem. Resource management for such dynamic and widely distributed environments requires global naming and authorization services, scalability and fault-tolerance well beyond the scope of existing systems and standards.
This survey has highlighted several aspects of the current metacomputing grid technology drive. One major aspect is the enormous effort currently underway to advance the underlying national and international networking capability, which will be indispensable to support future grid technology and data throughput requirements. Another is the development of new international standards, and accompanying middleware, for data, resource and job management services, together with a robust and flexible global security infrastructure. A third is the increasing importance of intercommunication and coordination between diverse national and international organisations at all levels, including government, research and industry, which itself is not without problems of language, terms of reference, time-zone differences, etc.
We may conclude that providing solutions for a future collaboratory grid infrastructure, whether for real-time collaborative visualisation or high-performance distributed processing, will first require solving many pressing issues of an increasingly technological age. Besides providing solutions for the more immediate scientific data processing requirements, grid research is helping to foster a productive environment for development towards a new era of global information technology, in line with new, emerging global standards and cooperation.