e-Science Technology/Middleware (Grid, Cyberinfrastructure) Gap Analysis
e-Science Town Meeting, Strand Palace Hotel, May 14 2003
Geoffrey Fox, Indiana University
David Walker, Cardiff University
Note: for this report the terms e-Science Technology/Middleware, Grid, and Cyberinfrastructure are NOT distinguished.
Examined requirements and services already understood/developed for e-Science (reasonably broad coverage) and e-Business, e-Government and e-Services (inevitably rather spotty coverage)
Gaps divided into four broad areas
Education and Support
Research (not well separated from Near-term Technical)
Perception and Organization
Appendix listed over 60 significant UK services (perhaps clustered together) and tools – in the context of a total of some 150 world wide Grid services
[Figure: Categorization of Technical Gaps and Grid Services. Labels in the figure: Application Specific; Resource Specific; Generic Grid Services; Information and Compute Resources. Numbered boxes:]
Architecture and Style 8.1
Basic Technology Runtime and Hosting Environment 8.2
Security 8.3
Workflow 8.4
Notification 8.5
Meta-data 8.6
Information 8.7
Compute/File 8.8
Other 8.9
Portals, PSE’s 8.10
Network 8.11
Taxonomy of Grid Functionalities
Name of Grid Type: Description of Grid Functionality
Compute/File Grid: Run multiple jobs with distributed compute and data resources (Global “UNIX Shell”)
Desktop Grid: “Internet Computing” and “Cycle Scavenging” with secure sandbox on large numbers of untrusted computers
Information Grid: Grid service access to distributed information, data and knowledge repositories
Complexity or Hybrid Grid: Hybrid combination of Information and Compute/File Grid emphasizing integration of experimental data, filters and simulations
Campus Grid: Grid supporting University community computing
Enterprise Grid: Grid supporting a company’s enterprise infrastructure
Note: the term Data Grid is not used consistently in the community and so is avoided.
[Figure: Complexity Grid Computing Model. Distributed data filters massage data for an HPC simulation; other elements shown are OGSA-DAI Grid Services, Other Grid and Web Services, Analysis, Control and Visualize. This type of Grid integrates with parallel computing, e.g. HPC(x).]
Taxonomy of Grid Operational Style
Name of Grid Style: Description of Grid Operational or Architectural Style
Semantic Grid: Integration of Grid and Semantic Web meta-data and ontology technologies
Peer-to-peer Grid: Grid built with peer-to-peer mechanisms
Lightweight Grid: Grid designed for rapid deployment and minimum life-cycle support costs
Collaboration Grid: Grid supporting collaborative tools like the Access Grid, whiteboard and shared applications
R3 or Autonomic Grid: Fault tolerant and self-healing Grid (R3: Robust, Reliable, Resilient)
Substantial comments on “hosting environments”, OGSI and “permeating principles”
Agreement on Web service model
[Figure: layered OGSA architecture, contrasting Central Gaps (“Central Services and Architecture”) with Specific Gaps (“Modular” services natural for distributed teams). Layers:]
1: Hosting Environment (Web services, WS)
2: OGSI Web service Enhancements
3: Permeating Principles and Policies
4: Key OGSA Services
5: OGSA-compliant System Grid Services
6: Domain-Specific (Application) Grid Services
An OGSA Grid Architecture in detail (from GGF GPA)
Meta-data rich Message-linked Web Services as the permeating paradigm
“User” Component Model such as “Enterprise JavaBean (EJB)” or .NET.
Service Management framework including a possible Factory mechanism
High level Invocation Framework describing how you interact with system components.
This could, for example, be used to allow the system to be built from either W3C or GGF style (OGSI) Web Services and to protect the user from changes in their specifications.
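As a minimal sketch of the invocation-framework idea, assuming nothing about any real W3C or OGSI toolkit: the caller names a service and an operation, and a binding object hides which style of service sits behind it. All class and service names below (`ServiceBinding`, `PlainWebServiceBinding`, `OgsiBinding`, `InvocationFramework`, `job-submit`) are hypothetical and only illustrate the indirection.

```python
from abc import ABC, abstractmethod


class ServiceBinding(ABC):
    """Hypothetical binding: hides how one style of service is called."""

    @abstractmethod
    def call(self, operation: str, **params) -> dict:
        ...


class PlainWebServiceBinding(ServiceBinding):
    """Stand-in for a W3C-style Web service (no real SOAP call is made)."""

    def call(self, operation, **params):
        return {"style": "w3c", "operation": operation, "params": params}


class OgsiBinding(ServiceBinding):
    """Stand-in for an OGSI-style service addressed via an instance handle."""

    def __init__(self, handle: str):
        self.handle = handle

    def call(self, operation, **params):
        return {"style": "ogsi", "handle": self.handle,
                "operation": operation, "params": params}


class InvocationFramework:
    """Maps a service name to whichever binding is currently registered."""

    def __init__(self):
        self._bindings = {}

    def register(self, service_name: str, binding: ServiceBinding) -> None:
        self._bindings[service_name] = binding

    def invoke(self, service_name: str, operation: str, **params) -> dict:
        return self._bindings[service_name].call(operation, **params)


framework = InvocationFramework()

# The user code is the same invoke(...) call in both cases.
framework.register("job-submit", PlainWebServiceBinding())
result_w3c = framework.invoke("job-submit", "submit", script="run.sh")

# Swapping the underlying service style does not change the caller.
framework.register("job-submit", OgsiBinding(handle="instance-1"))
result_ogsi = framework.invoke("job-submit", "submit", script="run.sh")
```

The point of the design is that a change in either specification would be absorbed inside one binding class rather than in user code.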
Security is a service but the need for fine grain selective authorization encourages
Policy context that sets the rules for each particular Grid.
Currently OGSA supports policies for routing, security and resource use.
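One way to picture a per-Grid policy context is as a small rule set consulted before a request is accepted. The sketch below is illustrative only: the `PolicyContext` class, its fields and the example Grids are assumptions made for this example, not part of OGSA. It covers the three policy areas named above (routing, security, resource use).

```python
from dataclasses import dataclass, field


@dataclass
class PolicyContext:
    """Hypothetical per-Grid rule set; field names are illustrative only."""
    allowed_routes: set = field(default_factory=set)    # routing policy
    authorized_users: set = field(default_factory=set)  # security policy
    max_cpu_hours: float = 0.0                           # resource-use policy

    def permits(self, user: str, route: str, cpu_hours: float) -> bool:
        """A request is accepted only if all three policies allow it."""
        return (route in self.allowed_routes
                and user in self.authorized_users
                and cpu_hours <= self.max_cpu_hours)


# Two Grids with different rules (made-up values).
campus_grid = PolicyContext(allowed_routes={"wan"},
                            authorized_users={"alice"},
                            max_cpu_hours=10.0)
enterprise_grid = PolicyContext(allowed_routes={"wan"},
                                authorized_users={"alice", "bob"},
                                max_cpu_hours=500.0)

# The same request gets a different answer under each Grid's policy context.
request = {"user": "alice", "route": "wan", "cpu_hours": 50.0}
campus_ok = campus_grid.permits(**request)
enterprise_ok = enterprise_grid.permits(**request)
```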
The Grid Fabric or set of resources needs mechanisms to manage them. This includes automatic recording of meta-data and configuration of software.
Quality of service (QoS) for the network; this implies performance monitoring and bandwidth reservation services.
Challenging as end-to-end and not just backbone QoS is needed.
Messaging systems like MQSeries from IBM provide robustness from asynchronous delivery and can abstract destination and allow customization of content such as converting between different interface specifications.
Messaging is built on transport mechanisms, which can be used to implement QoS and to virtualize ports.
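The three messaging properties mentioned above (asynchronous delivery, abstracted destination, customized content) can be sketched with a toy in-memory broker. This is not MQSeries or any real messaging API; `MessageBroker`, the destination name `results-archive` and the message fields are hypothetical.

```python
from collections import deque


class MessageBroker:
    """Toy broker: senders name a logical destination, not an endpoint."""

    def __init__(self):
        self._routes = {}        # logical destination -> (handler, transform)
        self._pending = deque()  # messages accepted but not yet delivered

    def bind(self, destination, handler, transform=None):
        """Attach a receiver, optionally converting content to its format."""
        self._routes[destination] = (handler, transform or (lambda m: m))

    def send(self, destination, message):
        """Accept a message and return at once (asynchronous delivery)."""
        self._pending.append((destination, message))

    def deliver_pending(self):
        """Deliver queued messages; unroutable ones stay queued for a retry."""
        retry = deque()
        while self._pending:
            destination, message = self._pending.popleft()
            route = self._routes.get(destination)
            if route is None:
                retry.append((destination, message))
                continue
            handler, transform = route
            handler(transform(message))
        self._pending = retry
        return len(retry)


def to_receiver_format(msg):
    """Convert the sender's field name to the one the receiver expects."""
    return {"temperature_c": msg["T"]}


received = []
broker = MessageBroker()

# The sender can post before any receiver exists; the message is held.
broker.send("results-archive", {"T": 21.5})
still_queued = broker.deliver_pending()

# Once a receiver is bound, the held message is converted and delivered.
broker.bind("results-archive", received.append, transform=to_receiver_format)
broker.deliver_pending()
```

The robustness comes from the queue: the sender never needs the receiver to be up at the moment of sending.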
Unicore (GRIP), GridLab, the European Data Grid (EDG) and LCG (LHC Computing Grid)
Many other (20) EU projects exist, but those named here account for most of the technology development
Storage Resource Broker SRB-MCAT from SDSC
The DoE Science Grid and related activities such as the Common Component Architecture (CCA) project
Examination of services from a collection of portal projects in the US from Argonne, Indiana, Michigan, NCSA and Texas.
This includes the best-practice discussion on portals from the Global Grid Forum.
Review of contributions to the recent book Grid Computing: Making the Global Infrastructure a Reality edited by Fran Berman, Geoffrey Fox and Tony Hey, John Wiley & Sons, Chichester, England, ISBN 0-470-85319-0, March 2003
This includes other major projects like Cactus, NetSolve, Ninf
Some 6 Core and other application specific UK e-Science Projects
Noted opportunities for modern middleware ideas to be used – lightweight, message-based
Noted that Enterprise JavaBeans are not optimized for science, which has high-volume dataflow
Federated Grid Architecture natural for integration of heterogeneous functionality, style and security
Bioinformatics and other fields require integration of Information and Compute/File Grids
[Figure: Overlapping Heterogeneous Dynamic Grid Islands. Shows an Information Grid, Enterprise Grid, Compute Grid and Campus Grid, plus a dynamic light-weight peer-to-peer Collaboration Training Grid with Teacher and Students; further labels R1 and R2.]
[Figure: (a) Layered OGSA Grid: Core Services and Application Services connected through an OGSA Interface. (b) Federated OGSA Grid: Grid-1 and Grid-2, each with its own Core Services and Application Services and its own OGSA or non-OGSA interface (Interface-1, Interface-2), linked by OGSA Mediation.]
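The federated case (b) can be illustrated with a small mediation layer that offers one common call and translates it into each Grid's own interface. The sketch below makes strong simplifying assumptions: `Grid1`, `Grid2`, `Mediator` and their methods are invented for this example and do not correspond to any OGSA interface.

```python
class Grid1:
    """Hypothetical Grid whose native interface takes a job dictionary."""

    def submit(self, job: dict) -> str:
        return f"grid1:{job['executable']}"


class Grid2:
    """Hypothetical Grid with a different native interface."""

    def run_task(self, command: str, cpus: int) -> str:
        return f"grid2:{command}x{cpus}"


class Mediator:
    """Presents one common interface and translates to each Grid's own."""

    def __init__(self):
        self._adapters = {}  # Grid name -> function(executable, cpus)

    def federate(self, name, adapter):
        self._adapters[name] = adapter

    def submit(self, grid_name, executable, cpus=1):
        return self._adapters[grid_name](executable, cpus)


grid1, grid2 = Grid1(), Grid2()
mediator = Mediator()

# Each adapter maps the common call onto one Grid's native interface.
mediator.federate(
    "grid-1",
    lambda exe, cpus: grid1.submit({"executable": exe, "cpus": cpus}),
)
mediator.federate("grid-2", lambda exe, cpus: grid2.run_task(exe, cpus))

# The same request is sent to both Grids through the mediator.
receipts = [mediator.submit(name, "blast", cpus=4)
            for name in ("grid-1", "grid-2")]
```

Neither Grid has to change its own interface; only the adapter registered with the mediator knows the difference.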
Portals and User Interfaces – Noted the gap that Grid Computing Environment “best practice” is not being used, with component-based user interfaces matching component-based middleware
Programming Models (using workflow runtime)
Fabric Management (should be integrated with central service management and Information system), Computational Steering, Visualization, Datamining, Accounting, Gridmake, Debugging, Semantic Grid tools (consistent with Information system), Collaboration, provenance
Note new production central Infrastructure can support both research and production services of this type
Could involve asynchronous messaging, federated security (fine-grain authorization), “e-ScienceBean”, notification (as part of service management), invocation frameworks “virtualizing” service component structure
Integrate network monitoring/reservation/management including end-to-end network operations
Support critical policies like security, provenance
Powerful Service management (Research needed here)
Need to federate and/or interoperate a world of “Grid Islands”