4. Pre-GUG history
● ClusterGrid since 2002
● First middleware: Condor Flock
– problems: strong centralization, required central
LDAP authentication, high I/O overhead caused by
the shadow processes, etc.
● Second middleware: Centralized broker based
on Apache/PostgreSQL/PHP
– since then Condor has been used only as an LRMS
– problems: missing storage, missing interoperability
● Final solution :) GUG
5. Grid UnderGround
● new generation ClusterGrid middleware.
● In use in the production system since Feb 2006
● Design goals:
– pure web service based framework (no WSRF)
– uses selected GGF and W3C standards
– simplify service development
– focus on core services (info, storage, job management,
security, monitoring)
– KISS: Keep It Simple, Stupid
– desktop and HPC aware: low memory and CPU usage
– open source development
(http://www.sourceforge.net/projects/gug)
6. GUG Architecture
● Pure python framework:
– framework runs as a single daemon
– manage threads
– handle network communication over HTTP(S)/SOAP
– every service is a dynamically loadable plugin of the
framework; services use backends to separate
interfaces from functions
● Mandatory services:
– Manager service: manages the simple lifecycle of other
services. Remote management is also possible.
– Grid Information System: p2p system for routing
advertisements, such as service descriptions
(better than UDDI)
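The plugin idea above can be illustrated with a minimal sketch. It is not GUG's actual code: the names `Framework`, `EchoService`, `DictBackend` and `echo_plugin` are hypothetical, and the SOAP transport and threading are left out. It only shows a service class being loaded by name and handed a backend that does the real work.

```python
import importlib
import sys
import types


class Framework:
    """Toy stand-in for the single daemon that hosts all services."""

    def __init__(self):
        self.services = {}  # service name -> running service instance

    def load_service(self, module_name, class_name, backend):
        """Import a service class by name and start it with the given backend."""
        module = importlib.import_module(module_name)
        service_cls = getattr(module, class_name)
        self.services[class_name] = service_cls(backend)
        return self.services[class_name]

    def unload_service(self, class_name):
        """Stop a service; a Manager service could expose a call like this."""
        return self.services.pop(class_name, None)


class DictBackend:
    """Hypothetical backend: holds the function behind the service interface."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class EchoService:
    """Hypothetical service: defines the interface, delegates to its backend."""

    def __init__(self, backend):
        self.backend = backend

    def store(self, key, value):
        self.backend.put(key, value)
        return self.backend.get(key)


# Register the example service under a module name so it can be loaded
# by name, the way a plugin module found on disk would be.
plugin = types.ModuleType("echo_plugin")
plugin.EchoService = EchoService
sys.modules["echo_plugin"] = plugin

framework = Framework()
service = framework.load_service("echo_plugin", "EchoService", DictBackend())
print(service.store("greeting", "hello"))
```

Swapping `DictBackend` for another class with the same `put`/`get` methods changes the behaviour without touching the service interface, which is the separation the slide describes.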
8. GIS
● separates data and metadata
● advertisements: (metadata, data) tuple
● data: XML description of anything like service,
resource etc.
● metadata: source of data, TTL, etc.
● simple routing algorithm based on static peer list
and TTL (we like news feeds :)
● two main sources of data:
– services (get_description function),
– GIS backends: standalone data providers
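A rough sketch of the routing idea follows. It assumes the TTL is a hop count that is decremented on each forward; in GUG it may just as well be a lifetime in seconds. The class and field names (`Advertisement`, `GISNode`, `receive`) are illustrative and not taken from the GUG sources.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Advertisement:
    source: str  # metadata: who published the data
    ttl: int     # metadata: remaining hops (assumed meaning of TTL)
    data: str    # data: XML description of a service, resource, etc.


@dataclass
class GISNode:
    name: str
    peers: list = field(default_factory=list)  # static peer list
    known: dict = field(default_factory=dict)  # (source, data) -> advertisement

    def receive(self, adv):
        """Store an advertisement and forward it to peers while TTL remains."""
        key = (adv.source, adv.data)
        if key in self.known:
            return  # already seen: stops loops between peers
        self.known[key] = adv
        if adv.ttl > 0:
            forwarded = Advertisement(adv.source, adv.ttl - 1, adv.data)
            for peer in self.peers:
                peer.receive(forwarded)


# Four nodes in a chain: a -> b -> c -> d
a, b, c, d = (GISNode(n) for n in "abcd")
a.peers, b.peers, c.peers = [b], [c], [d]

a.receive(Advertisement("a", 2, "<service name='exec'/>"))
print([len(n.known) for n in (a, b, c, d)])
```

With a TTL of 2 the advertisement reaches the publishing node and the next two peers in the chain, but not the fourth node.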
9. GUG Core Services
● VOService (security)
– every entity identified by X509 cert
– every VO should set up at least one VO service
– manages authorization information, organized into a
tree
– manages VO membership like a mailing list
● Job management components
– Exec: runs and manages jobs on SMP systems (useful on
desktops)
– Job Controller: uses the GGF BES interface and GGF
JSDL. Interfaces with common LRMSs (e.g. Condor, Exec,
etc.), no scheduling
– SuperScheduler: uses the same interface and data model
as the Job Controller; it is a grid-level scheduler
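Since both the Job Controller and the SuperScheduler take GGF JSDL as input, a small job description helps to make this concrete. The sketch below builds one with Python's standard library, using the JSDL 1.0 namespaces; the job name and executable are made-up examples, and which JSDL elements GUG actually supports is not stated in the slides.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the GGF JSDL 1.0 specification.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"
ET.register_namespace("jsdl", JSDL_NS)
ET.register_namespace("jsdl-posix", POSIX_NS)


def build_jsdl(job_name, executable, arguments=()):
    """Return a minimal JSDL job definition as an XML string."""
    job = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL_NS}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = job_name
    app = ET.SubElement(desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg
    return ET.tostring(job, encoding="unicode")


# Hypothetical job: the name and command are examples only.
print(build_jsdl("testjob", "/bin/echo", ["hello", "grid"]))
```

Because the SuperScheduler shares the Job Controller's interface and data model, the same document could in principle be sent to either one.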
10. GUG Core Services
● Storage management components:
– file-based architecture
– Storage Controller: stores and returns files
using a transport-independent protocol like SRM
– ShareDirectory: directory and file sharing (same
interface as Storage Controller)
– File System Service: metadata catalog
– Storage Manager: provides a POSIX-like interface
(mkdir, ls, mv, cp, etc.), creates replicas on Storage
Controllers, manages file system entity types as
plugins: file, directory, shared directory, etc.
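How these components might fit together is sketched below as an in-memory toy. It is an illustration under several assumptions, not GUG's design: a plain dictionary stands in for the File System Service catalog, replicas are placed on the first N controllers, and entity-type plugins are not modelled. All class and method names are hypothetical.

```python
import posixpath


class StorageController:
    """Hypothetical Storage Controller: stores and returns file contents."""

    def __init__(self, name):
        self.name = name
        self.files = {}

    def put(self, path, content):
        self.files[path] = content

    def get(self, path):
        return self.files[path]


class StorageManager:
    """POSIX-like facade over a metadata catalog and several controllers."""

    def __init__(self, controllers, replicas=2):
        self.controllers = {c.name: c for c in controllers}
        self.replicas = replicas
        # Stands in for the File System Service (metadata catalog).
        self.catalog = {"/": {"type": "directory"}}

    def _require_parent(self, path):
        parent = posixpath.dirname(path)
        if self.catalog.get(parent, {}).get("type") != "directory":
            raise FileNotFoundError(parent)

    def mkdir(self, path):
        self._require_parent(path)
        self.catalog[path] = {"type": "directory"}

    def write(self, path, content):
        """Store the file on the first `replicas` controllers and record them."""
        self._require_parent(path)
        targets = list(self.controllers.values())[: self.replicas]
        for controller in targets:
            controller.put(path, content)
        self.catalog[path] = {"type": "file",
                              "replicas": [c.name for c in targets]}

    def read(self, path):
        """Fetch the file from the first recorded replica."""
        first = self.catalog[path]["replicas"][0]
        return self.controllers[first].get(path)

    def ls(self, path):
        return sorted(p for p in self.catalog
                      if p != path and posixpath.dirname(p) == path)


sm = StorageManager([StorageController(n) for n in ("sc1", "sc2", "sc3")])
sm.mkdir("/grid")
sm.mkdir("/grid/tmp")
sm.write("/grid/tmp/szoveg", b"12345678")
print(sm.ls("/grid/tmp"), sm.catalog["/grid/tmp/szoveg"]["replicas"])
```

A real Storage Manager would also need replica selection, failure handling and the other POSIX-like calls (mv, cp), which this sketch leaves out.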
11. GUG Services and UI
● Additional services:
– Compiler service: creates binaries from source for all
available platforms. Uses the job management components
● User Interface:
– modular command line interface: 'grid' command:
$ grid storage ls /grid/tmp R
/grid/tmp:
d 20060412 14:04 proba
/grid/tmp/proba:
8 20060412 14:05 szoveg
8 20060412 14:06 szoveg.1
8 20060412 14:06 masnev
$ grid job submit testjob.jsdl
– graphical and web interfaces coming soon
12. Real life says it's good
● Seamless transition to the new middleware: the UI
has an 'almost' compatible interface with the old
ClusterGrid broker
● GUG already tested with real life applications:
– virtual screening (pharmacy)
– nonlinear dynamics (physics)
– compiler optimization (IT research)
– usual ClusterGrid applications