Experiences With Globus On Das 2 In An Educational Setting

  • Experiences with Globus on DAS-2 in an educational setting Herbert Bos & Lex Wolters LIACS, Leiden University {herbertb,llexx}@liacs.nl
  • Seminar Grid Computing
    • Fall 2001
    • 11 students (year 3 or 4) started, 8 finished
    • Once a week, 2 hours
    • 13 classes
    • Programming assignment
    • Goals
    • Try to separate Grid hype from Grid reality
    • Show the underlying technologies that are currently being developed and used to provide a 'pervasive computing grid'
  • Topics by lecturers
    • What is Grid Computing?
    • Requirements
    • History
    • Grid architecture
    • Basic Services
    • Taxonomy of Grids
    • QoS (final class)
  • Presentations by students
    • Legion
    • Globus
    • Resource management: GRAM
    • Scheduling: AppLeS
    • Communication (Nexus, etc.)
    • Information service: MDS, GRIS, GIIS
    • Data access: GASS + RSL
    • Security: GSS-API
    • Language support
  • Programming assignment
    • Using Globus to implement a Grid application:
      • Computation chopped up in subtasks which are distributed to computational nodes
      • Final result is combination of results of subtasks
      • Resource discovery
      • At least one of the following options:
        • Data is distributed in secure fashion
        • Incorporate costs
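
The decomposition the assignment asks for (chop up, distribute, combine) can be sketched without Globus. This is a minimal illustration only: a local thread pool stands in for GRAM job submission, and the `split`, `subtask` and `run` names, as well as the sum-of-squares workload, are hypothetical and not taken from any student project.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Chop a list into at most `parts` contiguous subtasks."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def subtask(chunk):
    """Hypothetical unit of work: sum of squares over one chunk."""
    return sum(x * x for x in chunk)


def run(items, workers=4):
    """Distribute subtasks to workers and combine the partial results."""
    chunks = split(items, workers)
    # Assumption: a local thread pool replaces remote computational nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(subtask, chunks))
    return sum(partials)
```

On a real grid, `pool.map` would be replaced by job submission to discovered resources, and the combination step would have to cope with subtasks that fail or arrive late.
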
  • Topics
    • Willem de Bruijn: Distributed Evolutionary Algorithm
    • Hongqin Chen: RSA Key Breaking
    • Jeroen Laros: GridCrafty
    • Hui Li: Parallel Fractal Image Generation
    • Yafei Sun: Adaptive Quadrature
    • Arjan Tijms & Shlomo Raikin: Parallel Genetic Algorithm
  • Situation
    • Delivery of DAS-2 delayed
    • Globus installation on SUN server
      • System-managers unfamiliar with Globus
      • Incorrect installation, e.g. certificates
    • Jan 21, 2002: DAS-2 operational
      • Platform for students
      • Focus on PBS, MPI; not Globus
  • Distributed Evolutionary Algorithm
    • Purpose: minimize an arbitrary function
    • Strategy:
      • self-adaptation, no distinction between worker and controller nodes
      • predefined number of runs
    • Language: C++
    • Modules:
      • communication: Globus IO
      • resource management: GRAM
    • Results:
      • master/slave set-up gives the best results in the shortest time-span
      • other strategies increase self-adaptiveness, but give worse results in the current setting
  • Distr. Evolutionary Algorithm (cont’d)
    • Problems:
      • Distinction between fileserver and compute node: starting up new processes
      • Wall-time limit (60 s) of the scheduler cannot be altered (not even by maxTime in RSL): waiting processes are killed
    • Suggestions for improvement:
      • Symbolic links to Globus libraries
      • Documentation on Globus:
        • Overall idea is neglected
        • Q&A forum, globus.org
  • RSA Key Breaking
    • Purpose: factoring large numbers
    • Strategy:
      • Pollard’s Rho factoring algorithm
      • Master/slave framework
    • Language: C
    • Modules:
      • Communication: Nexus
      • Job allocation: GRAM and PBS
    • Results:
      • Significant speed-ups, depending on work-load/distribution
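
Pollard's Rho, named in the strategy above, can be sketched in a few lines. This is the textbook Floyd cycle-finding variant, not the student's C code. The slides do not say how work was divided among slaves; giving each slave a different polynomial constant `c` is one plausible split and is an assumption here, as are the function names.

```python
from math import gcd


def pollard_rho(n, c=1, x0=2, max_steps=1_000_000):
    """Return a non-trivial factor of composite n, or None if this (c, x0) fails."""
    if n % 2 == 0:
        return 2
    x = y = x0
    for _ in range(max_steps):
        x = (x * x + c) % n            # tortoise: one step
        y = (y * y + c) % n
        y = (y * y + c) % n            # hare: two steps
        d = gcd(abs(x - y), n)
        if d == n:
            return None                # cycle closed without exposing a factor
        if d > 1:
            return d
    return None


def factor(n, tries=20):
    """Try several constants c in turn (assumed: slaves would do this in parallel)."""
    for c in range(1, tries + 1):
        d = pollard_rho(n, c=c)
        if d is not None:
            return d
    return None
```

Because runs with different `c` are independent, the observed speed-up depends on how evenly the attempts are spread over the slaves, which matches the remark about work-load and distribution.
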
  • RSA Key Breaking (cont’d)
    • Problems:
      • Start-up
        • Difficulty obtaining the correct certificate
        • Libraries were not installed correctly
        • Functions were not available
      • ‘Real’ problems
        • GRAM macro-definitions not in corresponding header-file
      • Documentation
        • Lack of practical guidelines and examples
  • GridCrafty
    • Purpose: shell script which parallelises the chess engine Crafty
    • Strategy:
      • Master: all possible moves; worker: grade moves
    • Modules:
      • Storage access: GASS, globus_rcp, OpenSSH
    • Results:
      • Because of problems with the Globus implementation, Globus was also bypassed entirely, which gave a speed-up of 17.5 (theoretical maximum 22)
  • GridCrafty (cont’d)
    • Problems:
      • start-up
        • GASS did not work properly
        • globus_rcp was not installed
        • OpenSSH did not work
      • ‘Real’ problem
        • Scheduling of tasks takes a lot of time
    • Final implementation:
      • connect to all nodes; query load:
        • Static: < 10% host free
        • Dynamic: clients check the load before starting intensive calculations
      • ssh implementation much faster than Globus (speed-ups 17.5 versus 5-9)
  • Parallel Fractal Image Generation
    • Purpose: see title
    • Strategy
      • Master distributes work, collects output, draws the image
      • Slaves calculate points line-wise
    • Language: C and C++
    • Modules
      • Resource management: GRAM
      • Communication: MPI
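
The line-wise master/slave split described above can be sketched as follows. The slides only say "fractal"; the Mandelbrot set, the image size, and all names are assumptions for illustration, and a local thread pool stands in for MPI communication.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT, MAX_ITER = 60, 24, 50


def compute_line(row):
    """Slave: escape-time iteration counts for every point on one image line."""
    y = -1.2 + 2.4 * row / (HEIGHT - 1)
    counts = []
    for col in range(WIDTH):
        c = complex(-2.0 + 3.0 * col / (WIDTH - 1), y)
        z, n = 0j, 0
        while abs(z) <= 2.0 and n < MAX_ITER:
            z = z * z + c
            n += 1
        counts.append(n)
    return row, counts


def render(workers=4):
    """Master: hand out lines, collect the results, and assemble the image."""
    image = [None] * HEIGHT
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for row, counts in pool.map(compute_line, range(HEIGHT)):
            image[row] = counts
    return image


def draw(image):
    """Master: draw the image as text; '#' marks points that never escaped."""
    return "\n".join(
        "".join("#" if n == MAX_ITER else "." for n in line) for line in image
    )
```

Lines are a convenient unit of work because they are independent of each other; lines that cross the set take longer than lines outside it, so handing them out one at a time balances the load better than fixed blocks.
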
  • Par. Fractal Image Generation (cont’d)
    • Problem
      • Conflict between the current MPI set-up and the GRAM job submit script (temporarily fixed, only on the UvA cluster)
    • Suggestions for improvement:
      • Installation of MPICH-G2
      • Where can one find good examples on exploiting Globus to get started?
  • Adaptive Quadrature
    • Purpose: calculate the quadrature of the curve of an arbitrary function
    • Strategy:
      • Divide curve into smaller ones
      • Ring of processes
      • Results via files
    • Language: gcc
    • Modules
      • Process control and allocation: DUROC
      • Communication: Nexus
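
The idea of dividing the curve into smaller pieces can be illustrated with adaptive Simpson quadrature. The slides do not name the rule the student used, so Simpson's rule is an assumption, as are the function names; the ring of processes and the file-based result exchange are not modelled.

```python
import math


def _simpson(f, a, b):
    """Simpson's rule on [a, b]."""
    m = (a + b) / 2.0
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))


def adaptive_quadrature(f, a, b, tol=1e-8, depth=50):
    """Integrate f over [a, b], splitting an interval only where the estimate is poor."""
    m = (a + b) / 2.0
    whole = _simpson(f, a, b)
    left, right = _simpson(f, a, m), _simpson(f, m, b)
    if depth == 0 or abs(left + right - whole) <= 15.0 * tol:
        # Accept, with the usual Richardson correction term.
        return left + right + (left + right - whole) / 15.0
    # Each half is an independent subtask; here they are simply recursed into.
    return (adaptive_quadrature(f, a, m, tol / 2.0, depth - 1)
            + adaptive_quadrature(f, m, b, tol / 2.0, depth - 1))


# Usage: the integral of sin over [0, pi] is 2.
print(adaptive_quadrature(math.sin, 0.0, math.pi))
```

The two recursive calls do not depend on each other, which is what makes the method a candidate for distribution: each sub-interval can be handed to another process.
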
  • Adaptive Quadrature (cont’d)
    • Problems
      • Start-up
        • Getting the correct certificate
        • Using the right RSL parameter (hostCount)
      • ‘Real’ problem
        • Conflict between duroc_runtime_barrier and PBS: fixed only on UvA-cluster
    • Suggestions for improvement
      • Info on different communication techniques
  • Parallel Genetic Algorithm
    • Purpose: improving results of GAs
    • Strategy
      • Start independent searches at different locations of the solution landscape
      • Periodically exchange the fittest individuals
      • Init process: job dispatching and bootstrap communication set-up
      • Master process: relay for communications, synchronizes the start of worker processes, collects final results, and sets up GUI for monitoring and progress display
      • Worker processes: each runs a single N-generation run
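
The strategy above resembles what is commonly called an island model. A minimal single-process sketch follows; the objective function, the mutation scheme, the ring migration topology, and all names and parameter values are assumptions, and the init and master processes (relay, synchronisation, GUI) are not modelled.

```python
import random


def fitness(ind):
    """Hypothetical objective: higher is better, optimum at the origin."""
    return -sum(x * x for x in ind)


def evolve(pop, rng, sigma=0.3):
    """One generation: keep the best individual, refill with mutated parents."""
    pop = sorted(pop, key=fitness, reverse=True)
    parents = pop[: max(1, len(pop) // 2)]
    children = [[x + rng.gauss(0.0, sigma) for x in rng.choice(parents)]
                for _ in range(len(pop) - 1)]
    return [pop[0]] + children          # elitism: the best never gets worse


def island_ga(islands=4, size=10, dims=3, generations=40, interval=10, seed=0):
    """Independent searches that periodically exchange their fittest individuals."""
    rng = random.Random(seed)
    pops = [[[rng.uniform(-5, 5) for _ in range(dims)] for _ in range(size)]
            for _ in range(islands)]
    for gen in range(1, generations + 1):
        pops = [evolve(pop, rng) for pop in pops]
        if gen % interval == 0:
            best = [max(pop, key=fitness) for pop in pops]
            for i, pop in enumerate(pops):
                # Ring migration: replace the worst with the neighbour's best.
                worst = min(range(size), key=lambda j: fitness(pop[j]))
                pop[worst] = list(best[(i - 1) % islands])
    return max((ind for pop in pops for ind in pop), key=fitness)
```

Between exchanges the islands need no communication at all, so each one maps naturally onto a worker process; only the migration step requires messages.
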
  • Parallel Genetic Algorithm (cont’d)
    • Language: C and C++
    • Modules
      • Communication: Nexus (RPC)
      • Job submission: GRAM
      • Thread creation: Globus_Common
    • Preliminary results
      • Parallel algorithm achieves results that are 8-17% better than those of the sequential algorithm
  • Parallel Genetic Algorithm (cont’d)
    • Problems
      • Start-up
        • Environment and path setting
        • Obtaining certificates
        • Who is responsible for Globus on DAS-2?
        • Different versions of Globus (1.1.3 versus 2.0 beta)
      • ‘Real’ problems
        • Shared libraries are not installed on the nodes
        • Delegating proxies
        • Information about resource availability is static or not present
        • Globus 2.0 is a beta version: some things are not implemented or are missing
  • Parallel Genetic Algorithm (cont’d)
    • Suggestions for improvement:
      • Default Globus environment
      • Globus libraries on nodes via
        • NFS partition
        • Symbolic link to the ‘strange’ globus-edg beta 2.1 names
      • ‘fork’ is the default jobmanager, which only ‘schedules’ jobs to the local file server (adding PBS makes the code dependent on this scheduler)
      • Installation of a cluster monitor better than beowulf
      • Examples and makefiles
  • Conclusions
    • Seminar quite successful
    • DAS-2
      • Great environment for teaching purposes
      • Start-up problems
      • Current setting not optimal
      • Who is responsible for DAS-2?
      • Who determines policies, implementations?
    • Globus
      • Documentation, examples (probably better with current training material on globus.org)
      • Installation not trivial
    • IBM
      • Pre-sales OK, after-sales???
  • Thanks
    • Many thanks to David Groep, who helped our students many, many times without any hesitation! Great job!