    Session29 Arc Presentation Transcript

    • Introduction to ARC Middleware
      ISSGC’09, Sophia Antipolis, Nice, France
      Ivan Degtyarenko and Michael Gindonis
      CSC – IT Center for Science, Espoo, Finland
      July 11th, 2009
      ISSGC09, Sofia-Antipolis,France - Intro to ARC middleware, CSC – IT Center for Science Ltd. Slide 1 / 36
    • Today’s session
      What is it about? After a quick introduction, you will familiarize yourselves with ARC middleware through practical examples. By this point you have already covered grid middleware basics, X.509, certificates, proxies, virtual organizations, etc., so let’s dive in!
    • ARC Tutorial: timetable for this morning
      Time                                      Title
      Session I, 09:00 – 10:00, Amphitheater    Welcoming and Seminar Practicalities
                                                Applications on the Grid
                                                NorduGrid / ARC Middleware Overview
      Session II, 10:00 – 12:30, Class room     Off to the PC Class
                                                Hands-on ARC tutorial Exercises
                                                (break as required / when the coffee etc. is available)
    • Short introduction: A “Hello Grid” job with ARC
        $ grid-proxy-init        (generate proxy)
        $ ngsub -f hello.xrsl    (submit)
        $ ngstat -a              (monitor)
        $ ngget hello            (fetch the results)
      hello.xrsl:
        &
        (executable=hello.sh)
        (jobname=hello)
        (stdout=hello.out)
        (stderr=hello.err)
        (gmlog=gridlog)
        (cputime=10)
        (memory=200)
        (disk=1)
      hello.sh:
        #!/bin/sh
        echo "Hello Grid!"
    • Steps to start running on Grid
      Once:
      ● get an account for a system with a Grid User Interface installed (or install it on your own PC)
      ● request a certificate from a Certificate Authority (CA)
      ● install the certificate into ~/.globus/
      ● join a VO
      Every session:
      ● log in to the Grid (create a proxy)
      ● write a job description in a file
      ● check available resources (optional)
      ● submit the job
      ● monitor the progress of the job
      ● fetch the results
    • Privacy Note!
      When working on the Grid, you must accept that some information about your jobs and your Grid identity may be made public, for example via monitoring tools, e.g.:
      ● your name / affiliation
      ● IP address of your client computer
      ● job names and duration
      ● runtime environment
      ● other information
      Fortunately, for today you are relatively anonymous:
        /C=IT/O=GILDA/OU=Personal Certificate/L=Sophia Antipolis/CN=ISSGCXX
    • Security Policies
      ● policies vary between grids and VOs
      ● you will need to accept these terms to use these resources
      ● since you are in the Gilda VO you have already accepted its policy
      ● you will need to accept the M-grid Acceptable Use Policy, since some resources used in this tutorial are part of M-grid
    • The NorduGrid collaboration
      ● a community around the open source ARC Grid middleware
        − national Grids (e.g. M-grid, SweGrid, NorGrid), users also outside the Nordic countries
        − real users, real applications
        − implemented a production Grid system working non-stop since May 2002
        − open for anyone to participate
    • ARC Middleware
      ARC middleware (Advanced Resource Connector)
      ● open source, out-of-the-box Grid solution: software which enables production-quality computational and data Grids
      ● easily installable/buildable for a variety of distributions
        − non-intrusive server installation
      ● supports many common LRMS (batch systems)
        − Grid Engine, PBS/Torque, Platform LSF
      ● builds upon standard open source solutions such as OpenLDAP, OpenSSL, SASL and Globus Toolkit
        − adds services not provided by Globus, such as scheduling
        − extends or completely replaces some Globus components
    • ARC Middleware (cont.)
      ● provides a reliable implementation of the fundamental Grid services, such as information services, resource discovery and monitoring, job submission and management, brokering, data management and resource management
      ● integrates computing resources and storage elements via a secure Grid layer
      ● provides a light-weight standalone client, the User Interface (UI), which allows users to submit, manage and monitor jobs on the Grid, move data around and query resource info
      ● the UI's built-in broker selects the best resource for a job
      ● Grid job requirements are expressed in the extended Resource Specification Language (xRSL)
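
The matchmaking and brokering idea on this slide can be pictured with a toy example. This is only a sketch and not ARC's actual broker: the `Cluster` record, the cluster names, and the "most free CPUs wins" ranking rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cluster:
    # Hypothetical resource record; the real information system is far richer.
    name: str
    free_cpus: int
    memory_mb: int  # memory available to a job


def pick_cluster(clusters: List[Cluster], memory_mb: int) -> Optional[Cluster]:
    """Return the matching cluster with the most free CPUs, or None."""
    # Matchmaking: keep only clusters that satisfy the job requirement.
    candidates = [c for c in clusters
                  if c.memory_mb >= memory_mb and c.free_cpus > 0]
    # Brokering: rank the candidates (here simply by free CPUs).
    return max(candidates, key=lambda c: c.free_cpus, default=None)


clusters = [
    Cluster("cluster-a.example.org", free_cpus=4, memory_mb=512),
    Cluster("cluster-b.example.org", free_cpus=16, memory_mb=128),
    Cluster("cluster-c.example.org", free_cpus=8, memory_mb=1024),
]
best = pick_cluster(clusters, memory_mb=200)
print(best.name if best else "no matching cluster")
```

Here cluster-b has the most free CPUs but is filtered out first because it cannot satisfy the memory requirement.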
    • ARC Middleware Architecture
    • The not so short introduction: Installing the ARC client
      ● required to submit jobs to NorduGrid
      ● download from http://ftp.nordugrid.org/download/
        − binaries for various Linux distributions, source code also available
      ● the easiest way to install the client is to use the standalone version
        − uncompress in a directory (no root privileges required):
            $ tar zxvf nordugrid-standalone-<latest>.i386.tgz
        − run the environment setup script:
            $ cd nordugrid-standalone-<latest>
            $ . ./setup.sh
      ● RPM packages are recommended for multi-user installations
    • Requesting and Installing the Grid Certificate
      ● create a certificate request
          $ grid-cert-request -int
        − generates the .globus subdirectory with a key (userkey.pem) and the request (usercert_request.pem)
        − identity string: e.g. /O=Grid/O=NorduGrid/OU=bccs.uib.no/CN=Per Hansen
        − remember to select a good passphrase and keep the key secret!
      ● send the file ~/.globus/usercert_request.pem to a Certification Authority (CA)
        − see the instructions at your local site / country for which CA to contact
      ● wait for an answer from the CA
        − the signed certificate returned by the CA should be saved as the file ~/.globus/usercert.pem
    • Logging in to the Grid
      ● "Log in": grid-proxy-init
        − the command does not actually log in anywhere, but decrypts the private key and uses it to create a time-limited proxy
        − the proxy is used for authenticating to the resources
      ● "Log out": grid-proxy-destroy
        − destroys the proxy
      ● "whoami": grid-proxy-info
        − shows information about the validity of the proxy
            subject  : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis/CN=413289378
            issuer   : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis
            identity : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis
            type     : Proxy draft (pre-RFC) compliant impersonation proxy
            strength : 512 bits
            path     : /tmp/x509up_u7060
            timeleft : 11:59:39
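
Because the proxy is time-limited, it can be useful to check that the remaining lifetime covers the job you are about to submit. The sketch below assumes the `timeleft` value has the `H:MM:SS` form seen in the sample output above; the function names are hypothetical and not part of any ARC or Globus tool.

```python
def timeleft_seconds(timeleft: str) -> int:
    """Convert an 'H:MM:SS' timeleft value into seconds."""
    hours, minutes, seconds = (int(part) for part in timeleft.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def proxy_covers(timeleft: str, job_minutes: int) -> bool:
    """True if the proxy outlives a job expected to take job_minutes."""
    return timeleft_seconds(timeleft) >= job_minutes * 60


print(timeleft_seconds("11:59:39"))              # 43179
print(proxy_covers("11:59:39", job_minutes=10))  # True
```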
    • Writing a job description file
      ● Resource Specification Language (RSL) files are used to specify job requirements and parameters for submission
        − NorduGrid uses an extended language (xRSL) based on the Globus RSL
      ● similar to scripts for local batch systems, but include some additional attributes
        − job name
        − executable location and parameters
        − location of input and output files of the job
        − architecture, memory, disk and CPU time requirements
        − runtime environment requirements
    • xRSL example
      ● hellogrid.sh
          #!/bin/sh
          echo "Hello Grid!"
      ● hellogrid.xrsl
          &
          (executable=hellogrid.sh)
          (jobname=hellogrid)
          (stdout=hello.out)
          (stderr=hello.err)
          (gmlog=gridlog)
          (cputime=10)
          (memory=200)
          (disk=1)
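
When many similar jobs are needed, a description like the one above can be generated rather than typed. The following is a minimal sketch, assuming only flat `(name=value)` attributes as on the slide; `to_xrsl` is a hypothetical helper, and it does no quoting or escaping, so values containing spaces or special characters would need extra care.

```python
def to_xrsl(attributes: dict) -> str:
    """Render a flat attribute mapping as '&' followed by (name=value) lines."""
    lines = ["&"]
    for name, value in attributes.items():
        lines.append(f"({name}={value})")
    return "\n".join(lines)


job = {
    "executable": "hellogrid.sh",
    "jobname": "hellogrid",
    "stdout": "hello.out",
    "stderr": "hello.err",
    "gmlog": "gridlog",
    "cputime": 10,
    "memory": 200,
    "disk": 1,
}
print(to_xrsl(job))
```

Writing the returned text to `hellogrid.xrsl` gives a file with the same attributes as the slide's example.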
    • Submitting the job
      ● submit the job
          $ ngsub -d 1 -f hellogrid.xrsl
      ● a job id is returned
          => Job submitted with jobid gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307
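
The job id is a URL, so a script that wraps submission can pick it apart with standard tools. This sketch assumes the output line has exactly the form shown on the slide ("Job submitted with jobid" followed by the URL); `parse_jobid` is a hypothetical helper.

```python
from urllib.parse import urlsplit

PREFIX = "Job submitted with jobid "


def parse_jobid(line: str) -> dict:
    """Split the job id from a submission line into host, port and job number."""
    if not line.startswith(PREFIX):
        raise ValueError("not a job submission line")
    url = line[len(PREFIX):].strip()
    parts = urlsplit(url)
    return {
        "url": url,
        "host": parts.hostname,
        "port": parts.port,
        # The last path component identifies the job on that cluster.
        "job_number": parts.path.rsplit("/", 1)[-1],
    }


info = parse_jobid(
    "Job submitted with jobid "
    "gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307"
)
print(info["host"], info["port"], info["job_number"])
```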
    • ARC Grid Monitor
      ● shows currently connected resources
      ● almost all elements are "clickable"
        − browse queues and job states by cluster
        − list jobs belonging to a certain user
      ● no authentication, anyone can browse the info
        − privacy issues
    • Monitoring the Job
      ● query the status using the command line
          $ ngstat hellogrid
          => Job gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307
               Jobname: hellogrid
               Status: INLRMS:Q
        − most common status values are ACCEPTED, PREPARING, INLRMS:Q, INLRMS:R, FINISHING, FINISHED
      ● or use the Grid Monitor
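
A script that waits for a job typically only needs the `Status:` field. The sketch below parses text shaped like the sample output above; treating the slide's list of status values as an ordered progression is an assumption, and the helper names are hypothetical.

```python
# Status values listed on the slide, in the order they are given there.
STATES = ["ACCEPTED", "PREPARING", "INLRMS:Q", "INLRMS:R", "FINISHING", "FINISHED"]


def extract_status(ngstat_output: str) -> str:
    """Return the value of the 'Status:' line in ngstat-style output."""
    for line in ngstat_output.splitlines():
        # Split at the first colon only, so 'INLRMS:Q' stays in one piece.
        key, _, value = line.strip().partition(":")
        if key == "Status":
            return value.strip()
    raise ValueError("no Status line found")


def is_finished(status: str) -> bool:
    """True once the job has reached the last state in the list."""
    return status == STATES[-1]


sample = """Job gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307
  Jobname: hellogrid
  Status: INLRMS:Q"""
status = extract_status(sample)
print(status, STATES.index(status), is_finished(status))
```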
    • Fetching the results
      ● print the job output
          $ ngcat hellogrid
        − shows the standard output of the job
        − this can also be done while the job is running
      ● download the result files
          $ ngget hellogrid
          => ngget: downloading files to /home/ajt/455611239779372141331307
             ngget: download successful - deleting job from gatekeeper.
    • Using a storage element
      ● Storage Elements are disk servers accessible via the Grid
        − can be used to store job output while the user is logged out and the client machine is disconnected from the Grid
      ● input files can be stored close to the cluster where the program is executed, on a high-bandwidth network
      ● files can be local and remote in the same job:
          (inputFiles=
            ("input1", "/home/user/myexperiment")
            ("input2", "gsiftp://se.example.com/files/data"))
          (outputFiles=
            ("output", "gsiftp://se.example.com/mydir/result1")
            ("prog.out", "gsiftp://se.example.com/mydir/stdout"))
          (stdout="prog.out")
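
Since local and remote files can be mixed in one job, a pre-submission check may want to tell them apart, for example to verify that local files exist on the client. A minimal sketch, assuming that a remote location is anything with a URL scheme and a host; `is_remote` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def is_remote(location: str) -> bool:
    """Treat a location as remote if it carries a URL scheme and a host."""
    parts = urlsplit(location)
    return bool(parts.scheme and parts.netloc)


input_files = {
    "input1": "/home/user/myexperiment",
    "input2": "gsiftp://se.example.com/files/data",
}
for name, location in input_files.items():
    kind = "remote" if is_remote(location) else "local"
    print(f"{name}: {kind}")
```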
    • Runtime environments
      ● software packages which are preinstalled on a computing resource and made available through the Grid
        − just send the data and/or parameters to be processed
        − useful if there are many users of the same software or if the same program is used frequently
        − allows local platform-specific optimizations
          ● for a specific CPU or parallel environment
          ● perhaps in the near future… GPUs, CUDA
      ● required runtime environments can be specified in the job description file, for example:
          (runtimeenvironment=APPS/GRAPH/POVRAY-3.6)
      ● Runtime Environment Registry:
        − http://www.csc.fi/grid/rer/
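
A job can only run where its runtime environments are installed, which amounts to a set comparison. The sketch below is illustrative only: the lists of environments advertised by the two clusters are invented, and `missing_environments` is a hypothetical helper rather than anything provided by ARC.

```python
def missing_environments(required: list, installed: list) -> list:
    """Return the required runtime environments a cluster does not advertise."""
    return sorted(set(required) - set(installed))


required = ["APPS/GRAPH/POVRAY-3.6"]
# Hypothetical lists of environments advertised by two clusters.
cluster_a = ["APPS/GRAPH/POVRAY-3.6", "ENV/EXAMPLE-1.0"]
cluster_b = ["ENV/EXAMPLE-1.0"]

print(missing_environments(required, cluster_a))  # nothing missing
print(missing_environments(required, cluster_b))  # POV-Ray is missing
```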
    • ARC / NorduGrid / M-grid references
      NorduGrid (resource monitor, presentations, tutorials, docs, …)
        http://nordugrid.org/
      ARC middleware
        http://nordugrid.org/middleware
        User guide: http://www.nordugrid.org/documents/ui.pdf
        User support mailing list: nordugrid-support at nordugrid.org
      M-grid (Finnish National Grid)
        http://www.csc.fi/english/research/Computing_services/grid_environments/mgrid
        https://extras.csc.fi/mgrid/
        Support email at CSC: grid-support at csc.fi
        Regular ARC training by CSC: http://www.csc.fi/english/csc/courses
    • Do I need to change my application to use ARC?
      Three different approaches:
      ● using the application as is: grid middleware will move the executable and the data to the target system
        − library dependencies often need to be resolved by linking statically or packing them to go with the application
      ● installing the application on the target system and using it via the Grid interface
        − batch processing type applications normally work without changes, interactive applications are more difficult
        − with ARC middleware this is facilitated by runtime environments (RE)
      ● modifying the application to fully exploit a distributed environment
        − using ARC libraries
        − distributing over a large geographical area is not practical unless the computation can be split into independent parts
    • Real life applications
      ● it's common to send several smaller jobs to the Grid to solve a larger problem
      ● parallel MPI jobs to a single cluster are supported (if the correct runtime environment is installed), but no MPI between clusters
      ● splitting the job into suitable parts and gathering the parts together is left to the user
        − more error-prone environment than traditional local systems => error checking and recovery are important
        − fault reporting and debugging have room for improvement
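
Since splitting, gathering and error recovery are left to the user, a small amount of bookkeeping helps. The sketch below splits a range of work items into near-equal independent parts and lists the parts that still need to be resubmitted; both helpers and the `results` mapping are hypothetical.

```python
def split_range(total: int, parts: int) -> list:
    """Split range(total) into contiguous, near-equal (start, stop) pairs."""
    if parts < 1 or total < 0:
        raise ValueError("need parts >= 1 and total >= 0")
    base, extra = divmod(total, parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def needs_resubmit(results: dict) -> list:
    """Return the chunk indices whose result is missing (None)."""
    return sorted(i for i, value in results.items() if value is None)


chunks = split_range(10, 3)
print(chunks)                   # [(0, 4), (4, 7), (7, 10)]
results = {0: "ok", 1: None, 2: "ok"}
print(needs_resubmit(results))  # [1]
```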
    • Real life applications
      ● size your job to best exploit the grid
        − group many short jobs into one to avoid submission overhead
        − if possible, break up larger or longer jobs into independent parts
        − if your job must run for a long time, checkpoint your results so that your calculation can be resumed; no resource will stay up indefinitely…
        − M-grid is ideally suited to jobs of length 1 hour to 1 day
      ● use file caching if it is available
        − eliminate unnecessary file transfers (load on network)
        − save time needed to stage files
        − save disk space on the cluster front-ends
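
Grouping many short tasks into one job can be done greedily. In this sketch the 60-minute target comes from the lower end of the "1 hour to 1 day" guideline above; the task durations are made-up estimates and `group_tasks` is a hypothetical helper.

```python
def group_tasks(durations_min: list, target_min: float = 60.0) -> list:
    """Greedily pack task durations (minutes) into jobs of at least target_min."""
    jobs, current, total = [], [], 0.0
    for duration in durations_min:
        current.append(duration)
        total += duration
        if total >= target_min:
            # The job is long enough; close it and start a new one.
            jobs.append(current)
            current, total = [], 0.0
    if current:
        # Leftover tasks form a final, shorter job.
        jobs.append(current)
    return jobs


print(group_tasks([20, 25, 30, 10, 45, 15, 5]))
# [[20, 25, 30], [10, 45, 15], [5]]
```

Seven submissions become three, at the cost of the last job being shorter than the target.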
    • Further development of ARC middleware
      ● Stated goal: not to undermine existing functionality and capabilities available in “pre-…”ARC components (current stable version)
      ● Two SVN branches
        ● ARC0 (version 0.6.5, 0.8rc)
          − pre-existing production components (pre-KnowARC project)
          − backported features from KnowARC
          − Nordic DataGrid Facility provides support and backports features from the KnowARC project into the current stable releases of ARC
        ● ARC1 (0.9.xxx)
          − next generation components developed by the KnowARC project
      ● More information at www.ndgf.org and www.knowarc.eu
    • What is new
      ● Service Oriented Architecture
      ● modular structure
      ● self-sufficient core components
      ● interoperability built on standards
      ● user and developer friendly
      ● business-friendly open source license: Apache 2.0
      ● portable: runs on almost all Linux variants and Solaris; porting to Windows and Mac OS in progress
      ● aiming at integration into Fedora, Debian and Ubuntu
      07/11/09 www.knowarc.eu 28
    • ARC WS-based components: internal structure of ARC components
    • Key Feature - New ARC client
      Relies on a dedicated library
      ● implemented in C++
      ● Python and Java bindings
      ● allows easy development of application-specific clients
      Implements a user Grid toolbox
      ● handling of user & host credentials
      ● computing resource discovery & information retrieval
      ● matchmaking & brokering & job submission
      ● input/output data handling
      The new library and arc* commands can handle glite-CREAM and UNICORE
      Windows and Mac OS client GUI – user interface, just delivered!
    • Key Feature - HED
      HED – The Hosting Environment Daemon
      Container for all the server-side functional components
      Main functions:
      ● route messages between the services and the outside world
      ● provide inter-service communication
      Provides a basic security infrastructure
      Consists of pluggable modules
      Light-weight (no Apache, no Axis)
    • Key Service – A-Rex
      ARC Resource-coupled Execution Service
      ● provides Execution Management capability
      ● the Grid Manager from ARC Classic as core
      ● extended with a WS interface implementing the Basic Execution Service (BES)
      ● accepts Job Submission Description Language (JSDL)
      ● information and resource discovery – GLUE 2 schema
      Support for a wide range of Local Resource Management Systems:
      ● Torque, PBS/OpenPBS, SGE, LoadLeveler, LSF, Condor and SLURM
      Released in ARC 0.8, available at: http://wiki.nordugrid.org/index.php/ARC_v0.8
    • Key Service – New Storage
      ‘Distributed by Design’ storage system
        − global namespace
        − supports collections and subcollections to any depth
      A-Hash – a replicated database to store metadata
      Librarian – handles:
        − metadata and hierarchy of collections and files
        − the location of replicas
        − health data of the Shepherd services
      Bartender – high-level interface for the users and for other services
      Shepherd – manages storage services, and provides a simple interface for storing files on storage nodes
    • Welcome to ARC
      Let’s begin… Off to the PC classroom! (unless the coffee is ready)
    • Abstracting the middleware
      http://technical.eu-egee.org/index.php?id=290
      ● expand the functionality of the grid infrastructure for users,
      ● reduce duplicated development when porting applications, and
      ● speed the porting of new applications to the grid.
      GridWay Metascheduler (http://www.gridway.org/)
        The GridWay Metascheduler performs job execution management and resource brokering, allowing unattended, reliable, and efficient execution of jobs, job arrays, and workflows on heterogeneous and dynamic Grids.
      P-GRADE Portal (http://portal.p-grade.hu/)
        The Parallel Grid Run-time and Application Development Environment Portal (P-GRADE Portal) is a workflow-oriented graphical environment that covers every stage of Grid application lifecycles.
      Ganga (http://ganga.web.cern.ch/ganga/)
        Ganga is an easy-to-use frontend for job definition and management, implemented in Python. Ganga allows trivial switching between testing on a local batch system and large-scale processing on Grid resources.