GridPP5 ATLAS Inst



  1. GridPP CM, ICL, 16 September 2002, Roger Jones
  2. EDG Integration <ul><li>EDG decision to put the short-term focus of effort on making the ATLAS DC1 production work on release 1.2 and after </li></ul><ul><li>Good input from the EDG side. Effort from ATLAS, especially the UK, Italy and CERN (notably Frederic Brochu, but also others: Stan Thompson, RJ, and Alvin Tam joining now) </li></ul><ul><ul><li>The submission seems to work ‘after a fashion’ using multiple UIs and one production site </li></ul></ul><ul><ul><li>More problems using multiple production sites, especially with data replication </li></ul></ul><ul><ul><li>Inability to access Castor using Grid tools was a major problem, now fixed </li></ul></ul><ul><ul><li>Large file sizes were also a problem </li></ul></ul><ul><ul><li>Interfaces to catalogues need work </li></ul></ul><ul><li>An encouraging effort; expect a full production-quality service for demonstrations in November </li></ul><ul><li>Note: analysis requires more work – the code is not already ‘out there’ </li></ul><ul><li>Integration with other ATLAS Grid efforts is needed </li></ul><ul><li>Evaluation session on Thursday at the RHUL Software Week </li></ul>
  3. Exercises <ul><li>Common ATLAS environment pre-installed on sites </li></ul><ul><li>Exercise 1: simulation jobs </li></ul><ul><ul><li>50 jobs, 5000 events each, 5 submitters </li></ul></ul><ul><ul><li>One-by-one submission because of RB problems </li></ul></ul><ul><ul><li>Running restricted to CERN </li></ul></ul><ul><ul><li>1 job: 25 h real time, 1 GB output </li></ul></ul><ul><ul><li>Output stored on the CERN SE </li></ul></ul><ul><li>Exercise 2, with a patched RB and massive submission </li></ul><ul><ul><li>250 shorter jobs (several hours, 100 MB output) </li></ul></ul><ul><ul><li>CERN, CNAF, CCIN2P3, Karlsruhe, NIKHEF, RAL </li></ul></ul><ul><ul><li>One single user </li></ul></ul><ul><ul><li>Output back to CERN – SE space is an issue </li></ul></ul><ul><li>Also have ‘fast simulation in a box’ running on the EDG testbed, a good exercise for analysis/non-installed tasks </li></ul>
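The submission pattern above can be sketched as a minimal EDG-style JDL job description. This is an illustrative fragment only: the script, file and attribute values are hypothetical, not the actual DC1 configuration.

```
Executable    = "atlsim.sh";                // hypothetical wrapper script
Arguments     = "5000";                     // events per job, as in Exercise 1
StdOutput     = "atlsim.out";
StdError      = "atlsim.err";
InputSandbox  = {"atlsim.sh"};
OutputSandbox = {"atlsim.out", "atlsim.err"};
```

With a working RB, many such files can be submitted in a loop; the one-by-one submission of Exercise 1 worked around RB problems rather than any limit of the JDL itself.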
  4. Issues <ul><li>There is no defined ‘system software’ </li></ul><ul><li>The system managers cannot dictate the shells, compilers, etc. to use </li></ul><ul><li>Multiple users will lead to multiple copies of the same tools unless a system is available that advertises what is installed and where (PACMAN does this, for instance) </li></ul><ul><li>A user service requires binaries to be ‘installed’ on the remote sites – trust has to work both ways </li></ul>
  5. Software Away from CERN <ul><li>Several cases: </li></ul><ul><ul><li>Software copy for developers at remote sites </li></ul></ul><ul><ul><li>Software (binary) installation for Grid productions </li></ul></ul><ul><ul><li>Software (binary) download for Grid jobs </li></ul></ul><ul><ul><li>Software (source) download for developers on the Grid </li></ul></ul><ul><li>Initially we rely on by-hand packaging and installation </li></ul><ul><li>True Grid use requires automation to be scalable </li></ul><ul><li>The task decomposes into three requirements: </li></ul><ul><ul><li>Relocatable code and environment </li></ul></ul><ul><ul><li>Packaging of the above </li></ul></ul><ul><ul><li>A deployment tool (something more than human+ftp!) </li></ul></ul>
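The three requirements can be illustrated with a toy shell sketch, assuming a hypothetical release name and a prefix-relative setup script (nothing here reflects the actual ATLAS layout):

```shell
set -e
REL=dc1-test                        # hypothetical release name
PREFIX="$PWD/atlas-$REL"            # 1. relocatable: everything under one movable prefix
mkdir -p "$PREFIX/bin"
# environment derives paths from its own location, not a hard-wired /afs path
printf 'export ATLAS_ROOT="$(cd "$(dirname "$0")" && pwd)"\n' > "$PREFIX/setup.sh"
# 2. packaging: bundle code and run-time environment into a single archive
tar -czf "atlas-$REL.tar.gz" -C "$PWD" "atlas-$REL"
# 3. deployment: here a plain unpack; in real use a Grid-aware tool (e.g. PACMAN)
tar -xzf "atlas-$REL.tar.gz" -C "${TMPDIR:-/tmp}"
```

The point of step 1 is that steps 2 and 3 only become mechanical once no path in the code or environment assumes a particular site.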
  6. World-Relocatable Code <ul><li>Six months ago, ATLAS code was far from deployable </li></ul><ul><li>Must be able to work with several cases: </li></ul><ul><ul><li>AFS for installation </li></ul></ul><ul><ul><li>No AFS available </li></ul></ul><ul><ul><li>AFS present but not to be used for the installation because of speed (commonplace!) </li></ul></ul><ul><li>Significant improvement in this area, with help from John Couchman, John Kennedy, Mike Gardner and others </li></ul><ul><li>Big effort in the reduction of package dependencies – work with US colleagues </li></ul><ul><li>However, it takes a long time for this knowledge to become the default – central procedures are being improved (Steve O’Neale) </li></ul><ul><li>For the non-AFS installation the cvsupd mechanism seems to work generally, but patches are needed for, e.g., Motif problems </li></ul><ul><ul><li>Appropriate for developers at institutes, but not a good Grid solution in the long term </li></ul></ul>
  7. An Aside: Installation Method Evaluation <ul><li>Previous work from Lancaster, now a big effort from John Kennedy </li></ul><ul><li>cvsupd </li></ul><ul><ul><li>Problems with pserver/kserver; makefiles needed editing </li></ul></ul><ul><ul><li>Download took one night </li></ul></ul><ul><ul><li>Problems with CMT paths on the CERN side? </li></ul></ul><ul><ul><li>Problems fixing AFS paths into local paths – the previously developed script does not catch all cases </li></ul></ul><ul><li>NorduGrid rpms </li></ul><ul><ul><li>Work ‘after a fashion’: they do not mirror CERN, and much fixing by hand is needed </li></ul></ul><ul><ul><li>Much editing of makefiles to do anything real </li></ul></ul>
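For context, the cvsupd mechanism is driven by a client-side supfile; a minimal sketch, with a hypothetical host and collection name (the real ATLAS server and collection names may differ):

```
*default host=atlas-cvsup.cern.ch    # hypothetical server name
*default base=/opt/atlas             # local checkout area
*default release=cvs tag=.
*default delete use-rel-suffix compress
AtlasRelease                         # hypothetical collection name
```

The AFS-to-local path fix-up mentioned above happens after this download, which is where the script that "does not catch all cases" comes in.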
  8. An Aside: Installation Method Evaluation (II) <ul><li>rsync method </li></ul><ul><ul><li>Works reasonably at Lancaster </li></ul></ul><ul><ul><li>Hard to be selective – easy to end up with huge downloads taking more than a day in the first instance </li></ul></ul><ul><ul><li>Only hurts the first time – better for updates </li></ul></ul><ul><li>Official DC1 rpms </li></ul><ul><ul><li>Circular dependencies cause problems with zsh </li></ul></ul><ul><ul><li>Require root privilege for installation </li></ul></ul>
  9. Installation Tools <ul><li>To use the Grid, deployable software must be deployed on the Grid fabrics, and the deployable run-time environment established </li></ul><ul><li>Installable code and run-time environment/configuration </li></ul><ul><li>Both ATLAS and LHCb use CMT for software management and environment configuration </li></ul><ul><li>CMT knows the package interdependencies and external dependencies, so it is the obvious tool to prepare the deployable code and to ‘expose’ the dependencies to the deployment tool (MG testing) </li></ul><ul><li>A Grid-aware tool is needed to deploy the above </li></ul><ul><li>PACMAN is a candidate that seems fairly easy to interface with CMT; see the following talk </li></ul>
  10. CMT and deployable code <ul><li>Christian Arnault and Charles Loomis have a beta release of CMT that will produce package rpms, which is a large step along the way </li></ul><ul><ul><li>Still need to clean up site dependencies </li></ul></ul><ul><ul><li>Need to make the package dependencies explicit </li></ul></ul><ul><ul><li>rpm requires root to install into the system database (but not for a private installation) </li></ul></ul><ul><li>Developer and binary installations are being prepared; probably also need binary+headers+source for a single package </li></ul><ul><li>A converter making PACMAN cache files from the auto-generated rpms seems to work </li></ul>
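A cache-file entry produced by such a converter might look roughly like the following sketch; the directive names, URL and dependency are invented for illustration and the exact PACMAN syntax may differ by version:

```
description('ATLAS package binaries, auto-generated from a CMT rpm')
url('http://example.cern.ch/cache/AtlasCore.rpm')      # hypothetical location
package('External:CLHEP')                              # dependency exposed by CMT
shell('sh ./setup.sh')                                 # run-time environment setup
```

The key idea is that the dependencies CMT already knows are written out as further package lines, so the deployment tool can resolve them without human help.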
  11. The way forward? <ul><li>An ATLAS group exists for deployment tools (RWLJ convening) </li></ul><ul><li>LCG have ‘decided’ to use SCRAM </li></ul><ul><ul><li>Grounds for the decision seemed narrow, with little thought given to implications outside of LCG </li></ul></ul><ul><ul><li>If this is to be followed generally, we should reconsider strategy </li></ul></ul><ul><li>What about using SlashGrid to create a virtual file system? </li></ul>