CHEP 2013

Using ssh as portal
The CMS CRAB over glideinWMS experience
by

I Sfiligoi1, S Belforte2,
J Letts1, T Martin1, ...
Agenda

●
●

What we did?

●

CHEP 2013

How we got there?
How it worked out?

Using SSH as portal for CMS

2
CMS AnaOps
●

●

CMS is one of the 4 experiments at the
Large Hadron Collider
CMS Analysis Operations (AnaOps) is a
comput...
CRAB
CMS AnaOps submission tool
●

●

CMS AnaOps has been using CRAB
as the user interface to the Grid
since 2004
Initiall...
CRAB2 Server
●

CRAB2 Server introduced in 2007

●

Client does not need the Grid stack anymore

CHEP 2013

Using SSH as p...
glideinWMS
●

●

In 2009 CMS added glideinWMS to the list of
supported middleware
glideinWMS, based on HTCondor,
does not ...
glideinWMS and CRAB2 Server
A schematic view

HTCondor
Scheduler

glideinWMS

CHEP 2013

Using SSH as portal for CMS

7
Operational experience
●

CMS AnaOps experienced
a lot of operational issues with
CRAB2 Server submitting through glideinW...
A look at the CRAB2 Server
●

Designed to be very modular and flexible
–

●

But that made it
complex, too

CMS decided to...
A look at the CRAB2 Server
●

Designed to be very modular and flexible
–

●

But that made it
complex, too

CMS decided to...
What was wrong with CRAB2 Client?
●

The CRAB2 Client never went away
–

–
●

A significant fraction of CMS users kept usi...
Why not just use ssh?
●

The major stated benefit of CRAB2 Server was
“remote submission using a standard interface”
(ther...
The ssh-enabled CRAB2 Client
●

There are three stages in CRAB2 Client
–

Job submission
●

Includes uploading the input s...
The ssh-enabled CRAB2 Client
A schematic view

Client node

scp input_sandox

crab_submit

ssh condor_submit
ssh condor_q
...
Server side setup
●

gsisshd is a standard package
–

●

Using lcmaps for authorization
–

●

Just installed it from RPM
W...
Users liked it!
●

●

Took only 4 months for all users to completely
switch from Server to ssh-Client for glideinWMS
And a...
Smooth operations
●

We have not seen any major problems with
ssh-based CRAB2 client itself
–

●

Now almost all problems ...
Works at scale
Essentially ssh-only

Essentially ssh-only

Includes non-ssh jobs as well

CHEP 2013

Using SSH as portal f...
Hickups during the voyage
●

We occasionally hit a ssh quirk
–

–
●

In order to speed up ssh connections,
we re-use the s...
Remaining issue
●

The major remaining issue is the distribution of
the CRAB2 Client code
–

This still must be installed ...
Conclusions

●

●

CHEP 2013

Using (gsi)ssh as a portal
has proven to be very effective
By removing complexity out of
the...
Acknowledgments
●

●

The idea of using ssh came from the
RCondor package (see poster #322)
This work was partially sponso...
Upcoming SlideShare
Loading in …5
×

Using ssh as portal - The CMS CRAB over glideinWMS experience

350 views

Published on

The User Analysis of the CMS experiment is performed in distributed way using both Grid and dedicated resources. In order to insulate the users from the details of computing fabric, CMS relies on the CRAB (CMS Remote Analysis Builder) package as an abstraction layer. CMS has recently switched from a client-server version of CRAB to a purely client-based solution, with ssh being used to interface with either HTCondor or glideinWMS batch systems. This switch has resulted in significant improvement of user satisfaction, as well as in significant simplification of the CRAB code base. This presentation presents the reasoning behind the change as well as the new experience.

Presented at CHEP2013 in Amsterdam.

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
350
On SlideShare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
3
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Using ssh as portal - The CMS CRAB over glideinWMS experience

  1. 1. CHEP 2013 Using ssh as portal The CMS CRAB over glideinWMS experience by I Sfiligoi1, S Belforte2, J Letts1, T Martin1, M D Saiz Santos1 and F Fanzago3 2 University of California San Diego Università e INFN di Trieste 3 Università e INFN di Padova 1 CHEP 2013 Using SSH as portal for CMS 1
  2. 2. Agenda ● ● What we did? ● CHEP 2013 How we got there? How it worked out? Using SSH as portal for CMS 2
  3. 3. CMS AnaOps ● ● CMS is one of the 4 experiments at the Large Hadron Collider CMS Analysis Operations (AnaOps) is a computing task focused on the operational aspects of enabling physics data analysis – CHEP 2013 i.e. provides the computing infrastructure used by the actual physicists Using SSH as portal for CMS 3
  4. 4. CRAB CMS AnaOps submission tool ● ● CMS AnaOps has been using CRAB as the user interface to the Grid since 2004 Initially was a purely client interface – CRAB basically just a fancy wrapper – So the Client machine had to have the full Grid stack to function CHEP 2013 Using SSH as portal for CMS 4
  5. 5. CRAB2 Server ● CRAB2 Server introduced in 2007 ● Client does not need the Grid stack anymore CHEP 2013 Using SSH as portal for CMS 5
  6. 6. glideinWMS ● ● In 2009 CMS added glideinWMS to the list of supported middleware glideinWMS, based on HTCondor, does not support remote job submission – CHEP 2013 CRAB2 Server had to sit on the same node as (one of) the HTCondor scheduler(s) Using SSH as portal for CMS 6
  7. 7. glideinWMS and CRAB2 Server A schematic view HTCondor Scheduler glideinWMS CHEP 2013 Using SSH as portal for CMS 7
  8. 8. Operational experience ● CMS AnaOps experienced a lot of operational issues with CRAB2 Server submitting through glideinWMS – ● One major problem being the server losing track of job status Users were complaining a lot CHEP 2013 Using SSH as portal for CMS 8
  9. 9. A look at the CRAB2 Server ● Designed to be very modular and flexible – ● But that made it complex, too CMS decided to give up on fixing it – CHEP 2013 CRAB3 would simply replace it Using SSH as portal for CMS 9
  10. 10. A look at the CRAB2 Server ● Designed to be very modular and flexible – ● But that made it complex, too CMS decided to give up on fixing it – CRAB3 would simply replace it But by late 2012, CRAB3 still did not exist CHEP 2013 Using SSH as portal for CMS 10
  11. 11. What was wrong with CRAB2 Client? ● The CRAB2 Client never went away – – ● A significant fraction of CMS users kept using it after CRAB2 Server was put in production And they were, by and large, liking it It just could not be used to submit to glideinWMS – CHEP 2013 Due to lack of (native) remote submission to HTCondor Using SSH as portal for CMS 11
  12. 12. Why not just use ssh? ● The major stated benefit of CRAB2 Server was “remote submission using a standard interface” (there were of course other stated benefits, but they had very little impact when paired with glideinWMS) ● But what users normally use for remote access? – SSH ● So, what if we used ssh in CRAB2 client as well? – CHEP 2013 The gsissh dialect to support x509 proxies Using SSH as portal for CMS 12
  13. 13. The ssh-enabled CRAB2 Client ● There are three stages in CRAB2 Client – Job submission ● Includes uploading the input sandbox – – ● Job monitoring Output log fetching The Client uses ssh and/or scp to talk to a node running the HTCondor scheduler CHEP 2013 Using SSH as portal for CMS 13
  14. 14. The ssh-enabled CRAB2 Client A schematic view Client node scp input_sandox crab_submit ssh condor_submit ssh condor_q crab_monitor crab_fetch CHEP 2013 scp output_logs Using SSH as portal for CMS ssh HTCondor Scheduler glideinWMS 14
  15. 15. Server side setup ● gsisshd is a standard package – ● Using lcmaps for authorization – ● Just installed it from RPM With callbacks to GUMS in the USA, and Argus in Europe Each user gets a standard Linux account – CHEP 2013 No technical limits on what the user can run, but we would kill any offenders, if spotted Using SSH as portal for CMS 15
  16. 16. Users liked it! ● ● Took only 4 months for all users to completely switch from Server to ssh-Client for glideinWMS And another 4 months to get 90% of all work ssh-based Client CHEP 2013 Using SSH as portal for CMS 16
  17. 17. Smooth operations ● We have not seen any major problems with ssh-based CRAB2 client itself – ● Now almost all problems are in the Grid layer Maybe not too surprising – CRAB2 Client is really just a fancy wrapper – SSH is a very mature tool Operational load now at historical low in spite of increasing usage CHEP 2013 Using SSH as portal for CMS 17
  18. 18. Works at scale Essentially ssh-only Essentially ssh-only Includes non-ssh jobs as well CHEP 2013 Using SSH as portal for CMS Includes non-ssh jobs as well 18
  19. 19. Hickups during the voyage ● We occasionally hit a ssh quirk – – ● In order to speed up ssh connections, we re-use the ssh control session (-S) If it gets stuck, user needed to manually remove it CRAB2 Client now automatically detects and fixes the problem CHEP 2013 Using SSH as portal for CMS 19
  20. 20. Remaining issue ● The major remaining issue is the distribution of the CRAB2 Client code – This still must be installed on the client node – Any change must be pushed to ALL the users ● CHEP 2013 Making server-side changes challenging Using SSH as portal for CMS 20
  21. 21. Conclusions ● ● CHEP 2013 Using (gsi)ssh as a portal has proven to be very effective By removing complexity out of the code we have to maintain, life got much easier Using SSH as portal for CMS 21
  22. 22. Acknowledgments ● ● The idea of using ssh came from the RCondor package (see poster #322) This work was partially sponsored by the US National Science Foundation under Grants No. PHY-1148698 and PHY-1120138. CHEP 2013 Using SSH as portal for CMS 22

×