glideinWMS Frontend Installation - Part 1 - Condor Installation - glideinWMS Training Jan 2012

Instructions on how to install the Condor part of a glideinWMS Frontend system.
Part of the glideinWMS Training session held in Jan 2012 at UCSD.


Transcript

  • 1. glideinWMS Training @ UCSD: glideinWMS Frontend Installation, Part 1 – Condor Installation, by Igor Sfiligoi (UCSD). UCSD, Jan 17th 2012
  • 2. Overview
      ● Introduction
      ● Planning and Common setup
      ● Central Manager Installation
      ● Submit node Installation
  • 3. Refresher - Glideins
      ● A glidein is just a properly configured Condor execution node submitted as a Grid job
      ● glideinWMS provides automation
      [Diagram: Central manager (Collector, Negotiator); Submit nodes (Schedd); Execution nodes (Startd) started as glideins through CREAM and Globus; Job; glideinWMS]
  • 4. Refresher - Glideins
      ● The glideinWMS triggers glidein submission
      ● The “regular” negotiator matches jobs to glideins
      [Diagram: same layout as the previous slide]
  • 5. Bottom line: Condor is king! (glideinWMS is just a small layer on top)
  • 6. Condor installation
      ● Proper Condor installation and configuration is the most important task
          ● Condor will do most of the work
          ● … and is thus the most resource hungry
      ● GlideinWMS installation is almost an afterthought
          ● Although it does require proper security config of Condor
      ● GlideinWMS installation proper will be described in a separate talk
  • 7. Planning and Common setup
  • 8. Refresher - Condor
      ● Two main node types
          ● Submit node(s)
          ● Central manager
          ● (execute nodes are dynamic – glideins)
      ● Public TCP/IP networking needed
      ● GSI used for network security
      [Diagram: Central manager (Collector, Negotiator); Submit nodes (Schedd); glidein]
  • 9. Planning the setup
      ● In theory, all Condor daemons can be installed on a single node
      ● However, if at all possible, put Central Manager on a dedicated node
          ● i.e. do not use it as a submit node, too
          ● Both for security and stability reasons
      ● You may want/need more than one submit node
          ● Depends on expected use and available HW
          ● You do need at least one, though
  • 10. Common system considerations
      ● Condor is supported on a wide variety of platforms
          ● Including Linux (e.g. RHEL5), MacOS and Windows
          ● Linux recommended in OSG (and assumed in the rest of talk)
      ● GSI security requires
          ● Host or service certificate
          ● CAs & CRLs
              – Typically delivered via OSG RPMs (but other means acceptable)
                https://twiki.grid.iu.edu/bin/view/Documentation/Release3/InstallCertAuth
      ● Full Grid Client software recommended (for ease of ops)
        https://twiki.grid.iu.edu/bin/view/Documentation/Release3/InstallOSGClient
  • 11. OSG Grid Client
      ● Requires RHEL5-compatible Linux
          ● RHEL6 support promised for early 2012
      ● Procedure in a nutshell
          ● Add EPEL and OSG RPM repositories to sys conf.
          ● yum install osg-ca-certs
          ● yum install osg-client
          ● Enable CRL fetching crontab
      [Note: Other Grid clients (e.g. EGI/glite) will work just as well]
      https://twiki.grid.iu.edu/bin/view/Documentation/Release3/InstallOSGClient
  • 12. Requesting a host certificate
      ● OSG provides a script to talk to DOEGrids
        https://twiki.grid.iu.edu/bin/view/Documentation/Release3/GetHostServiceCertificates
      ● Procedure in a nutshell
          ● Install OSG client
          ● yum install osg-cert-scripts
          ● cert-request …
          ● Wait for email
          ● cert-retrieve …
          ● cp into /etc/grid-security/
      [Note: If you have other ways to obtain a host cert, feel free to use them]
  • 13. Condor Central Manager
  • 14. Refresher - Central Manager
      ● Two (groups of) processes
          ● Collector
          ● Negotiator
      ● The Collector defines the Condor pool
          ● Knows about all the glideins it owns
          ● Knows about all the schedds
      ● The Negotiator does the matchmaking
          ● Decides who gets what resources
      [Diagram: Central manager (Collector, Negotiator)]
  • 15. Condor Collector – considerations
      ● The Collector is the repository of all knowledge
          ● All other daemons report to it
          ● Including the glideins, which get its address at run-time
      ● Must process lots of info
          ● One update every 5 mins from each and every daemon
          ● With strong security → expensive
      ● Typically deployed as a tree of collectors
          ● All security handled in the leaves
          ● Top one still has the complete picture
      [Diagram: Central manager with Negotiator, a top Collector and several leaf Collectors; glideins report to the leaf Collectors]
  • 16. CCB – An additional cost
      ● The Condor collectors are also acting as CCBs
          ● Each glidein will open 5+ long-lived TCP sockets
      ● Make sure you have enough file descriptors
          ● Default OS limit is 1024 per process
      ● Plan on having one CCB per 100 glideins
      [Note: the CCBs are the leaves in the tree of collectors]
      [Diagram labels: CCB; “I want to connect to the execute node”; “Call me back”; “transfer files”]
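The planning figures on this slide can be turned into a quick sizing check. The following is only a sketch: it assumes exactly 5 sockets per glidein (the slide says "5+", so the result is a lower bound) and that glideins spread evenly over the CCBs. The function names are hypothetical and not part of Condor or glideinWMS.

```python
import math

SOCKETS_PER_GLIDEIN = 5   # slide says "5+"; treated here as a lower bound
GLIDEINS_PER_CCB = 100    # planning figure from the slide
DEFAULT_FD_LIMIT = 1024   # default per-process OS limit quoted on the slide


def ccbs_needed(glideins: int) -> int:
    """Number of CCB (leaf collector) processes to plan for."""
    if glideins <= 0:
        return 0
    return math.ceil(glideins / GLIDEINS_PER_CCB)


def sockets_per_ccb(glideins: int) -> int:
    """Lower bound on long-lived sockets held by the busiest CCB,
    assuming glideins are spread evenly (an assumption)."""
    ccbs = ccbs_needed(glideins)
    if ccbs == 0:
        return 0
    return math.ceil(glideins / ccbs) * SOCKETS_PER_GLIDEIN


if __name__ == "__main__":
    for n in (500, 5000, 20000):
        s = sockets_per_ccb(n)
        print(n, ccbs_needed(n), s, "fits default limit:", s < DEFAULT_FD_LIMIT)
```

With one CCB per 100 glideins the lower bound is about 500 sockets per process, which leaves roughly half of the default 1024 descriptors as headroom for the "+" in "5+".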
  • 17. High availability (theory)
      ● Central manager can be a single point of failure
          ● If it dies, the Condor pool dies with it!
      ● To avoid this, one can deploy multiple CMs
          ● All daemons will advertise to 2 (or more) Collectors
          ● All CMs will have the same view of the world
      ● There can only be one Negotiator, though
          ● One negotiator will be Active, all others in standby
      ● More details in the Condor manual
        http://www.cs.wisc.edu/condor/manual/v7.6/3_11High_Availability.html#SECTION004112000000000000000
      [Note: Currently not supported by glideinWMS]
  • 18. Hardware needs
      ● Tree of collectors spreads the load over multiple processes
          ● So several CPUs come in handy
      ● Negotiator single threaded
          ● Will benefit from fast CPU
      ● Memory usage not terrible
          ● O(100k) per glidein to store ClassAds
          ● Concrete CMS example: 25k glideins ~ 6G memory
      ● Negligible disk IO
      [Note: Exact memory footprint depends on how many additional attributes the VO defines]
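To relate the CMS data point to the O(100k) rule of thumb, one can back out the implied per-glidein footprint. This sketch assumes "6G" means 6×10^9 bytes and that memory scales linearly with the number of glideins; both are assumptions, and the function names are hypothetical.

```python
def implied_bytes_per_glidein(total_bytes: float, glideins: int) -> float:
    """Average memory per glidein implied by an observed total."""
    return total_bytes / glideins


def collector_memory_bytes(glideins: int, bytes_per_glidein: float) -> float:
    """Linear extrapolation of Central Manager memory (an assumption)."""
    return glideins * bytes_per_glidein


if __name__ == "__main__":
    per_glidein = implied_bytes_per_glidein(6e9, 25_000)
    print(f"implied footprint: {per_glidein / 1e3:.0f} kB per glidein")
    print(f"50k glideins: {collector_memory_bytes(50_000, per_glidein) / 1e9:.0f} GB")
```

The implied figure is a few hundred kB per glidein, the same order of magnitude as the O(100k) estimate once extra VO attributes are counted.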
  • 19. System considerations
      ● Does not need to run as root (although it can)
          ● Make sure the host cert is readable by that user
          [Note: running as non-root minimizes risk due to Condor bugs]
      ● Must be on the public IP network
          ● Each collector listens on its own well defined port, must be reachable by all glideins (WAN)
          ● Negotiator has a dynamic listen port, must be reachable by submit nodes (schedds)
          [Note: Must open firewall, at least for these]
      ● Will use a large number of network sockets
          ● Will overwhelm most firewalls
          ● Consider disabling stateful firewalls (e.g. iptables)
  • 20. Security considerations
      ● Cannot be firewalled → endpoint security
          ● GSI security used (i.e. x509 certs) for networking
          ● Limit administrative rights to local users (FS auth)
      ● The Collector is the central trust point of the pool
          ● The DNs of all other daemons are whitelisted here, including:
              – Schedds
              – Glideins (i.e. pilot proxies)
              – Clients (e.g. glideinWMS Frontend)
  • 21. Installing the CM
      ● Two major burdens (for basic install)
          ● Collector tree
          ● Security setup
      ● The glideinWMS installer helps with both
          ● Starting from Condor tarball
          ● As any user (e.g. as non-root)
          ● Highly recommended
          [Note: Easy-to-use update cmdline tool available, too]
      ● RPM install also an option
          ● Easy to keep up-to-date (i.e. yum update)
          ● But you will need to configure by hand
          ● And will run as root
          [Note: Unless you hack the startup script]
  • 22. Collector tree setup
      ● In a nutshell
          ● For each secondary collector:
              – Tell Master to start a collector on different port
              – repeat
          ● Forward ClassAds to main Collector
        ...
        # xN: repeat this group for each secondary collector
        COLLECTORXXX = $(COLLECTOR)
        COLLECTORXXX_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/CollectorXXXLog"
        COLLECTORXXX_ARGS = -f -p YYYY
        DAEMON_LIST = $(DAEMON_LIST) COLLECTORXXX
        …
        # forward ads to the main collector
        # (this is ignored by the main collector, since the address matches itself)
        CONDOR_VIEW_HOST = $(COLLECTOR_HOST)
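Writing the per-collector group by hand gets tedious for a large tree, so here is a sketch of how the repeated block could be generated. The slide leaves the collector suffix (XXX) and port (YYYY) as placeholders; the numbering scheme and the starting port used below are assumptions for illustration, and this is not the code the glideinWMS installer uses.

```python
def secondary_collector_config(count: int, first_port: int) -> str:
    """Render the repeated config group for `count` secondary collectors.

    The names COLLECTOR0, COLLECTOR1, ... and consecutive ports are
    hypothetical; the slide only shows XXX and YYYY placeholders.
    """
    lines = []
    for i in range(count):
        name = f"COLLECTOR{i}"
        lines += [
            f"{name} = $(COLLECTOR)",
            f'{name}_ENVIRONMENT = "_CONDOR_COLLECTOR_LOG=$(LOG)/Collector{i}Log"',
            f"{name}_ARGS = -f -p {first_port + i}",
            f"DAEMON_LIST = $(DAEMON_LIST) {name}",
        ]
    lines += [
        "# forward ads to the main collector",
        "CONDOR_VIEW_HOST = $(COLLECTOR_HOST)",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 9620 is an arbitrary example port, not a Condor default.
    print(secondary_collector_config(count=3, first_port=9620), end="")
```

Each secondary collector gets its own log file and port, and every one of them forwards its ClassAds to the main collector through the single CONDOR_VIEW_HOST line.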
  • 23. Security setup (1)
      ● In a nutshell
          ● Configure basic GSI (i.e. point to CAs and host cert)
          ● Set up authorization (i.e. switch to whitelist)
          ● Whitelist all DNs
          ● Enable GSI
      ● DN whitelisting a bit annoying
          ● Must be done in two places
              – in condor_config, and
              – in condor_mapfile (and it is a regexp here!)
          ● glideinWMS provides a cmdline tool
  • 24. Security setup (2)
        # condor_config.local
        # Configure GSI
        CERTIFICATE_MAPFILE=/home/condor/glidecondor/certs/condor_mapfile
        GSI_DAEMON_TRUSTED_CA_DIR=/etc/grid-security/certificates
        GSI_DAEMON_CERT = /home/condor/.globus/hostcert.pem
        GSI_DAEMON_KEY = /home/condor/.globus/hostkey.pem
        # Force whitelisting
        DENY_WRITE = anonymous@*
        DENY_ADMINISTRATOR = anonymous@*
        DENY_DAEMON = anonymous@*
        DENY_NEGOTIATOR = anonymous@*
        DENY_CLIENT = anonymous@*
        ALLOW_ADMINISTRATOR = $(CONDOR_HOST)
        ALLOW_WRITE = *
        # use only pilot DN, not FQAN
        USE_VOMS_ATTRIBUTES = False
        # list all DNs
        ...
        # xN: repeat for each DN
        GSI_DAEMON_NAME=$(GSI_DAEMON_NAME),DNXXX
        ...
        # enable GSI (FS also enables local auth)
        SEC_DEFAULT_AUTHENTICATION_METHODS = FS,GSI
        SEC_DEFAULT_AUTHENTICATION = REQUIRED
        SEC_DEFAULT_ENCRYPTION = OPTIONAL
        SEC_DEFAULT_INTEGRITY = REQUIRED
        # optionally, relax client and read settings

        # condor_mapfile
        ...
        # xN: repeat for each DN
        GSI "^DNXXX$" UIDXXX
        ...
        GSI (.*) anonymous
        FS (.*) \1
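Since the DN in condor_mapfile is a regular expression, it helps to see how such a file resolves identities. The sketch below imitates the lookup in Python, assuming rules are tried top to bottom and the first match wins, and using Python's `re` as a stand-in for Condor's regexp engine. The DN and user names are made up for illustration.

```python
import re
from typing import List, Optional, Tuple

Rule = Tuple[str, str, str]  # (auth method, regexp, mapped name)

# Hypothetical mapfile content mirroring the layout on the slide.
MAPFILE: List[Rule] = [
    ("GSI", r"^/DC=org/DC=example/OU=Services/CN=schedd\.example\.org$", "schedd_a"),
    ("GSI", r"(.*)", "anonymous"),  # any other certificate is anonymous
    ("FS", r"(.*)", r"\1"),         # local users keep their own name
]


def map_identity(method: str, name: str, rules: List[Rule] = MAPFILE) -> Optional[str]:
    """Return the mapped name for the first matching rule, or None."""
    for rule_method, pattern, target in rules:
        if rule_method != method:
            continue
        match = re.search(pattern, name)
        if match:
            return match.expand(target)  # substitutes \1 with the captured group
    return None


if __name__ == "__main__":
    print(map_identity("GSI", "/DC=org/DC=example/OU=Services/CN=schedd.example.org"))
    print(map_identity("GSI", "/DC=org/DC=example/CN=Some User"))
    print(map_identity("FS", "condor"))
```

The anchors `^` and `$` keep a whitelisted DN from matching as a substring of a longer one, and the catch-all line maps everything else to `anonymous`, which the `DENY_*` settings in condor_config.local then reject.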
  • 25. Installing with Q&A installer
        ~/glideinWMS/install$ ./glideinWMS_install
        ...
        Please select: 4
        [4] User Pool Collector
        ...
        Where do you have the Condor tarball? /home/condor/Downloads/condor-7.6.4-x86_rhap_5-stripped.tar.gz
        Where do you want to install it?: [/home/condor/glidecondor] /home/condor/glidecondor
        If something goes wrong with Condor, who should get email about it?: me@myemail
        Do you want to split the config files between condor_config and condor_config.local?: (y/n) [y] y
        ...
        Do you want to get it from VDT?: (y/n) y
        Do you have already a VDT installation?: (y/n) y
        Where is the VDT installed?: /etc/osg/wn-client
        ...
        Will you be using a proxy or a cert? (proxy/cert) cert
        Where is your certificate located?: /home/condor/.globus/hostcert.pem
        Where is your certificate key located?: /home/condor/.globus/hostkey.pem
        My DN = DN1
        ...
        DN: DNXXX
        nickname: [condor001] uidXXX
        Is this a trusted Condor daemon?: (y/n) y
        ...
        DN:
        How many slave collectors do you want?: [5] 200
        What name would you like to use for this pool?: [My pool] MyVO
        What port should the collector be running?: [9618] 9618
      [Note: the DN / nickname / trusted-daemon questions repeat xN, once per DN. You can also add the DNs as an independent step]
  • 26. Maintenance
      ● If you need to add more DNs, use
          ● cmdline tool glidecondor_addDN
        ~/glideinWMS/install$ ./glidecondor_addDN -daemon "DN of Schedd A" "DNA" UIDA
        Configuration files changed.
        Remember to reconfig the affected Condor daemons.
      ● To upgrade the Condor binaries, use
          ● cmdline tool glidecondor_upgrade
        ~/glideinWMS/install$ ./glidecondor_upgrade ~/Downloads/condor-7.6.5-x86_rhap_5-stripped.tar.gz
        Will update Condor in /home/condor/glidecondor
        ..
        Creating backup dir
        Putting new binaries in place
        Finished successfully
        Old binaries can be found in /home/condor/glidecondor/old.120102_13
  • 27. Starting Condor
      ● The installer will start Condor for you, but you still should know how to stop and start it by hand
      ● To start Condor, run: ~/glidecondor/start_condor.sh
      ● To stop Condor, use: condor_off -daemon master
      ● Finally, to force Condor to re-read the config: ~/glidecondor/sbin/condor_reconfig
  • 28. Condor Submit node(s)
  • 29. Refresher - Submit node(s)
      ● Submit node defined by the schedd
          ● Which holds user jobs
      ● Shadows will be started as the jobs are matched to glideins
          ● One per running job
      ● At least one submit node is needed
          ● But there may be many
      [Diagram: Submit node with a Schedd and several Shadows]
  • 30. Network use
      ● Glideins must contact the submit node in order to run jobs
          ● Both with standard protocol and CCB
      ● Each shadow normally uses 2 random ports
          ● Not firewall friendly
          ● Can be a problem over O(10k) jobs
          [Note: Although firewalls can get overwhelmed anyhow (see CM slides)]
      ● Newer versions of Condor support “shared port daemon”
          ● Listens on a single port
          ● Forwards the sockets to the appropriate local process
          [Note: Does not reduce number of sockets]
  • 31. Security considerations
      ● Like with CM, must use endpoint security
      ● Schedd and CM must whitelist each other
          ● Certificate DN based
      ● AuthZ with glideins indirect
          ● No need to whitelist glidein DN(s)
          ● Collector trusts glidein, Schedd trusts Collector
          http://research.cs.wisc.edu/condor/manual/v7.6/3_3Configuration.html#param:SecEnableMatchPasswordAuthentication
      ● Schedd also must whitelist any clients (e.g. VO Frontend)
      ● Local users
          ● Only startds can use FS auth (i.e. UID based)
      [Diagram: Central manager (Collector, Negotiator); Submit node (Schedd); glidein; arrow labeled “indirect AuthZ”]
  • 32. Hardware needs
      ● Submit node is memory hungry
          ● 1M per running job due to shadows
          ● O(10k) per job in queue for ClassAds
          [Note: Actual need depends on how many additional VO attributes are used]
      ● Schedd can use a fast CPU (single threaded)
          ● Shadows are very light CPU users
      ● Jobs may put substantial IO load on HDD
          ● Depends on how much data is being produced
          ● Depends on how short the jobs are
      ● And the above is just for Condor
          ● VO may have portal software
          ● or actual interactive users
          [Note: Make sure the remaining HW is adequate for these]
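The two memory figures on this slide combine into a simple estimate. This sketch takes "1M" as 1 MiB and "O(10k)" as exactly 10 KiB, and counts running jobs in the queue total as well; all three are assumptions, and real usage depends on the VO's extra attributes.

```python
KIB = 1024
MIB = 1024 * KIB


def submit_node_memory_bytes(
    running_jobs: int,
    jobs_in_queue: int,
    per_shadow: int = 1 * MIB,       # "1M per running job" (one shadow each)
    per_queued_job: int = 10 * KIB,  # "O(10k) per job in queue"; exact value assumed
) -> int:
    """Rough Condor-only memory estimate for a submit node."""
    return running_jobs * per_shadow + jobs_in_queue * per_queued_job


if __name__ == "__main__":
    total = submit_node_memory_bytes(running_jobs=10_000, jobs_in_queue=100_000)
    print(f"{total / 2**30:.1f} GiB")
```

With 10k running jobs the shadows dominate, at roughly ten times the memory needed for the ClassAds of a 100k-job queue. Portal software and interactive users come on top of this.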
  • 33. User account considerations
      ● Users must be able to launch condor_submit locally on the submit node
          ● Remote submission not recommended (and disabled by default)
      ● VO must decide how to do it
          ● SSHd (i.e. interactive use)
          ● Portal (e.g. CMS CRABServer)
          [Note: Still local from the Condor point of view]
      ● Will need one UID per user
          ● Non-UID based auth possible, but not recommended (and not supported out of the box)
          [Note: No need to create user accounts before installing Condor, but do plan for it]
  • 34. Schedd is a superuser
      ● Schedd must run as root (euid==0, even as it drops ruid to “condor”)
          ● So it can switch UID as needed
          ● To access user files
      ● Same for shadows (but ruid set to job user)
      ● Host cert thus must be owned by root
  • 35. Installing the submit node
      ● Two major burdens (for basic install)
          ● Shared port daemon
          ● Security setup
      ● The glideinWMS installer helps with both
          ● Starting from Condor tarball
          ● Should be run as root
          ● Highly recommended
          [Note: Easy-to-use update cmdline tool available, too]
      ● RPM install also an option
          ● Easy to keep up-to-date (i.e. yum update)
          ● But you will need to configure by hand
  • 36. Shared port daemon
      ● Not enabled by default in Condor
      ● In a nutshell
          ● Pick a port for it
          ● Enable it
          ● Add it to the list of Daemons to start
        # condor_config.local
        # Enable shared_port_daemon
        SHARED_PORT_ARGS = -p 9615
        USE_SHARED_PORT = True
        DAEMON_LIST = $(DAEMON_LIST) SHARED_PORT
  • 37. Security setup (1)
      ● In a nutshell
          ● Configure basic GSI (i.e. point to CAs and host cert)
          ● Enable match authentication
          ● Set up authorization (i.e. switch to whitelist)
          ● Whitelist all DNs
          ● Enable GSI
      ● DN whitelisting a bit annoying
          ● Must be done in two places
              – in condor_config, and
              – in condor_mapfile (and it is a regexp here!)
          ● glideinWMS provides a cmdline tool
  • 38. Security setup (2)
        # condor_config.local
        # Configure GSI
        CERTIFICATE_MAPFILE=/opt/glidecondor/certs/condor_mapfile
        GSI_DAEMON_TRUSTED_CA_DIR=/etc/grid-security/certificates
        GSI_DAEMON_CERT = /etc/grid-security/hostcert.pem
        GSI_DAEMON_KEY = /etc/grid-security/hostkey.pem
        # Enable match authentication
        SEC_ENABLE_MATCH_PASSWORD_AUTHENTICATION = TRUE
        # Force whitelisting
        DENY_WRITE = anonymous@*
        … # see CM slides for details
        # list all DNs
        ...
        # xN: repeat for each DN
        GSI_DAEMON_NAME=$(GSI_DAEMON_NAME),DNXXX
        ...
        # enable GSI (FS also enables local auth)
        SEC_DEFAULT_AUTHENTICATION_METHODS = FS,GSI
        SEC_DEFAULT_AUTHENTICATION = REQUIRED
        SEC_DEFAULT_ENCRYPTION = OPTIONAL
        SEC_DEFAULT_INTEGRITY = REQUIRED
        # optionally, relax client and read settings

        # condor_mapfile
        ...
        # xN: repeat for each DN
        GSI "^DNXXX$" UIDXXX
        ...
        GSI (.*) anonymous
        FS (.*) \1
  • 39. Network optimization settings
      ● Since glideins are often behind firewalls
          ● The glidein Startd setup is optimized to avoid incoming connections and UDP
      ● The Schedd must also play along
        # condor_config.local
        # Reverse protocol direction
        STARTD_SENDS_ALIVES = True
        # Avoid UDP
        SCHEDD_SEND_VACATE_VIA_TCP = True
  • 40. Installing with Q&A installer
        ~/glideinWMS/install$ ./glideinWMS_install
        ...
        Please select: 5
        [5] User Schedd
        …
        Which user should Condor run under?: [condor] condor
        Where do you have the Condor tarball? /root/condor-7.6.4-x86_rhap_5-stripped.tar.gz
        Where do you want to install it?: [/home/condor/glidecondor] /opt/glidecondor
        If something goes wrong with Condor, who should get email about it?: me@myemail
        Do you want to split the config files between condor_config and condor_config.local?: (y/n) [y] y
        ...
        Do you want to get it from VDT?: (y/n) y
        Do you have already a VDT installation?: (y/n) y
        Where is the VDT installed?: /etc/osg/wn-client
        Will you be using a proxy or a cert? (proxy/cert) cert
        Where is your certificate located?: /etc/grid-security/hostcert.pem
        Where is your certificate key located?: /etc/grid-security/hostkey.pem
        My DN = DN1
        ...
        DN: DNXXX
        nickname: [condor001] uidXXX
        Is this a trusted Condor daemon?: (y/n) y
        ...
        DN:
        What node is the collector running (i.e. CONDOR_HOST)?: collectornode.mydomain
        Do you want to enable the shared_port_daemon?: (y/n) y
        What port should it use?: [9615] 9615
        How many secondary schedds do you want?: [9] 0
      [Note: the DN / nickname / trusted-daemon questions repeat xN, once per DN. You can also add the DNs as an independent step]
  • 41. Maintenance
      ● If you need to add more DNs, use
          ● cmdline tool glidecondor_addDN
        ~/glideinWMS/install$ ./glidecondor_addDN -daemon "DN of Schedd A" "DNA" UIDA
        Configuration files changed.
        Remember to reconfig the affected Condor daemons.
      [Note: Do not use -daemon for client DNs]
      ● To upgrade the Condor binaries, use
          ● cmdline tool glidecondor_upgrade
        ~/glideinWMS/install$ ./glidecondor_upgrade ~/Downloads/condor-7.6.5-x86_rhap_5-stripped.tar.gz
        Will update Condor in /home/condor/glidecondor
        ..
        Creating backup dir
        Putting new binaries in place
        Finished successfully
        Old binaries can be found in /home/condor/glidecondor/old.120102_13
  • 42. Starting Condor
      ● The installer will start Condor for you, but you still should know how to stop and start it by hand
      ● The installer has created an init.d script for you: /etc/init.d/condor start|stop
      ● To force Condor to reload its config, still use: /opt/glidecondor/sbin/condor_reconfig
      [Note: All as root]
  • 43. Fine tuning
  • 44. Fine tuning
      ● The previous slides provide only basic setup
          ● Although the glideinWMS does some basic tuning
      ● You will likely want to tune the system further
          ● Proper limits in the submit node
          ● Default job attributes
          ● Sanity checks
          ● Priority tuning
      ● Not part of this talk
          ● Will go into details tomorrow
  • 45. Integration with OSG Accounting
  • 46. OSG Accounting
      ● OSG tries to keep accurate accounting information of who used what resources
          ● Using GRATIA
            https://twiki.grid.iu.edu/twiki/bin/view/Accounting/WebHome
            http://gratia-osg-prod-reports.opensciencegrid.org/gratia-reporting/
  • 47. Per-user accounting
      ● OSG has per-user accounting, too
      ● With glideins, this level of detail is lost
          ● Only pilot proxy seen by OSG (sites)
  • 48. The glidein GRATIA probe
      ● OSG thus asks glidein operators to install a dedicated probe alongside the glidein schedd(s)
          ● Which will provide per-user accounting info to the OSG GRATIA server
          ● Optimized for use with OSG glidein factory
            https://twiki.grid.iu.edu/bin/view/Accounting/ProbeConfigGlideinWMS
      [Diagram: Submit node (Schedd, GRATIA Probe) reporting to the OSG GRATIA Server]
  • 49. Installing the GRATIA probe
      ● In a nutshell
          ● Register submit node with GOC
          ● Tweak condor config
          ● yum install gratia-probe-condor
          ● Configure GRATIA
            https://twiki.grid.iu.edu/bin/view/Accounting/ProbeConfigGlideinWMS
  • 50. Condor changes for GRATIA
      ● GRATIA gets information from history logs
          ● Requires one file per terminated job for efficiency
      ● GRATIA needs to know where the job ran
          ● Additional attribute added to the job ClassAd (more general details on this tomorrow)
        # condor_config.local
        PER_JOB_HISTORY_DIR = /var/lib/gratia/data
        JOBGLIDEIN_ResourceName = "$$([IfThenElse(IsUndefined(TARGET.GLIDEIN_ResourceName), IfThenElse(IsUndefined(TARGET.GLIDEIN_Site), FileSystemDomain, TARGET.GLIDEIN_Site), TARGET.GLIDEIN_ResourceName)])"
        SUBMIT_EXPRS = $(SUBMIT_EXPRS) JOBGLIDEIN_ResourceName
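The nested IfThenElse in JOBGLIDEIN_ResourceName is a two-step fallback, which is easier to read when spelled out. The sketch below mirrors that logic in Python; it models the matched glidein's ClassAd (TARGET) as a plain dict and passes FileSystemDomain in as an ordinary argument, both simplifications for illustration rather than how Condor evaluates the expression.

```python
from typing import Mapping


def job_glidein_resource_name(target: Mapping[str, str], file_system_domain: str) -> str:
    """Mirror of the nested IfThenElse: prefer GLIDEIN_ResourceName,
    then GLIDEIN_Site, then fall back to FileSystemDomain."""
    resource = target.get("GLIDEIN_ResourceName")
    if resource is not None:   # IsUndefined(...) is false
        return resource
    site = target.get("GLIDEIN_Site")
    if site is not None:
        return site
    return file_system_domain


if __name__ == "__main__":
    # Attribute values below are made up for the example.
    print(job_glidein_resource_name({"GLIDEIN_ResourceName": "RES_A", "GLIDEIN_Site": "SITE_A"}, "example.org"))
    print(job_glidein_resource_name({"GLIDEIN_Site": "SITE_A"}, "example.org"))
    print(job_glidein_resource_name({}, "example.org"))
```

In other words, the job records the most specific location name the glidein advertises, and only falls back to the domain name when the glidein advertises neither attribute.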
  • 51. GRATIA configuration
      ● Essentially just tell GRATIA what name you have registered with GOC
      ● Then enable it
        # /etc/gratia/condor/ProbeConfig
        SiteName="VOX_glidein_node1"
        EnableProbe="1"
        # add this line to allow user jobs without a proxy
        MapUnknownToGroup="1"
      ● You also need to tell it where to find Condor
        # /root/setup.sh
        source /etc/profile.d/condor.sh
  • 52. The End
  • 53. Pointers
      ● The official glideinWMS project Web page is http://tinyurl.com/glideinWMS
      ● glideinWMS development team is reachable at glideinwms-support@fnal.gov
      ● Condor Home Page: http://www.cs.wisc.edu/condor/
      ● Condor support: condor-user@cs.wisc.edu, condor-admin@cs.wisc.edu
  • 54. Acknowledgments
      ● The glideinWMS is a CMS-led project developed mostly at FNAL, with contributions from UCSD and ISI
      ● The glideinWMS factory operations at UCSD is sponsored by OSG
      ● The funding comes from NSF, DOE and the UC system
