Transcript

  • 1. Automating Distributed Software Testing (Steven Newhouse, Deputy Director, OMII)
  • 2. OMII Activity
    • Integrating software from multiple sources
      • Established open-source projects
      • Commissioned services & infrastructure
    • Deployment across multiple platforms
      • Currently:
        • SUSE 9.0 (server & client), WinXP (client)
      • Future:
        • WinXP (server), RHEL
    • Verify interoperability between platforms & versions
  • 3. Distributed Software Testing
    • Automated software testing is vital for the Grid
      • Build Testing – Cross platform builds
      • Unit Testing – Local Verification of APIs
      • Deployment Testing – Deploy & run package
      • Distributed Testing – Cross domain operation
      • Regression Testing – Compatibility between versions
      • Stress Testing – Correct operation under real loads
    • Distributed Testbed
      • Need breadth & variety of resources, not raw power
      • Needs to be a managed resource, with a process behind it
  • 4. Contents
    • Experience from ICENI
    • Recent ETF activity
    • NMI build system
    • What next?
  • 5. In another time in another place… (thanks to Nathalie Furmento)
    • ICENI
      • Daily builds from various CVS tags
      • On a successful build deploy the software
      • Run tests against the deployed software
    • Experience
      • Validate the state of the checked in code
      • Validate that the software still works!
      • On reflection… probably needed more discipline in the development team & even more testing!
  • 6. Therefore several issues…
    • Representative resources to build & deploy software
    • Software to specify & manage the builds
    • Automated distributed co-ordinated tests
    • Reporting and notification process
  • 7. Secure Flocked Condor Pool
    • Activity within the UK Engineering Task Force
    • Collaboration between:
      • Steven Newhouse - OMII (Southampton)
      • John Kewley - CCLRC Daresbury Laboratory
      • Anthony Stell - Glasgow
      • Mark Hayes - Cambridge
      • Andrew Carson - Belfast e-Science Centre
      • Mark Hewitt - Newcastle
  • 8. Stage 1: Flocked Condor Pools
    • Configure flocking between pools (a config sketch follows this slide):
      • Set FLOCK_TO & FLOCK_FROM
      • Set HOSTALLOW_READ & HOSTALLOW_WRITE
    • Firewalls:
      • A reality for most institutions
      • Configure outgoing & incoming firewall traffic
      • Set LOWPORT & HIGHPORT
      • Experiences
        • http://www.doc.ic.ac.uk/~marko/ETF/flocking.html
        • http://www.escience.cam.ac.uk/projects/camgrid/documentation.html
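Pulling the macros above together, a minimal condor_config sketch for one pool in the flock (the remote site name is a made-up placeholder, and the port range must match what the site firewall actually opens):

  # condor_config excerpt (illustrative; example-site-b is hypothetical)
  # Pools we may send jobs to, and pools we accept flocked jobs from
  FLOCK_TO   = condor.example-site-b.ac.uk
  FLOCK_FROM = condor.example-site-b.ac.uk

  # Allow the remote pool to query our state and submit work to us
  HOSTALLOW_READ  = $(HOSTALLOW_READ), *.example-site-b.ac.uk
  HOSTALLOW_WRITE = $(HOSTALLOW_WRITE), *.example-site-b.ac.uk

  # Confine Condor's dynamically assigned ports to the range
  # the firewall permits for incoming & outgoing traffic
  LOWPORT  = 9600
  HIGHPORT = 9700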
  • 9. Issues
    • Good News
      • Well documented & mature code
    • Bad News
      • Firewalls
        • Need to open large port range to many hosts
        • Depending on your site policy this may be a problem!
      • Access Policy
        • Need access control mechanisms
      • Scalability
  • 10. Flocking & firewalls (diagram: firewall ports between an Execution Node and a Manager Node)
  • 11. Upcoming Solution: Condor-C
    • Condor:
      • Submit a job which is managed by the schedd
      • Schedd discovers a startd through matchmaking and starts job on remote resource
    • Condor-G:
      • Submit a job which is managed by the schedd
      • Schedd launches job through gatekeeper on remote Globus enabled resource
    • Condor-C:
      • Submit a job which is managed by the schedd
      • Schedd sends job to a schedd on remote Condor pool
    • This is good because:
      • The submission machine needs no direct route to the startd, just to the remote schedd (see the submit sketch below).
    • http://www.opensciencegrid.org/events/meetings/boston0904/docs/vdt-roy.ppt
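A sketch of a Condor-C submit description, assuming the grid-universe syntax from the Condor manual (the remote schedd and central manager names are hypothetical):

  # Condor-C: the local schedd forwards this job to a remote schedd
  universe      = grid
  grid_resource = condor schedd.remote-site.example.ac.uk cm.remote-site.example.ac.uk
  executable    = run_tests.sh
  output        = tests.out
  error         = tests.err
  log           = tests.log
  should_transfer_files   = YES
  when_to_transfer_output = ON_EXIT
  queue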
  • 12. Stage 2: Configuring Security
    • Use ‘standard’ Grid authentication
      • X.509 certificates & GSI proxy certificates
    • Condor Configuration
      • Require authentication by using the local filesystem or GSI
        • SEC_DEFAULT_AUTHENTICATION = REQUIRED
        • SEC_DEFAULT_AUTHENTICATION_METHODS = GSI, FS
      • Point to the location of the certificate directory (authentication)
        • GSI_DAEMON_DIRECTORY = /etc/grid-security
      • Point to the location of the gridmap file (authorisation)
        • GRIDMAP = /etc/grid-security/grid-mapfile.condor
  • 13. Stage 3: Authorising Access
    • List trusted masters (possibly all hosts?)
      • All entries on one line
      • UK CA requires DN in two forms:
        • Email & emailAddress
    • GSI_DAEMON_NAME = /C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/emailAddress=s.newhouse@omii.ac.uk, /C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/Email=s.newhouse@omii.ac.uk,
    • … OTHER HOST DNs
  • 14. Stage 4: Controlling Access
    • Gridmap file has the same layout as in GT (a usage sketch follows this slide)
      • "DN" USER@DOMAIN
    • "/C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/emailAddress=s.newhouse@omii.ac.uk" host@polaris.ecs.soton.ac.uk "/C=UK/O=eScience/OU=Southampton/L=SeSC/CN=polaris.ecs.soton.ac.uk/Email=s.newhouse@omii.ac.uk" host@polaris.ecs.soton.ac.uk "/C=UK/O=eScience/OU=Imperial/L=LeSC/CN=steven newhouse" snewhouse@polaris.ecs.soton.ac.uk
  • 15. Issues
    • Good News
      • Authorised access to flocked Condor pools
      • Provides know-how for a UK-wide pool (NGS?)
    • Bad News
      • Documentation (will provide feedback)
      • Implemented through a lot of trial & error
    • Activity HowTo
      • http://wiki.nesc.ac.uk/read/sfct
  • 16. Exploiting the distributed pool
    • Simple build portal
      • Upload software, select resources, download binaries
  • 17. Build Management
    • We want to build a package
      • May have installed dependencies (e.g. compilers)
      • May have build dependencies (e.g. openssl)
    • Package may require patching
      • Take existing source code package
      • Patch before building
    • Building on multiple platforms
      • Move source to the specified platform
      • Build, package and return binaries to host
    • Distributed Inter-dependent Tasks (a submit file sketch follows)
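One such per-platform build task might look like the following Condor submit file; this is a sketch only, with hypothetical script, tarball, and patch names:

  # build.sub: build one source package on one matched platform
  universe     = vanilla
  executable   = do_build.sh            # unpack, patch, build, package
  arguments    = omii-src.tar.gz
  # Steer the job to the platform the build requires
  requirements = (OpSys == "LINUX") && (Arch == "INTEL")
  # Ship source & patch out; packaged binaries return on exit
  should_transfer_files   = YES
  transfer_input_files    = omii-src.tar.gz, local.patch
  when_to_transfer_output = ON_EXIT
  output = build.out
  error  = build.err
  log    = build.log
  queue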
  • 18. Use Condor to manage builds
    • Leverage Condor’s established features
      • Execution of a job on a remote resource
      • Persistent job execution
      • Matching of job requirements to resource capability
      • Management of dependent tasks – DAGMan (a DAG sketch follows this slide)
    • Integrated into NMI build system
      • Manages the builds of the NMI releases
      • Declare build parameters which are converted to Condor jobs
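The dependent-task side maps onto a small DAGMan input file; a sketch reusing the hypothetical build.sub above, with equally hypothetical patch and package steps:

  # builds.dag: DAGMan releases each node only after its parents succeed
  JOB Patch   patch.sub
  JOB Build   build.sub
  JOB Package package.sub
  PARENT Patch CHILD Build
  PARENT Build CHILD Package

Submitting this with condor_submit_dag builds.dag leaves DAGMan to drive the three jobs in order, persistently, resuming from the point of failure if anything goes down.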
  • 19. The Future… NMI & OMII
    • Building OMII software on NMI system
      • Rolling changes back into main software base
      • Integrating OMII builds into NMI system
    • Ongoing activity
      • Adding UK resources into the NMI pool
      • Distributed deployment & testing still to be resolved