Open Science Grid For Virtual Cell
Presentation given by Prasanna Gautam on completion of UCHC Summer internship at Center for Cell Analysis and Modeling. The work involved figuring out how to run virtual cell programs on Open Science Grid sites.

Published in: Technology, Education
Slide notes
  • In my project, I don’t really care much about the client, except maybe for notifying it and returning some information about progress, which is still far away.
      - Connection Manager maintains the active communication with the client and notifies the JMS.
      - JMS is what keeps VCell running. It keeps the state of running jobs in queues and respawns them if something fails, keeps track of client nodes (compute cluster), and ensures messages get delivered.
      - PBS runs the jobs and monitors them on the compute cluster.
  • A VO is basically just a loosely organized set of users, usually (but not always) affiliated with a core organization. A site is one of two subtypes: Compute Element (CE) or Storage Element (SE).
  • I.e., we need to agree on a few standards if we want to have a functioning grid.
  • Terms used on this slide:
      - VDT (Virtual Data Toolkit): forms the client and gateway infrastructure by taking a subset of tools like Condor, Globus and others. Pretty self-sufficient.
      - GSI (Grid Security Infrastructure) [http://www.globus.org/security/overview.html]: provides single sign-on for users on the grid. Every user is identified via a certificate provided by DOE (X.509 format); a third-party Certificate Authority certifies the link between the public key and the user.
      - Globus Toolkit: a set of tools used by a lot of grid sites to manage the workflow and monitor jobs.
      - GridFTP: like multi-threaded FTP. If your regular file transfer is a pipe, GridFTP is a collection of those pipes bundled into a hose.
      - WSRF: a framework for representing objects on the grid, such as computing resources and jobs, as resources using XML. The nice thing about WSRF is that it helps maintain state on both ends. An analogue would be the RESTful representation of objects on the web using XML or JavaScript objects.
  • This is a small section of condor_status output from a pool. It can be used to send jobs and ensure they’re placed on the right type of system.
  • Transcript

    • 1. Bringing Open Science Grid to Virtual Cell. Prasanna Gautam, Trinity College ’11
    • 2. Goals
        - Understand how Virtual Cell deploys jobs
        - Understand how OSG works
        - Figure out how to deploy and monitor jobs on OSG
        - Figure out how to do it without breaking Virtual Cell/reinventing
    • 3. VCell Software Architecture (web-based distributed client/server framework)
        [Architecture diagram. Labels shown: Client; Connection Manager; Server Manager; JMS Broker (SonicMQ); Database (Oracle); Batch Scheduler (PBSPro); Database Service; Data Export Service; Simulation Data Service; Simulation Dispatch Service; Simulation Worker Service; Compute Cluster; Compiled Simulation Jobs; Storage Cluster; Servers at CCAM.]
    • 4. Scalability
        - 200 nodes will not be enough for Virtual Cell in the foreseeable future
        - Solution?
          - Adding more machines?
            - Doesn’t always scale, but it always adds cost
          - Maybe we can get someone else to run our programs?
    • 5. Grid
        - A common framework for running jobs on remote computing nodes.
        - Terms
          - Fabric: underlying hardware infrastructure, networking
          - Middleware: software linking end-user applications and fabric
          - Virtual Organization (VO): group of certified users employing grid technology
          - Site: a computation or storage service accessible on the grid
          - Gatekeeper: a point of entry to a site for submitting jobs and querying information
    • 6. We want a grid, not a Tower of Babel!
    • 7. Open Science Grid
        - Started in 2004 (fairly new)
        - Mostly Linux, 32-bit machines
        - Common middleware (VDT)
        - Common authentication (GSI), based on Public Key Infrastructure (PKI)
        - Common API for running jobs (Globus)
        - File transfer protocols (GridFTP)
        - Common high-level communication protocols (WSRF)
    • 8. VCell meets OSG
        [Architecture diagram. Labels shown: VCell Architecture; Client; Connection Manager; Server Manager; JMS Broker (SonicMQ); Database (Oracle); Batch Scheduler (PBSPro); Database Service; Data Export Service; Simulation Data Service; Simulation Dispatch Service; Simulation Worker Service; Compute Cluster; Compiled Simulation Jobs; Storage Cluster; Servers at CCAM; Outside Firewall; OSG Services; OSG; OSG Web service.]
    • 9. My Project VCell meets OSG
    • 10. Overall structure
        - A light central server that “listens” for everything.
          - Runs on vdtclient2 (outside the firewalls, so jobs can provide feedback)
          - Listens for changes in the supporting sites
          - Platform for remote and internal jobs to communicate
          - Gives a point of administration/monitoring for the OSG part of VCell
    • 11. Overall structure
        - Services that can be spawned by PBS (Portable Batch System), which VCell uses
          - Used to
            - Search for sites
            - Notify the listener
            - Submit jobs
            - Monitor jobs
          - Should be able to run on the existing cluster
            - A lot of extra dependencies that I’m trying to minimize
    • 12. Scavenging for sites
        - Few existing tools
          - MyOSG
            - A website that gives a summary of resources
          - VORS
            - A website for getting information
            - Extremely useful, but being retired by the end of summer
          - LDAP query to the BDII server at is.grid.iu.edu
            - Glue schema
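The LDAP route on slide 12 can be sketched roughly as follows. This is a minimal illustration, not the project's code: only the host name comes from the slide. The port (2170), the search base, the GLUE 1.x attribute names and the VO name `myvo` are assumptions based on common BDII conventions, and the script only builds the `ldapsearch` command and parses its LDIF output rather than contacting the server.

```python
from typing import Dict, List

# Host name is from the slide; everything else here is an assumption based on
# common BDII / GLUE 1.x conventions and may differ on a real deployment.
BDII_HOST = "is.grid.iu.edu"
BDII_PORT = 2170                      # assumed BDII port
BASE_DN = "mds-vo-name=local,o=grid"  # assumed search base


def build_ce_query(vo: str) -> List[str]:
    """Return an ldapsearch argument list asking for CEs that accept `vo`."""
    ldap_filter = f"(&(objectClass=GlueCE)(GlueCEAccessControlBaseRule=VO:{vo}))"
    return [
        "ldapsearch", "-x", "-LLL",
        "-H", f"ldap://{BDII_HOST}:{BDII_PORT}",
        "-b", BASE_DN,
        ldap_filter,
        "GlueCEUniqueID", "GlueCEStateFreeCPUs",
    ]


def parse_ldif(text: str) -> List[Dict[str, str]]:
    """Parse simple (unfolded, single-valued) LDIF into one dict per entry."""
    entries: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():          # a blank line ends the current entry
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(": ")
        current[key] = value
    if current:
        entries.append(current)
    return entries
```

The argument list can be handed to `subprocess.run(..., capture_output=True, text=True)` and the resulting `stdout` passed to `parse_ldif`.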
    • 13. [Figure only; no transcript text]
    • 14. Matching with sites
        - Two main ways
          - Using Condor ClassAds
          - Running standard jobs and ranking sites based on them
    • 15. Condor ClassAds
        - Think of classified adverts in newspapers
          - A service provider (Compute Element or Storage Element in this case) tells what it has
          - A client (us in this case) asks for what it wants, and we try to match a suitable site
          - Easier, but not very reliable
            - Our requirements are fairly static
            - A significant rework to get it to work on the current system
    • 16. Examples
        - condor_status -const 'KeyboardIdle > 20*60 && Memory > 100'
          - Returns computers that have been idle for more than 20 minutes and have more than 100 MB of memory
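The matchmaking idea behind slides 15 and 16 can be illustrated with a toy matcher. This is a sketch of the concept only, not how Condor evaluates ClassAds. The machine names and attribute values are made up, and ranking by memory is an arbitrary choice for the example.

```python
from typing import Any, Callable, Dict, List

MachineAd = Dict[str, Any]


def match(ads: List[MachineAd],
          requirements: Callable[[MachineAd], bool],
          rank: Callable[[MachineAd], float]) -> List[MachineAd]:
    """Keep the ads that satisfy `requirements`, highest `rank` first."""
    return sorted((ad for ad in ads if requirements(ad)), key=rank, reverse=True)


def idle_with_memory(ad: MachineAd) -> bool:
    # Same condition as the condor_status constraint on slide 16:
    # idle for more than 20 minutes and more than 100 MB of memory.
    return ad["KeyboardIdle"] > 20 * 60 and ad["Memory"] > 100


# Hypothetical machine ads (KeyboardIdle in seconds, Memory in MB).
pool = [
    {"Name": "node-a", "KeyboardIdle": 3600, "Memory": 512},
    {"Name": "node-b", "KeyboardIdle": 300, "Memory": 2048},
    {"Name": "node-c", "KeyboardIdle": 1500, "Memory": 64},
    {"Name": "node-d", "KeyboardIdle": 1300, "Memory": 1024},
]

matched = match(pool, idle_with_memory, rank=lambda ad: ad["Memory"])
print([ad["Name"] for ad in matched])
```

Separating the requirement predicate from the rank function mirrors the split between what a job must have and what it would prefer.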
    • 17. Examples [figure]
    • 18. Using Heuristics
        - Running jobs and ranking sites
        - Send small jobs
        - Profile them
        - Over time we’ll have a good understanding of our portion of the grid
        - Definitely harder
        - But we can be smarter about deploying jobs
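A minimal sketch of the probe-and-rank idea on slide 18, assuming each small test job reports whether it succeeded and how long it took. The scoring formula (success rate divided by mean runtime) and the class name `SiteProfiler` are illustrative choices, not something the slides specify.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


class SiteProfiler:
    """Accumulate results of small probe jobs and rank sites by them."""

    def __init__(self) -> None:
        # site name -> list of (succeeded, wall-clock seconds)
        self._runs: Dict[str, List[Tuple[bool, float]]] = defaultdict(list)

    def record(self, site: str, succeeded: bool, seconds: float) -> None:
        self._runs[site].append((succeeded, seconds))

    def score(self, site: str) -> float:
        """Higher is better: reliable sites with short runtimes win."""
        runs = self._runs.get(site, [])
        ok_times = [t for ok, t in runs if ok]
        if not ok_times:
            return 0.0  # never succeeded (or never probed)
        success_rate = len(ok_times) / len(runs)
        return success_rate / mean(ok_times)

    def ranked(self) -> List[str]:
        return sorted(self._runs, key=self.score, reverse=True)
```

As more probe results accumulate, `ranked()` gives a progressively better picture of which sites are worth sending real jobs to.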
    • 19. Example: job count chart for BNL_ATLAS_1 (source: Gratia)
    • 20. Running Jobs
        - Two major ways
          - Condor-G
          - Globus Toolkit
    • 21. Condor-G
        - Submit directly to the Condor pool on a remote site
          - Doesn’t always work
          - In our case, we use PBS
            - Condor ClassAds take care of this, but it takes a little work upfront
          - I tried to use OSG Matchmaker and let it sort things out.
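For orientation, a Condor-G submission is driven by a submit description file. The sketch below generates one, assuming a pre-WS GRAM (`gt2`) gatekeeper that fronts a PBS jobmanager. The host name `gatekeeper.example.org`, the script name `run_sim.sh` and the output file names are hypothetical, and the actual resource string depends on the target site.

```python
def condor_g_submit_description(gatekeeper: str, executable: str,
                                jobmanager: str = "jobmanager-pbs",
                                tag: str = "job") -> str:
    """Build the text of a Condor-G submit file for one grid job.

    Assumes a GRAM2 ("gt2") gatekeeper; other sites may need a different
    grid_resource type or jobmanager name.
    """
    lines = [
        "universe = grid",
        f"grid_resource = gt2 {gatekeeper}/{jobmanager}",
        f"executable = {executable}",
        "transfer_executable = true",
        f"output = {tag}.out",
        f"error = {tag}.err",
        f"log = {tag}.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical gatekeeper and executable, for illustration only.
print(condor_g_submit_description("gatekeeper.example.org", "run_sim.sh"))
```

The generated text would be saved to a file and passed to `condor_submit`.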
    • 22. Globus Toolkit
        - Provides middleware for deploying and monitoring jobs
        - Provides Java, C and Python APIs
          - jGlobus makes sense to deploy
          - Better integration with the existing codebase
        - We can design complete workflows using these tools
    • 23. What I’d like to do
        - Select a site
        - Start a job
        - Attach a listener that polls as a PBS job for changes
        - Pull incremental progress as the job is running
        - Keep a transactional status of the job in the Oracle database
    • 24. What I was almost able to do
        - Select a site using Condor Matchmaker
          - It seems to select Harvard SBGrid almost all the time
          - So, I’m taking a random site for testing
        - Record the URL that the Globus Gatekeeper provides in MySQL
        - Poll for status and wait for the DONE signal
        - If it is done, pull the output
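The record, poll and fetch loop on slide 24 can be sketched as follows. This is not the project's code: `sqlite3` stands in for MySQL so the example is self-contained, `get_status` and `fetch_output` are hypothetical callables that would wrap the real Globus client calls, and the status names other than DONE are assumptions.

```python
import sqlite3
import time
from typing import Callable


def track_job(db: sqlite3.Connection, job_url: str,
              get_status: Callable[[str], str],
              fetch_output: Callable[[str], str],
              poll_seconds: float = 30.0, max_polls: int = 100) -> str:
    """Record a job URL, poll until it reports DONE, then return its output."""
    db.execute("CREATE TABLE IF NOT EXISTS jobs (url TEXT PRIMARY KEY, status TEXT)")
    db.execute("INSERT OR REPLACE INTO jobs VALUES (?, ?)", (job_url, "SUBMITTED"))
    db.commit()
    for _ in range(max_polls):
        status = get_status(job_url)
        # Persist every observed state so a restart can pick up where it left off.
        db.execute("UPDATE jobs SET status = ? WHERE url = ?", (status, job_url))
        db.commit()
        if status == "DONE":
            return fetch_output(job_url)
        if status == "FAILED":  # assumed failure state name
            raise RuntimeError(f"job failed: {job_url}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"job did not finish after {max_polls} polls: {job_url}")
```

Passing the status and output functions in as parameters keeps the loop testable without a grid connection, and `max_polls` bounds the wait if a site never reports completion.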
    • 25. Conclusion
        - It really is feasible to run jobs like Virtual Cell on OSG
        - Just not in 10 weeks from start to finish
        - OSG is an evolving system
          - Our decisions have to be flexible
        - There are a lot of architecture decisions we need to resolve.
    • 26. Future
        - Continuity of the project
        - Being able to run from the VCell cluster
        - Keeping a dynamic view of the grid as a whole
        - Feedback to the user
    • 27. Acknowledgments
        - Dr. Ion Moraru
        - Jeff Dutton
        - James Schaff
        - Dr. Greg Huber
        - Mats Rynge (RENCI)
        - Arvind Gopu (Indiana University, OSG-GOC)
        - Peter Doherty (Harvard SBGrid)
