11. Principles of service-oriented architecture; Principles of high-throughput computing; Principles of distributed data management; Principles of job submission and execution management; Principles of using distributed and high performance systems; Higher level APIs: OGSA-DAI, SAGA and metadata management; Workflows
13. 1. Principles of job submission and execution management. Vision: UNiform Interface to COmputing REsources (UNICORE): seamless, secure, and intuitive. History: 08/1997 – 12/2002: UNICORE and UNICORE Plus projects; initial development started in two German projects funded by the German Ministry of Education and Research (BMBF); continuation in different EU projects since 2002; open source community development since summer 2004.
14. UNICORE 6: Guiding Principles, Implementation Strategies (http://www.unicore.eu). Open source under BSD license with software hosted on SourceForge. Standards-based: OGSA-conformant, WS-RF 1.2 compliant. Open, extensible Service-Oriented Architecture (SOA). Interoperable with other Grid technologies. Seamless, secure and intuitive, following a vertical end-to-end approach. Mature security: X.509, proxy and VO support. Workflow support tightly integrated while being extensible for different workflow languages and engines for domain-specific usage. Application integration mechanisms on the client, services and resource level. Variety of clients: graphical, command-line, API, portal, etc. Quick and simple installation and configuration. Support for many operating systems (Windows, MacOS, Linux, UNIX) and batch systems (LoadLeveler, Torque, SLURM, LSF, OpenCCS). Implemented in Java to achieve platform independence.
15. UNICORE 6 architecture (diagram, http://www.unicore.eu). Clients: scientific clients and applications; URC (Eclipse-based rich client); HiLA (programming API); UCC (command-line client); portal, e.g. GridSphere. Web service stack: X.509, proxies, SOAP, WS-RF, WS-I, JSDL. Authentication: Gateway for the central services; Gateway – Site 1; Gateway – Site 2. Central services running in WS-RF hosting environments: Service Registry; Workflow Engine; Service Orchestrator; CIS Info Service; UVOS VO Service (OGSA-RUS, UR, GLUE 2.0). Grid services hosting (Site 1 and Site 2): UNICORE WS-RF hosting environment with UNICORE Atomic Services and OGSA-* services (OGSA-ByteIO, OGSA-BES, JSDL, HPC-P, OGSA-RUS, UR). Job incarnation: XNJS – Site 1 and XNJS – Site 2, each with an IDB. Authorization: XACML entity and XUUDB per site (X.509, XACML, SAML, proxies). Target System Interface – Site 1 and Target System Interface – Site 2: DRMAA; local RMS (e.g. Torque, LL, LSF, etc.); USpace. Data transfer to external storage: GridFTP, proxies.
16. Workflows in UNICORE (http://www.unicore.eu). Two-layer architecture for scalability. Workflow engine: based on the Shark open-source XPDL engine; pluggable, domain-specific workflow languages. Service orchestrator: job execution and monitoring; callback to the workflow engine; brokering based on pluggable strategies. Clients: GUI client based on Eclipse; command-line submission of workflows is also possible.
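The two-layer split on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not UNICORE code: the names `Site`, `Orchestrator`, `WorkflowEngine`, `most_free_cores` and `shortest_queue` are hypothetical, and job execution is reduced to an immediate callback. It shows a workflow layer handing jobs to an orchestrator, which picks a site through a pluggable brokering strategy and calls back when the job is done.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Site:
    """Hypothetical description of one execution site."""
    name: str
    free_cores: int
    queued_jobs: int


# A brokering strategy is any function that picks one site from a list.
Strategy = Callable[[Sequence[Site]], Site]


def most_free_cores(sites: Sequence[Site]) -> Site:
    return max(sites, key=lambda s: s.free_cores)


def shortest_queue(sites: Sequence[Site]) -> Site:
    return min(sites, key=lambda s: s.queued_jobs)


class Orchestrator:
    """Lower layer: chooses a site per job and reports completion."""

    def __init__(self, sites: Sequence[Site], strategy: Strategy,
                 on_finished: Callable[[str, str], None]) -> None:
        self.sites = list(sites)
        self.strategy = strategy
        self.on_finished = on_finished

    def run(self, job: str) -> str:
        site = self.strategy(self.sites)
        # A real orchestrator would submit and monitor the job here;
        # this sketch calls back immediately.
        self.on_finished(job, site.name)
        return site.name


class WorkflowEngine:
    """Upper layer: walks the workflow steps and records the callbacks."""

    def __init__(self, sites: Sequence[Site], strategy: Strategy) -> None:
        self.log: List[Tuple[str, str]] = []
        self.orchestrator = Orchestrator(sites, strategy, self.job_finished)

    def job_finished(self, job: str, site: str) -> None:
        self.log.append((job, site))

    def execute(self, steps: Sequence[str]) -> List[Tuple[str, str]]:
        for job in steps:
            self.orchestrator.run(job)
        return self.log
```

Swapping `most_free_cores` for `shortest_queue` changes where jobs land without touching either layer, which is the point of making the strategy pluggable.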
18. High-Throughput Computing. Large number of tasks that can be executed independently: parameter studies; Monte Carlo or stochastic methods; genome sequencing (matching); analysis of LHC data: starting from this, looking for this (1 in 10^13).
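A minimal sketch of what "tasks that can be executed independently" means in practice, written in plain Python rather than against any grid middleware. The function names `estimate_pi` and `run_study` are made up for this example, and a local thread pool stands in for the pool of machines a high-throughput system would provide. Each task gets its own seed and shares no state with the others, so the tasks can run in any order and on any worker.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def estimate_pi(seed: int, samples: int = 20_000) -> float:
    """One independent task: a Monte Carlo estimate of pi from its own seed."""
    rng = random.Random(seed)  # private generator, so tasks share no state
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


def run_study(seeds, workers: int = 4):
    """Run one task per seed and return the results in seed order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(estimate_pi, seeds))


if __name__ == "__main__":
    results = run_study(range(8))
    print(sum(results) / len(results))  # average of the independent estimates
```

Because no task depends on another, the same study could be split into eight separate jobs and submitted to a batch system; only the final averaging step needs all results.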
19. 2. Principles of high-throughput computing. Vision: Condor provides high-throughput computing in a variety of environments: local dedicated clusters (machine rooms); local opportunistic (desktop) computers; Grid environments. Can submit jobs to other systems. Can run workflows of jobs. Can run parallel jobs: independently parallel (lots of single jobs); tightly coupled (such as MPI).
20. 2. Principles of high-throughput computing. History and Activity: distributed computing research performed by a team of ~35 faculty, full-time staff and students, established in 1985, who: face software/middleware engineering challenges in a UNIX/Linux/Windows/OS X environment; are involved in national and international collaborations; interact with users in academia and industry; maintain and support a distributed production environment (more than 5000 CPUs at UW); educate and train students.
21. Condor Project: Main Threads of Activities. Distributed Computing Research: develop and evaluate new concepts, frameworks and technologies. Develop and maintain Condor; support our users (more on next slide). The Open Science Grid (OSG): build and operate a national High Throughput Computing infrastructure. The Grid Laboratory Of Wisconsin (GLOW): build, maintain and operate a distributed computing and storage infrastructure on the UW campus. The NSF Middleware Initiative (NMI): develop, build and operate a national Build and Test facility powered by Metronome (ETICS-II).
23. Web Services: XML, DCE RPC, DCOM, RMI, CORBA. “Web services has dramatically reduced the programming and management cost of publishing and receiving information” (Jim Gray, Microsoft Research). EMBRACE: 4-year EU project to establish services for the bioinformatics community.
24. 3. Principles of service-oriented architectures. Vision: provide the fundamental components to get the grid working. History: starting point in I-WAY, a distributed high-performance network demonstrated at the SuperComputing '95 conference and exhibition.
25. …14 Years Later. 4 major versions. Components to address the original problems. Many new fields; recent hot topics: service-oriented science, virtualization. Diverse application areas; recently: lots of bioinformatics and medical apps; others include: earthquakes, particle physics, earth sciences.
26. Globus Software now: many components (diagram). Categories: Security, Execution Mgmt, Info Services, Common Runtime, Data Mgmt, Other. Globus Projects: OGSA-DAI, GT4, MPICH-G2, Data Rep, Replica Location, Java Runtime, MyProxy, Delegation, GridWay, GridFTP, MDS4, CAS, C Runtime, GSI-OpenSSH, Incubator Mgmt, Reliable File Transfer, GRAM, Python Runtime, C Sec, GT4 Docs. Incubator Projects: Cog WF, GAARDS, VirtWkSp, MEDICUS, Metrics, OGRO, GDTE, UGP, GridShib, Dyn Acct, Gavia JSC, DDM, LRMA, HOC-SA, PURSE, Introduce, WEEP, Gavia MS, SGGC, ServMark, others...
29. EGEE Project Overview: 17000 users; 136000 LCPUs (cores); 25 PB disk; 39 PB tape; 12 million jobs/month (+45% in a year); 268 sites (+5% in a year); 48 countries (+10% in a year); 162 VOs (+29% in a year). Source: Technical Status, Steven Newhouse, EGEE-III First Review, 24-25 June 2009.
38. 5. Principles of using distributed and high performance systems. ARC middleware (Advanced Resource Connector): open source, out-of-the-box Grid solution software which enables production-quality computational and data Grids (released in May 2002); development is coordinated by NDGF; emphasis is put on scalability, stability, reliability and performance; builds upon standard OS solutions: OpenLDAP, OpenSSL, SASL and Globus Toolkit; adds services not provided by Globus; extends or completely replaces some Globus components.
53. 6. Higher level APIs: OGSA-DAI, SAGA and metadata management (S-OGSA). OGSA-DAI Vision: enable the sharing of data resources for collaboration, by supporting: data access (access to structured data in distributed heterogeneous data resources); data transformation (e.g. expose data in schema X to users as data in schema Y); data integration (e.g. expose multiple databases to users as a single virtual database); data delivery (delivering data to where it's needed by the most appropriate means, e.g. web service, e-mail, HTTP, FTP, GridFTP).
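The transformation and integration ideas on this slide can be illustrated with a small sketch. It does not use the OGSA-DAI API; it relies only on Python's standard `sqlite3` module, and the table names, column names and the helper functions `make_db` and `virtual_readings` are invented for the example. Two in-memory databases with different schemas stand in for distributed heterogeneous data resources, and a single function exposes them as one virtual result set in a common schema.

```python
import sqlite3


def make_db(create_sql, insert_sql, rows):
    """Create an in-memory database standing in for one remote data resource."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    conn.executemany(insert_sql, rows)
    return conn


# Site A stores temperatures in Celsius (schema Y, the one users see).
site_a = make_db(
    "CREATE TABLE readings (sensor TEXT, celsius REAL)",
    "INSERT INTO readings VALUES (?, ?)",
    [("s1", 20.5), ("s2", 21.0)],
)

# Site B uses a different schema (schema X): other names, Fahrenheit values.
site_b = make_db(
    "CREATE TABLE temps (id TEXT, fahrenheit REAL)",
    "INSERT INTO temps VALUES (?, ?)",
    [("s3", 68.0)],
)


def virtual_readings():
    """Expose both resources as one virtual table of (sensor, celsius)."""
    rows = list(site_a.execute("SELECT sensor, celsius FROM readings"))
    # Data transformation: map schema X onto schema Y before merging.
    rows += [
        (sensor, round((fahrenheit - 32.0) * 5.0 / 9.0, 2))
        for sensor, fahrenheit in site_b.execute("SELECT id, fahrenheit FROM temps")
    ]
    return sorted(rows)
```

A caller of `virtual_readings()` never sees that the rows come from two places or that one source used different units, which is the user-facing effect the slide describes.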
54. 6. Higher level APIs: OGSA-DAI, SAGA and metadata management (S-OGSA). OGSA-DAI History: the OGSA-DAI project started in February 2002 as part of the UK e-Science Grid Core Program. It is today part of OMII-UK, a partnership between: OMII, The University of Southampton; myGrid, The University of Manchester; OGSA-DAI, The University of Edinburgh.
55. 6. Higher level APIs: OGSA-DAI, SAGA and metadata management (S-OGSA). Vision of a Simple API for Grid Applications (SAGA): provide a simple programmatic interface that is widely adopted, usable and available for enabling applications for the grid. Simplicity: easy to use, install, administer and maintain. Uniformity: provides support for different application programming languages as well as consistent semantics and style for different Grid functionality. Scalability: contains mechanisms for the same application (source) code to run on a variety of systems ranging from laptops to HPC resources. Genericity: adds support for different grid middleware, even concurrent ones. Modularity: provides a framework that is easily extendable.
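The uniformity and genericity goals can be pictured with a small adaptor sketch. This is not the SAGA API: `JobDescription`, `to_shell`, `to_condor`, `ADAPTORS` and `render` are hypothetical names, and selecting an adaptor by URL scheme is an assumption made for the illustration. The same job description is rendered either as a local shell command or as a Condor-style submit description, so application code stays identical across backends.

```python
import shlex
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class JobDescription:
    """Middleware-neutral description of a job (hypothetical)."""
    executable: str
    arguments: list = field(default_factory=list)


def to_shell(job: JobDescription) -> str:
    """Adaptor for local execution: a quoted shell command line."""
    return shlex.join([job.executable, *job.arguments])


def to_condor(job: JobDescription) -> str:
    """Adaptor for Condor: a minimal submit description."""
    return "\n".join([
        "universe = vanilla",
        f"executable = {job.executable}",
        f"arguments = {' '.join(job.arguments)}",
        "queue",
    ])


# New middleware is supported by registering one more adaptor here.
ADAPTORS = {"fork": to_shell, "condor": to_condor}


def render(job: JobDescription, url: str) -> str:
    """Pick the adaptor from the URL scheme and render the job for it."""
    scheme = urlparse(url).scheme
    try:
        adaptor = ADAPTORS[scheme]
    except KeyError:
        raise ValueError(f"no adaptor for scheme {scheme!r}") from None
    return adaptor(job)
```

Only the URL passed to `render` changes between a laptop run and a pool run; the `JobDescription` built by the application is the same in both cases.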
56. 6. Higher level APIs: OGSA-DAI, SAGA and metadata management (S-OGSA). Metadata management: make metadata a princess in the kingdom of the Semantic Web.
58. 7. Workflows. Organize your work, e.g.: gather initial data; pre-processing of data; define computing job(s); initiate job(s); gather results; post-processing of results; repeat. During the school you will learn how to do this in different ways with the systems studied. This can also be done with dedicated workflow systems: Taverna, P-GRADE Portal, …
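The steps listed on this slide map directly onto a chain of functions. The sketch below is a toy, not the input format of Taverna or P-GRADE Portal; every function name and the sample data are invented for illustration, and the jobs run locally in a loop where a real workflow system would dispatch them to grid resources.

```python
from statistics import mean


def gather_initial_data():
    """Step 1: collect raw input (hard-coded sample values here)."""
    return [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]


def preprocess(data):
    """Step 2: clean the input by dropping duplicates and sorting."""
    return sorted(set(data))


def define_jobs(data, chunk_size=2):
    """Step 3: split the data into independent jobs of up to chunk_size items."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def run_job(chunk):
    """Step 4: the computation of one job, here a sum of squares."""
    return sum(value * value for value in chunk)


def postprocess(results):
    """Step 6: reduce the gathered results to one number."""
    return mean(results)


def run_workflow():
    data = preprocess(gather_initial_data())
    jobs = define_jobs(data)
    results = [run_job(job) for job in jobs]  # steps 4 and 5: run, then gather
    return postprocess(results)
```

Writing the workflow as explicit stages makes the "repeat" step cheap: changing the input or the chunk size only requires calling `run_workflow()` again.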
59. Motivations for developing P-GRADE portal. P-GRADE portal should: give an answer for all the questions of an e-scientist; hide the complexity of the underlying grid middleware; provide a high-level graphical user interface that is easy to use for e-scientists; support many different grid programming approaches (see Morris Riedel’s talk): simple scripts & control (sequential and MPI job execution), scientific application plug-ins (based on GEMLCA), complex workflows, parameter sweep applications (both on job and workflow level), interoperability (transparent access to grids based on different middleware technology); support three levels of parallelism.
60. Short History of P-GRADE portal. P-GRADE: Parallel Grid Application and Development Environment. Initial development started in the Hungarian SuperComputing Grid project in 2003; it has been continuously developed since then. Detailed information: http://portal.p-grade.hu/ Open source community development since January 2008: https://sourceforge.net/projects/pgportal/
61. Integrating Practical: Principles of service-oriented architecture; Principles of high-throughput computing; Principles of distributed data management; Principles of job submission and execution management; Principles of using distributed and high performance systems; Higher level APIs: OGSA-DAI, SAGA and metadata management; Workflows
Editor's Notes
Yellow – gLite, Green – externally supported components, gLite consortium