Data Grid



Published in: Technology

  1. 1. Grid Computing: from a solid past to a bright future? David Groep, NIKHEF, 2002-08-28
  2. 2. The Grid: a vision? <ul><li>“Imagine that you could plug your computer into the wall and have direct access to huge computing resources immediately, just as you plug in a lamp to get instant light. … Far from being science-fiction, this is the idea the XXXXXX project is about to make into reality. …” (from a project brochure in 2001) </li></ul>
  3. 3. The Need for Grids: LHC <ul><li>Physics @ CERN </li></ul><ul><li>LHC particle accelerator </li></ul><ul><li>operational in 2007 </li></ul><ul><li>5-10 Petabyte per year </li></ul><ul><li>150 countries </li></ul><ul><li>> 10000 Users </li></ul><ul><li>lifetime ~ 20 years </li></ul>Trigger and data flow: 40 MHz (40 TB/sec) → level 1 - special hardware → 75 kHz (75 GB/sec) → level 2 - embedded → 5 kHz (5 GB/sec) → level 3 - PCs → 100 Hz (100 MB/sec) → data recording & offline analysis
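The trigger figures on this slide pair each event rate with a data rate, and dividing one by the other suggests an event size of roughly 1 MB. A minimal sketch of that arithmetic follows. The decimal units, the stage labels (read off the flattened diagram) and the helper names are assumptions, not part of the slides.

```python
# Event rate (Hz) and data rate (bytes/s) for each stage, as quoted on the slide.
# Assumption: decimal units, i.e. 1 MB = 10**6 bytes.
MB, GB, TB = 10**6, 10**9, 10**12

# Stage labels are an assumed reading of the flattened trigger diagram.
STAGES = [
    ("detector output", 40e6, 40 * TB),
    ("after level 1", 75e3, 75 * GB),
    ("after level 2", 5e3, 5 * GB),
    ("after level 3", 100.0, 100 * MB),
]


def implied_event_size(rate_hz, bytes_per_s):
    """Average bytes per event implied by an event-rate / data-rate pair."""
    return bytes_per_s / rate_hz


def rejection_factor(rate_in_hz, rate_out_hz):
    """How many input events there are for each event that is kept."""
    return rate_in_hz / rate_out_hz


if __name__ == "__main__":
    for name, rate, bandwidth in STAGES:
        size_mb = implied_event_size(rate, bandwidth) / MB
        print(f"{name}: {size_mb:.1f} MB per event")
    print("overall rejection:", rejection_factor(STAGES[0][1], STAGES[-1][1]))
```

Every stage works out to the same event size, so the three trigger levels reduce the data volume purely by discarding events.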
  4. 4. CPU & Data Requirements [Figure: estimated CPU capacity required at CERN, in K SI95 (0 to 5,000), by year (1998 to 2010), for LHC experiments and other experiments, compared with Moore’s law] Moore’s law: some measure of the capacity technology advances provide for a constant number of processors or investment. Jan 2000: 3.5K SI95. < 50% of the main analysis capacity will be at CERN.
  5. 5. More Reasons Why: ENVISAT <ul><li>3500 MEuro programme cost </li></ul><ul><li>10 instruments on board </li></ul><ul><li>200 Mbps data rate to ground </li></ul><ul><li>400 Tbytes data archived/year </li></ul><ul><li>~100 ‘standard’ products </li></ul><ul><li>10+ dedicated facilities in Europe </li></ul><ul><li>~700 approved science user projects </li></ul>
  6. 6. And More … Bio-informatics <ul><li>For access to data </li></ul><ul><ul><li>Large network bandwidth to access computing centers </li></ul></ul><ul><ul><li>Support for data bank replicas (easier and faster mirroring) </li></ul></ul><ul><ul><li>Distributed data banks </li></ul></ul><ul><li>For interpretation of data </li></ul><ul><ul><li>Grid-enabled algorithms: BLAST on distributed data banks, distributed data mining </li></ul></ul>
  7. 7. And even more … <ul><li>financial services, life sciences, strategy evaluation, … </li></ul><ul><li>instant immersive teleconferencing </li></ul><ul><li>remote experimentation </li></ul><ul><li>pre-surgical planning and simulation </li></ul>
  8. 8. Why is the Grid successful? <ul><li>Applications need large amounts of data or computation </li></ul><ul><li>Ever larger, distributed user community </li></ul><ul><li>Network grows faster than compute power/storage </li></ul>
  9. 9. Network bandwidth growth <ul><ul><li>Source: The Informal Supercomputer by Mark Baker (5/96) </li></ul></ul>
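  10. 10. Distributed Time Line <ul><li>1965 DARPA starts network research </li></ul><ul><li>1969 ARPA net with 4 hosts </li></ul><ul><li>1969 Creeper & Reaper </li></ul><ul><li>1972 TELNET </li></ul><ul><li>1973 Ethernet </li></ul><ul><li>1974 links to UK, TCP </li></ul><ul><li>1978 RPC conceived (by Per Hansen) </li></ul><ul><li>~ 1980 “commodity computing” </li></ul><ul><li>1985 Condor </li></ul><ul><li>1986 1st IETF </li></ul><ul><li>1990 TBL creates the Web </li></ul><ul><li>1994 W3C </li></ul><ul><li>1997 CORBA </li></ul><ul><li>1997 Globus </li></ul><ul><li>1999 1st GF </li></ul><ul><li>2001 1st GGF </li></ul><ul><li>2001 WSDL </li></ul><ul><li>2002 EDG 1.2 </li></ul>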
  11. 11. Inter-domain communication <ul><li>The Internet community spawned 3360 RFCs (as of August 2nd, 2002) </li></ul><ul><li>Myriad of different protocols and APIs </li></ul><ul><li>Be strict in what you send, be liberal in what you accept </li></ul><ul><li>Inter-domain by nature </li></ul><ul><li>Increasing focus on security </li></ul>
  12. 12. Intra-domain tools <ul><li>RPC proved hugely successful within domains </li></ul><ul><ul><li>YP </li></ul></ul><ul><ul><li>Network File System </li></ul></ul><ul><ul><li>Typical client-server stuff… </li></ul></ul><ul><li>CORBA </li></ul><ul><ul><li>Extension of RPC to OO design model </li></ul></ul><ul><ul><li>Diversification </li></ul></ul><ul><li>Latest trend: web services </li></ul>
  13. 13. The beginnings of the Grid <ul><li>Grown out of distributed computing </li></ul><ul><ul><li>Gigabit network test beds & meta-computing </li></ul></ul><ul><ul><li>Supercomputer sharing (I-WAY) </li></ul></ul><ul><ul><li>Condor ‘flocking’ </li></ul></ul><ul><li>Focus shifts to inter-domain operations </li></ul>GUSTO meta-computing test bed in 1999
  14. 14. The Grid Ian Foster and Carl Kesselman, editors, “The Grid: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann, 1999
  15. 15. The One-Liner <ul><li>Resource sharing and coordinated problem solving in dynamic multi-institutional virtual organisations </li></ul>
  16. 16. Standards Requirements <ul><li>Standards are key to inter-domain operations </li></ul><ul><li>GGF established in 2001 </li></ul><ul><li>Approx. 40 working & research groups </li></ul>
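  17. 17. Protocol Layers & Bodies <ul><li>OSI layers: Physical, Data Link, Network, Transport, Session, Presentation, Application </li></ul><ul><li>Internet Protocol Architecture: Link, Internet, Transport, Application </li></ul><ul><li>Grid layers: Fabric, Connectivity, Resource, Collective, Application </li></ul><ul><li>Standard body: IEEE </li></ul><ul><li>Standard body: IETF </li></ul><ul><li>Standard bodies: GGF, W3C </li></ul>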
  18. 18. Grid Architecture (v1) <ul><li>Application </li></ul><ul><li>Collective, “Coordinating multiple resources”: ubiquitous infrastructure services, app-specific distributed services </li></ul><ul><li>Resource, “Sharing single resources”: negotiating access, controlling use </li></ul><ul><li>Connectivity, “Talking to things”: communication (Internet protocols) & security </li></ul><ul><li>Fabric, “Controlling things locally”: access to, & control of, resources </li></ul>Shown alongside the Internet Protocol Architecture: Application, Transport, Internet, Link
  19. 19. Grid Architecture <ul><li>Applications </li></ul><ul><li>Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G </li></ul><ul><li>Grid Services: GRAM, GridFTP, MDS, ReplicaSrv, Grid Security Infrastructure (GSI) </li></ul><ul><li>Grid Fabric: Condor, MPI, PBS, Internet, Linux, SUN </li></ul><ul><li>Make all resources talk standard protocols </li></ul><ul><li>Promote interoperability of application toolkits, similar to the interoperability of networks through Internet standards </li></ul>
  20. 20. What should the Grid provide? <ul><li>Dependable, consistent and pervasive access </li></ul><ul><li>Interoperation among organisations </li></ul><ul><li>Challenges: </li></ul><ul><ul><li>Complete transparency for the user </li></ul></ul><ul><ul><li>Uniform access methods for computing, data and information </li></ul></ul><ul><ul><li>Secure, trustworthy environment for providers </li></ul></ul><ul><ul><li>Accounting (and billing) </li></ul></ul><ul><ul><li>Management-free ‘Virtual Organizations’ </li></ul></ul>
  21. 21. Grid Middleware <ul><li>Globus Project started 1997 </li></ul><ul><li>Current de facto standard </li></ul><ul><li>Reference implementation of Global Grid Forum standards </li></ul><ul><li>Toolkit ‘bag-of-services’ approach </li></ul><ul><li>Several middleware projects: </li></ul><ul><ul><li>EU DataGrid </li></ul></ul><ul><ul><li>CrossGrid, DataTAG, PPDG, GriPhyN </li></ul></ul><ul><ul><li>In NL: ICES/KIS Virtual Lab, VL-E </li></ul></ul>
  22. 22. Condor <ul><li>Scavenging cycles off idle workstations </li></ul><ul><li>Leading themes: </li></ul><ul><ul><li>Make a job feel ‘at home’ </li></ul></ul><ul><ul><li>Don’t ever bother the resource owner! </li></ul></ul><ul><li>Bypass: redirect data to process </li></ul><ul><li>ClassAds: matchmaking concept </li></ul><ul><li>DAGman: dependent jobs </li></ul><ul><li>Kangaroo: file staging & hopping </li></ul><ul><li>NeST: allocated ‘storage lots’ </li></ul><ul><li>PFS: Pluggable File System </li></ul><ul><li>Condor-G: reliable job control for the Grid </li></ul>
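The matchmaking idea behind ClassAds can be sketched as a two-sided check: a job and a machine each advertise attributes plus requirements on the other party, and a match needs both sides to agree. The sketch below uses plain Python dictionaries and lambdas. It is not the ClassAd language, and all attribute names are hypothetical.

```python
def matches(job, machine):
    """Two-sided match: each ad's requirements must accept the other ad."""
    return job["requirements"](machine) and machine["requirements"](job)


def best_match(job, machines):
    """Return the matching machine the job ranks highest, or None."""
    candidates = [m for m in machines if matches(job, m)]
    return max(candidates, key=job["rank"], default=None)


# Hypothetical machine ads: attributes plus a policy set by the resource owner.
machines = [
    {"name": "ws1", "memory_mb": 256, "idle": True,
     "requirements": lambda job: True},
    {"name": "ws2", "memory_mb": 1024, "idle": True,
     "requirements": lambda job: job["owner"] != "banned"},
    {"name": "ws3", "memory_mb": 2048, "idle": False,
     "requirements": lambda job: True},
]

# Hypothetical job ad: needs an idle machine with enough memory, prefers more.
job = {
    "owner": "alice",
    "requirements": lambda m: m["idle"] and m["memory_mb"] >= 512,
    "rank": lambda m: m["memory_mb"],
}

if __name__ == "__main__":
    chosen = best_match(job, machines)
    print(chosen["name"] if chosen else "no match")
```

Keeping the owner's policy inside the machine ad is what lets the resource owner stay in control without being bothered by each request.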
  23. 23. Application Toolkits <ul><li>Collect and abstract services in an orderly fashion </li></ul><ul><li>Cactus: plug-n-play numeric simulations </li></ul><ul><li>Numeric propulsion system simulation NPSS </li></ul><ul><li>Commodity Grid Toolkits (CoGs): JAVA, CORBA, … </li></ul><ul><li>NIMROD-G: parameter sweeping simulations </li></ul><ul><li>Condor: high-throughput computing </li></ul><ul><li>GENIUS, VLAM-G, … (web) portals to the Grid </li></ul>
  24. 24. Grids Today
  25. 25. Grid Protocols Today <ul><li>Based on the popular protocols on the ’Net </li></ul><ul><li>Use common Grid Security Infrastructure: </li></ul><ul><ul><li>Extensions to TLS for delegation (single sign-on) </li></ul></ul><ul><ul><li>Uses GSS-API standard where possible </li></ul></ul><ul><li>GRAM (resource allocation): attrib/value pairs over HTTP </li></ul><ul><li>GridFTP (bulk file transfer): FTP with GSI and high-throughput extras (striping) </li></ul><ul><li>MDS (monitoring and discovery service): LDAP + schemas </li></ul><ul><li>… </li></ul>
  26. 26. Getting People Together: Virtual Organisations <ul><li>The user community ‘out there’ is huge & highly dynamic </li></ul><ul><li>Applying at each individual resource does not scale </li></ul><ul><li>Users get together to form Virtual Organisations: </li></ul><ul><ul><li>Temporary alliance of stakeholders (users and/or resources) </li></ul></ul><ul><ul><li>Various groups and roles </li></ul></ul><ul><ul><li>Managed out-of-band by (legal) contracts </li></ul></ul><ul><li>Authentication, Authorization, Accounting (AAA) </li></ul>
  27. 27. Grid Security Infrastructure <ul><li>Requirements: </li></ul><ul><ul><li>Strong authentication and accountability </li></ul></ul><ul><ul><li>Traceability </li></ul></ul><ul><ul><li>“Secure”! </li></ul></ul><ul><ul><li>Single sign-on </li></ul></ul><ul><ul><li>Dynamic VOs: “proxying”, “delegation” </li></ul></ul><ul><ul><li>Work everywhere (“easyEverything”, airport kiosk, handheld) </li></ul></ul><ul><ul><li>Multiple roles for each user </li></ul></ul><ul><ul><li>Easy! </li></ul></ul>
  28. 28. Authentication & PKI <ul><li>EU DataGrid PKI: 1 PMA, 13 Certification Authorities </li></ul><ul><li>Automatic policy evaluation tools </li></ul><ul><li>Largest Grid-PKI in the world (and growing) </li></ul>Obtaining a certificate: <ul><li>Alice generates a key pair, public key ( e , n ) and private key ( d , n ), and sends the public key to the CA in a certificate request (CommonName=‘Alice’, Organization=‘KNMI’) </li></ul><ul><li>The CA checks the identifier in the request against the identity of the requestor </li></ul><ul><li>The CA operator signs the request with the CA private key (the CA itself holds a self-signed certificate) </li></ul><ul><li>The CA ships the new certificate to Alice </li></ul>
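The key pairs ( e , n ) and ( d , n ) on this slide are RSA keys, and the CA's signature over Alice's request can be illustrated with textbook RSA. This is a toy sketch only: it uses small, publicly known Mersenne primes and no padding, whereas a real CA issues X.509 certificates with a proper signature scheme. All function names and the request encoding are hypothetical.

```python
import hashlib


def make_keypair(p, q, e=65537):
    """Textbook RSA: return ((e, n), (d, n)) for primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse; needs Python 3.8+
    return (e, n), (d, n)


def digest(message, n):
    """Hash the message to an integer smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def sign(message, private_key):
    d, n = private_key
    return pow(digest(message, n), d, n)


def verify(message, signature, public_key):
    e, n = public_key
    return pow(signature, e, n) == digest(message, n)


# Toy primes (Mersenne primes): far too small and too well known for real use.
alice_public, alice_private = make_keypair(2**61 - 1, 2**89 - 1)
ca_public, ca_private = make_keypair(2**107 - 1, 2**127 - 1)

# Alice's certificate request: her name plus her public key (encoding is made up).
request = f"CommonName=Alice,Organization=KNMI,e={alice_public[0]},n={alice_public[1]}".encode()

# After checking Alice's identity out of band, the CA signs the request.
certificate_signature = sign(request, ca_private)

if __name__ == "__main__":
    print("valid:", verify(request, certificate_signature, ca_public))
```

Anyone holding the CA's public key can check the signature, which is why a relying site only needs to trust the CA rather than know each user in advance.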
  29. 29. GSI in Action: “Create Processes at A and B that Communicate & Access Files at C” <ul><li>User: single sign-on via “grid-id” & generation of proxy cred., or retrieval of proxy cred. from online repository </li></ul><ul><li>User proxy (proxy credential): sends remote process creation requests* to the GSI-enabled GRAM servers on the computers at Site A (Kerberos) and Site B (Unix) </li></ul><ul><li>GRAM server at Site A: authorize, map to local id, create process, generate credentials (Kerberos ticket, restricted proxy) </li></ul><ul><li>GRAM server at Site B: ditto (restricted proxy) </li></ul><ul><li>Processes at A and B: communication* </li></ul><ul><li>Remote file access request* to the GSI-enabled FTP server on the storage system at Site C (Kerberos): authorize, map to local id, access file </li></ul>* With mutual authentication
  30. 30. Authorization <ul><li>Authorization poses the main scaling problem </li></ul><ul><li>Conflict between accountability and ease-of-use / ease-of-management </li></ul><ul><li>Getting rid of the “local user” concept eases support for large, dynamic VOs: </li></ul><ul><ul><li>Temporary account leasing: pool accounts à la DHCP </li></ul></ul><ul><ul><li>Grid ID-based file operations: slashgrid </li></ul></ul><ul><ul><li>Sandbox-ing applications </li></ul></ul><ul><ul><li>Direction of EU DataGrid and PPDG </li></ul></ul>
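Pool accounts “à la DHCP” can be pictured as a lease table: a grid identity borrows a local account for a limited time, and the account returns to the pool when the lease expires. The sketch below is an assumed illustration of that idea, not the mechanism used by any particular middleware. The class name, account names and lease time are hypothetical.

```python
import time


class AccountPool:
    """Lease local accounts to grid identities, much as DHCP leases addresses."""

    def __init__(self, accounts, lease_seconds=3600, clock=time.time):
        self._free = list(accounts)
        self._leases = {}  # grid identity -> (local account, expiry time)
        self._lease_seconds = lease_seconds
        self._clock = clock

    def _expire(self):
        """Return accounts whose lease has run out to the free list."""
        now = self._clock()
        for identity, (account, expiry) in list(self._leases.items()):
            if expiry <= now:
                del self._leases[identity]
                self._free.append(account)

    def lease(self, identity):
        """Map a grid identity to a local account, renewing an existing lease."""
        self._expire()
        if identity in self._leases:
            account, _ = self._leases[identity]
        elif self._free:
            account = self._free.pop(0)
        else:
            raise RuntimeError("no free pool accounts")
        self._leases[identity] = (account, self._clock() + self._lease_seconds)
        return account


if __name__ == "__main__":
    pool = AccountPool(["pool001", "pool002"])
    print(pool.lease("/O=dutchgrid/CN=Alice"))
```

The lease table is also the audit trail: accountability survives as long as the mapping from grid identity to account is logged for the lifetime of each lease.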
  31. 31. Looking for Resources <ul><li>Resource Brokerage based on matchmaking (Condor) </li></ul><ul><li>Information Services Mesh </li></ul><ul><ul><li>Meta-computing directory </li></ul></ul><ul><ul><li>Replica Catalogues </li></ul></ul><ul><ul><li>Hierarchies of GRISs and GIISs </li></ul></ul>
  32. 32. Locating a Replica <ul><li>Grid Data Mirror Package </li></ul><ul><li>Moves data across sites </li></ul><ul><li>Replicates both files and individual objects </li></ul><ul><li>Catalogue used by Broker </li></ul><ul><li>Replica Location Service (giggle) </li></ul><ul><li>Read-only copies “owned” by the Replica Manager. </li></ul>
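A replica catalogue boils down to a mapping from one logical file name to the physical copies held at different sites, which a broker can then rank. The sketch below assumes that simple model. The class, the host names and the cost function are hypothetical and do not reflect the interface of any real replica service.

```python
class ReplicaCatalogue:
    """Map a logical file name (LFN) to its physical copies (PFNs)."""

    def __init__(self):
        self._replicas = {}  # LFN -> set of PFNs

    def register(self, lfn, pfn):
        self._replicas.setdefault(lfn, set()).add(pfn)

    def locate(self, lfn):
        """All known physical copies, in a stable order."""
        return sorted(self._replicas.get(lfn, ()))


def pick_replica(catalogue, lfn, cost):
    """Choose the cheapest replica according to a caller-supplied cost function."""
    replicas = catalogue.locate(lfn)
    return min(replicas, key=cost) if replicas else None


if __name__ == "__main__":
    catalogue = ReplicaCatalogue()
    # Hypothetical storage hosts under reserved example domains.
    catalogue.register("lfn:run42.dat", "gsiftp://se.site-a.example/data/run42.dat")
    catalogue.register("lfn:run42.dat", "gsiftp://se.site-b.example/data/run42.dat")
    # Hypothetical cost: pretend site B is closer to the job.
    closer = lambda pfn: 0 if "site-b" in pfn else 1
    print(pick_replica(catalogue, "lfn:run42.dat", closer))
```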
  33. 33. Mass Data Transport <ul><li>Need for efficient, high-speed protocol: GridFTP </li></ul><ul><li>All storage elements share a common interface: disk caches, tape robots, … </li></ul><ul><li>Also supports GSI & single sign-on </li></ul><ul><li>Optimized for high-speed networks (>1 Gbit/s) </li></ul><ul><li>Data source striping through parallel streams </li></ul><ul><li>Ongoing work on “better TCP” </li></ul>
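The parallel-stream idea can be illustrated without any network code: split a file into contiguous byte ranges, move each range independently, and write it at its own offset in the target. The sketch below does this in memory with threads. It is an assumed illustration of the principle only, it does not speak GridFTP, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, streams):
    """Split [0, size) into at most `streams` contiguous (offset, length) ranges."""
    base, extra = divmod(size, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
            offset += length
    return ranges


def parallel_copy(source, streams=4):
    """Copy `source` (bytes) using one worker per byte range."""
    target = bytearray(len(source))

    def transfer(byte_range):
        offset, length = byte_range
        # Each "stream" writes only its own region, so no ordering is required.
        target[offset:offset + length] = source[offset:offset + length]

    with ThreadPoolExecutor(max_workers=streams) as pool:
        list(pool.map(transfer, split_ranges(len(source), streams)))
    return bytes(target)


if __name__ == "__main__":
    data = bytes(range(256)) * 4
    print(parallel_copy(data, streams=4) == data)
```

Because every range carries its own offset, the receiver can accept ranges in any order, which is what lets several TCP streams share one transfer.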
  34. 34. Grid Data Bases ?! <ul><li>Database Access and Integration (DAI)-WG </li></ul><ul><ul><li>OGSA-DAI integration project </li></ul></ul><ul><ul><li>Data Virtualisation Services </li></ul></ul><ul><ul><li>Standard Data Source Services </li></ul></ul><ul><ul><li>Early Emerging Standards: </li></ul></ul><ul><ul><li>Grid Data Service specification (GDS) </li></ul></ul><ul><ul><li>Grid Data Service Factory (GDSF) </li></ul></ul><ul><ul><li>Largely spin-off from the UK e-Science effort & DataGrid </li></ul></ul>
  35. 35. Grid Access to Databases <ul><li>SpitFire (standard data source services): uniform access to persistent storage on the Grid </li></ul><ul><li>Multiple roles support </li></ul><ul><li>Compatible with GSI (single sign-on) through CoG </li></ul><ul><li>Uses standard components: JDBC, SOAP, XML </li></ul><ul><li>Supports various back-end databases </li></ul>
  36. 36. Spitfire security model <ul><li>Standard access to DBs </li></ul><ul><li>GSI SOAP protocol </li></ul><ul><li>Strong authentication </li></ul><ul><li>Supports single sign-on </li></ul><ul><li>Local role repository </li></ul><ul><li>Connection pool to multiple backend DBs </li></ul><ul><li>Version 1.0 out, WebServices version in alpha </li></ul>
  37. 37. A Bright Future?
  38. 38. OGSA: new directions <ul><li>Open Grid Services Architecture … … cleaning up the protocol mess </li></ul><ul><li>Concept from the `web services’ world </li></ul><ul><li>Based on common standards: </li></ul><ul><ul><li>SOAP, WSDL, UDDI </li></ul></ul><ul><ul><li>Running over “upgraded” Grid Security Infra (GSI) </li></ul></ul><ul><li>Adds Transient Services: </li></ul><ul><ul><li>State of distributed activities </li></ul></ul><ul><ul><li>Workflow, multi-media, distributed data analysis </li></ul></ul>
  39. 39. OGSA Roadmap <ul><li>Introduced at GGF4 (Toronto, March 2002) </li></ul><ul><li>New services already web-services based (Spitfire 2, etc.) </li></ul><ul><li>Alpha version of Globus Toolkit v3: expected December 2002 </li></ul><ul><li>Huge industrial commitment </li></ul>
  40. 40. EU DataGrid <ul><li>Middleware research project (2001-2003) </li></ul><ul><li>Driving applications: </li></ul><ul><ul><li>HE Physics </li></ul></ul><ul><ul><li>Earth Observation </li></ul></ul><ul><ul><li>Biomedicine </li></ul></ul><ul><li>Operational testbed </li></ul><ul><ul><li>21 sites </li></ul></ul><ul><ul><li>6 VOs </li></ul></ul><ul><ul><li>~ 200 users, growing by ~100/month! </li></ul></ul>
  41. 41. EU DataGrid Test Bed 1 <ul><li>DataGrid TB1: </li></ul><ul><ul><li>14 countries </li></ul></ul><ul><ul><li>21 major sites </li></ul></ul><ul><ul><li>CrossGrid: 40 more sites </li></ul></ul><ul><ul><li>Growing rapidly… </li></ul></ul><ul><li>Submitting Jobs: </li></ul><ul><ul><li>Log in only once, run everywhere </li></ul></ul><ul><ul><li>Cross administrative boundaries in a secure and trusted way </li></ul></ul><ul><ul><li>Mutual authorization </li></ul></ul>
  42. 42. DutchGrid Platform [Map: Amsterdam, Utrecht, KNMI, Delft, Nijmegen, Enschede, Leiden, ASTRON] <ul><li>DutchGrid: </li></ul><ul><ul><li>Test bed coordination </li></ul></ul><ul><ul><li>PKI security </li></ul></ul><ul><ul><li>Support </li></ul></ul><ul><li>Participation by </li></ul><ul><ul><li>NIKHEF, KNMI, SARA </li></ul></ul><ul><ul><li>DAS-2 (ASCI): TUDelft, Leiden, VU, UvA, Utrecht </li></ul></ul><ul><ul><li>Telematics Institute </li></ul></ul><ul><ul><li>FOM, NWO/NCF </li></ul></ul><ul><ul><li>Min. EZ, ICES/KIS </li></ul></ul><ul><ul><li>IBM, KPN, … </li></ul></ul>
  43. 43. A Bright Future! You could plug your computer into the wall and have direct access to huge computing resources almost immediately (with a little help from toolkits and portals) … It may still be science – although not fiction – but we are about to make this into reality!