Computing Outside The Box September 2009

Keynote talk at Parco 2009 in Lyon, France. An updated version of http://www.slideshare.net/ianfoster/computing-outside-the-box-june-2009.

Slide notes

  • I am here because “cloud” is hot and the organizers thought it would be interesting to hear some perspectives on this new technology and what it means for supercomputing. In my mind, cloud is the latest phase in a long transformation of computing from a box to a service. This is what a colleague meant when he said to me recently:
  • Since at least the time of Newton, people have been working to avoid computational tasks. One early approach was to employ smart people to do this work for you, as shown here at the Harvard Observatory in 1890.
  • Then they worked out how to automate the more mundane computations. This is the AVIDAC, installed in 1953 at Argonne National Laboratory. These machines had high operational and maintenance costs and required considerable expertise. To use them, you travelled to the site where they were installed.
  • Early on, people realized that it didn’t make sense for people to travel to computers—that we should be able to compute outside the box. For example, AI pioneer John McCarthy spoke in these terms in 1961, at the launch of Project MAC (?). Here he is a couple of years ago, just as such an industry is finally emerging. It takes a while.
  • Why the interest in remote computing? Some of the reasons are not that different from those for electric power. Elasticity: the ability to acquire as much computing as needed, on demand. Better performance (e.g., reliability) than a home-grown solution. Reduced cost relative to the substantial capital expense of building your own facility. All of this suggests a need for on-demand computing. However, while a time-sharing business (IBM and others) did emerge, it never became a huge industry. The reasons: limited networking, computers, and applications.
  • Why now? To my mind, the shift is driven by technology: at some point, networks became fast enough to permit on-demand access to remote computing.
  • The core of computing outside the box is a separation of concerns between producer and consumer of, e.g., computing or storage. Hence a lot of work on modeling activities, activity initiation and management, representation of agreements, and authentication/authorization policies, all ultimately in a Web services framework.
  • We define Grid architecture in terms of a layered collection of protocols. The fabric layer includes the protocols and interfaces that provide access to the resources being shared, including computers, storage systems, datasets, programs, and networks. This layer is a logical view rather than a physical view: for example, the view of a cluster with a local resource manager is defined by that resource manager, not by the cluster hardware, and the fabric provided by a storage system is defined by the file system available on that system, not by the raw disks or tapes. The connectivity layer defines the core protocols required for Grid-specific network transactions; it includes the IP protocol stack (system-level application protocols such as DNS, RSVP, and routing, plus the transport and internet layers) as well as core Grid security protocols for authentication and authorization. The resource layer defines protocols to initiate and control sharing of (local) resources; services defined at this level include the gatekeeper and GRIS, along with some user-oriented application protocols from the Internet protocol suite, such as file transfer. The collective layer defines protocols that provide system-oriented capabilities expected to be wide-scale in deployment and generic in function, such as GIIS, bandwidth brokers, and resource brokers. The application layer defines protocols and services that are parochial in nature, targeted at a specific application domain or class of applications. (A minimal sketch of these layers as interfaces appears below.)
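The layering is easiest to see as a stack of narrow interfaces, each depending only on the layer below. The following is a purely illustrative Python sketch under my own naming (these classes and methods are not Globus APIs); it is only meant to show the separation of concerns the layers encode:

```python
from abc import ABC, abstractmethod

# Hypothetical interfaces, one per layer of the architecture described above.

class FabricResource(ABC):
    """Fabric layer: local access to, and control of, one shared resource."""
    @abstractmethod
    def start(self, activity: dict) -> str: ...
    @abstractmethod
    def status(self, activity_id: str) -> str: ...

class Connectivity(ABC):
    """Connectivity layer: communication plus authentication/authorization."""
    @abstractmethod
    def authenticate(self, credential: bytes) -> bool: ...

class ResourceProtocol(ABC):
    """Resource layer: negotiate access to, and control use of, a single resource."""
    @abstractmethod
    def submit(self, resource: FabricResource, activity: dict) -> str: ...

class Collective(ABC):
    """Collective layer: coordinate many resources (directories, brokers)."""
    @abstractmethod
    def broker(self, requirements: dict) -> FabricResource: ...

# Applications sit on top, composing Collective and Resource operations.
```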
  • Having developed those methods, it is natural to use them to organize the infrastructure that we use to provision services—to provide scalability, resilience, and performance. Thus service-oriented infrastructure (SOI), which provides a basis for interoperable infrastructures. There has been some success.
  • Substantial “infrastructure as a service” deployments exist, e.g., TeraGrid, OSG, EGEE. Services include: authentication; attribute management and authorization; monitoring; task dispatch; workflow execution.
  • That’s SOI. Meanwhile, high-performance networks have also been motivating the development of innovative applications…
  • Those same SOA methods can be used to construct applications. Thus service-oriented applications.
  • Data integration here means the integration of different data types: tissue, (pre)clinical, genomics, proteomics, …
  • Built using the same mechanisms used to build SOI: PKI, delegation, and attribute-based authorization; registries and monitoring. Operating a service is a pain! It would be nice to outsource it. But these services need to be near the data, which also raises privacy concerns, so things become complicated.
  • Workflows are becoming a widespread mechanism for coordinating the execution of scientific services and linking scientific resources: analytical and data-processing pipelines. Is this stuff real? EBI saw 3 million+ web service API submissions in 2007. Is that a lot? We want to publish workflows as services: think of caBIG services as service providers that then invoke grid services to do the execution (e.g., via TeraGrid gateways).
  • Overall status: decent grid service providers, offering basic security, registry, compute, and data services, but with mixed reliability and usability (this is difficult!); and powerful applications built with the same methods, but not using that infrastructure, because its services are not advanced enough, and struggling with costs.
  • A substantial energy barrier exists for both service providers and service users (obstacles: approval processes, accounts, policies). Few users → the infrastructure remains hard to use and not robust.
  • A substantial energy barrier exists for both service providers and service users (obstacles: approval processes, accounts, policies). Few users → the infrastructure remains hard to use and not robust. How to fix? Apply $$. I’m told Euros also work.
  • A substantial energy barrier exists for both service providers and service users (obstacles: approval processes, accounts, policies). Few users → the infrastructure remains hard to use and not robust.
  • Fortunately, two factors, services and virtualization, enable deployments in industry with strong positive returns to scale for both users and providers.
  • Simple Queue Service: … Simple Storage Service: … Elastic Compute Cloud: … SimpleDB: simplified relational DB. CloudFront: content distribution network for S3 data. [LIGO data solution!?] (A brief illustration of using two of these services appears below.)
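For concreteness, here is a minimal sketch of the producer/consumer pattern that S3 and SQS support, written with the present-day boto3 Python SDK (which postdates this talk); the bucket and queue names are made up, the bucket is assumed to exist, and credentials are assumed to be configured in the environment:

```python
import boto3

# Assumed: AWS credentials configured; bucket/queue names below are hypothetical.
s3 = boto3.client("s3")
sqs = boto3.client("sqs")

# Producer: store an input object in S3 and enqueue a task that references it.
s3.put_object(Bucket="cotb-demo-bucket",
              Key="inputs/ligand-000001.mol2",
              Body=b"...molecule description...")
queue_url = sqs.create_queue(QueueName="cotb-demo-tasks")["QueueUrl"]
sqs.send_message(QueueUrl=queue_url, MessageBody="inputs/ligand-000001.mol2")

# Consumer (e.g., on an EC2 instance): pull a task, fetch its input, do the work.
response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
for message in response.get("Messages", []):
    obj = s3.get_object(Bucket="cotb-demo-bucket", Key=message["Body"])
    data = obj["Body"].read()   # ... run the computation on `data` here ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
```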
  • Many interesting questions. What is the right mix of services at the platform level? How can we leverage such offerings to build innovative applications? There are legal and business-model issues. What will the system look like? “Five data centers” (Papadopoulos). RAID: Redundant Array of Inexpensive Data Centers.
  • Many interesting questions. What is the right mix of services at the platform level? How can we leverage such offerings to build innovative applications? There are legal and business-model issues. What will the system look like? “Five data centers” (Papadopoulos). RAID: Redundant Array of Inexpensive Data Centers.
  • Many interesting questions. What is the right mix of services at the platform level? How do we build services that meet scalability, performance, and reliability needs? How can we leverage such offerings to build innovative applications? There are legal and business-model issues.
  • These systems make use of distributed computing research, applied to meet very specific requirements. One can also point to Google—GFS, MapReduce, etc.—very sophisticated, very scalable, very specialized.
  • Individuals build increasingly sophisticated applications, often using SOA principles. Communities (in eResearch at least) build SOAs, to some extent composing elements from multiple providers (but not to the extent we expect), or they mash things up (a different form of composition). eResearch builds SOIs, somewhat open. Companies build internal SOIs—often sophisticated (e.g., Amazon, Google), but proprietary and closed. Good if a particular provider meets your specific requirements; not so good if it does not—or if you want to avoid lock-in.
  • Another example, also illustrating service composition.
  • GridFTP = high-performance data movement, multiple protocols, credential delegation, restart. RLS = P2P system, soft state, Bloom filters (see the sketch below). BUT: the services themselves are operated by the LIGO community, and running persistent, reliable, scalable services is expensive and difficult.
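As an aside, the Bloom filters mentioned above are what let RLS index nodes summarize a site's replica catalog compactly. A minimal, generic Python sketch of the data structure (not the RLS implementation; the file name below is made up) shows why it is attractive: a fixed-size bit array, cheap membership tests, occasional false positives, never false negatives.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: compact set membership with false positives only."""
    def __init__(self, num_bits: int = 1 << 20, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item: str):
        # Derive several bit positions from salted SHA-256 digests of the item.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# Usage: an index node can test a site's filter before querying its catalog.
catalog_summary = BloomFilter()
catalog_summary.add("H-H1_RDS-frame-0001.gwf")   # hypothetical logical file name
assert catalog_summary.might_contain("H-H1_RDS-frame-0001.gwf")
```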
  • A slide from 2003 or so …
  • The Handle system provides name resolution and metadata. We want local servers for performance, but for persistence and reliability … hence a hybrid solution.
  • That’s one perspective on clouds: an ecosystem of providers of infrastructure, platform, and software-as-a-service capabilities. As this is a parallel computing conference, I’d like to focus on the question of whether clouds are any good for supercomputing. The conventional wisdom, I think, is that there are two sorts of applications and two sorts of computers, and that each sort of application is made for one sort of computer.
  • Performance studies might seem to confirm some part of the conventional wisdom. Running 8 of the NAS parallel benchmarks on multiprocessor nodes, Amazon is only slightly slower when running on single nodes with OpenMP.
  • But MUCH slower when running across 32 nodes using MPI.
  • The reason is the poor bandwidth.
  • And latency. We may conclude that EC2 is unusable for science. But what if we consider end-to-end latency, including the time a job spends waiting in a batch queue? We can use QBETS (Queue Bounds Estimation from Time Series), a tool developed by Rich Wolski and colleagues, to do that.
  • For example, the MG application runs for 3 seconds on Abe and 8 seconds on EC2. What if we ask QBETS to estimate the chance that a job will start on Abe within 100 seconds … (the simple comparison below makes the trade-off concrete).
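A back-of-the-envelope comparison using only the numbers quoted above; the 100-second figure is just the example queue bound from the note, and the probability QBETS would assign to meeting it is left out:

```python
# Hypothetical end-to-end comparison for the MG benchmark, using the figures
# quoted in the note above.
run_abe_s = 3           # MG runtime on Abe (supercomputer node)
run_ec2_s = 8           # MG runtime on EC2
queue_bound_s = 100     # example bound: "will the job start on Abe within 100 s?"

ec2_total_s = run_ec2_s                       # EC2 capacity is available on demand
abe_worst_case_s = queue_bound_s + run_abe_s  # if the job waits the full bound

print(f"EC2 end-to-end: {ec2_total_s} s")
print(f"Abe end-to-end (queued up to the bound): {abe_worst_case_s} s")
# The raw-speed advantage of the supercomputer can disappear once queue wait
# dominates, which is why QBETS-style predictions belong in the comparison.
```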
  • "docking" is the identification of the low-energy binding modes of a small molecule (ligands) within the active site of a macromolecule (receptor) whose structure is known A compound that interacts strongly with (i.e. binds) a receptor associated with a disease may inhibit its function and thus act as a drug Typical Workload: Application Size: 7MB (static binary) Static input data: 35MB (binary and ASCII text) Dynamic input data:10KB (ASCII text) Output data: 10KB (ASCII text) Expected execution time: 5~5000 seconds Parameter space: 1 billion tasks
  • More precisely, step 3 is “GCMC + hydration.” Mike Kubal says: “This task is a Free Energy Perturbation computation using the Grand Canonical Monte Carlo algorithm for modeling the transition of the ligand (compound) between different potential states and the General Solvent Boundary Partition to explicitly model the water molecules in the volume around the ligand and pocket of the protein. The result is a binding energy just like the task at the top of the funnel; it is just a more rigorous attempt to model the actual interaction of protein and compound. To refer to the task in short hand, you can use "GCMC + hydration". This is a method that Benoit has pioneered.”
  • Application efficiency was computed between the 16-rack and 32-rack runs. Sustained utilization is the utilization achieved during the part of the experiment when there was enough work to do, from 0 to 5300 sec. Overall utilization is the number of CPU hours used divided by the total number of CPU hours allocated. The experiment included caching the 36 MB (52 MB uncompressed) archive on each node at its first task. We use “dd” to move data to and from GPFS: the application itself had some bad I/O patterns in its writes, which prevented it from scaling well, so we write to RAM and then dd back to GPFS (a generic sketch of this pattern appears below). For this particular run, we had 464 Falkon services running on 464 I/O nodes, 118K workers (256 per Falkon service), and 1 client on a login node. The 32-rack job took 15 minutes to start. It took the client 6 minutes to establish a connection and set up the corresponding state with all 464 Falkon services, and 40 seconds to dispatch 118K tasks to 118K CPUs. The rest can be seen from the graph and slide text…
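The write-to-RAM-then-copy trick can be sketched generically; the paths below are hypothetical, and the actual runs shelled out to dd on the BG/P I/O nodes rather than using Python:

```python
import os
import shutil

# Hypothetical paths: node-local RAM-backed storage and a GPFS target directory.
ram_path = "/dev/shm/dock_result_000001.out"
gpfs_path = "/gpfs/home/project/results/dock_result_000001.out"

# 1. The application writes its poorly patterned output to RAM-backed storage,
#    so the many small writes never touch the shared file system.
with open(ram_path, "w") as f:
    f.write("...docking scores...\n")

# 2. A single sequential copy moves the finished file to GPFS
#    (the actual runs used: dd if=/dev/shm/... of=/gpfs/... bs=...).
shutil.copyfile(ram_path, gpfs_path)
os.remove(ram_path)
```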
  • Because we are still mostly computing inside the box
  • Why now? The law of unexpected consequences—like the Web: not just Tim Berners-Lee’s genius, but also disk drive capacity. What will happen when ubiquitous high-speed wireless means we can all reach any service anytime—and powerful tools mean we can author our own services? A fascinating set of challenges: What sort of services? Applications? What does openness mean in this context? How do we address interoperability, portability, composition? Accounting, security, audit?
  • Greg Papadopoulos (Sun): the world will have five data centers. An interesting image from the Economist, reminding us that nothing is ever simple. RAID: Redundant Array of Inexpensive Data Centers. Will something similar happen for computing? We need standards!

Transcript

  • 1. Ian Foster Computation Institute Argonne National Lab & University of Chicago
  • 2. Abstract
    • The past decade has seen increasingly ambitious and successful methods for outsourcing computing. Approaches such as utility computing, on-demand computing, grid computing, software as a service, and cloud computing all seek to free computer applications from the limiting confines of a single computer. Software that thus runs "outside the box" can be more powerful (think Google, TeraGrid), dynamic (think Animoto, caBIG), and collaborative (think FaceBook, myExperiment). It can also be cheaper, due to economies of scale in hardware and software. The combination of new functionality and new economics inspires new applications, reduces barriers to entry for application providers, and in general disrupts the computing ecosystem. I discuss the new applications that outside-the-box computing enables, in both business and science, and the hardware and software architectures that make these new applications possible.
  • 3.
    • “I’ve been doing cloud computing since before it was called grid.”
  • 4. 1890
  • 5. 1953
  • 6. “Computation may someday be organized as a public utility … The computing utility could become the basis for a new and important industry.” John McCarthy (1961)
  • 7.  
  • 8. [Chart: connectivity (log scale) vs. time, with science crossing into the “Grid” era.] “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder, 2001)
  • 9. Application Infrastructure
  • 10. Layered grid architecture (“The Anatomy of the Grid,” 2001), shown alongside the Internet protocol architecture (Link, Internet, Transport, Application):
    • Fabric (“Controlling things locally”): access to, and control of, resources
    • Connectivity (“Talking to things”): communication (Internet protocols) and security
    • Resource (“Sharing single resources”): negotiating access, controlling use
    • Collective (“Managing multiple resources”): ubiquitous infrastructure services
    • User/Application (“Specialized services”): user- or application-specific distributed services
  • 11. Application Infrastructure Service oriented infrastructure
  • 12.  
  • 13. www.opensciencegrid.org
  • 14. www.opensciencegrid.org
  • 15. Application Infrastructure Service oriented infrastructure
  • 16. Application Service oriented applications Infrastructure Service oriented infrastructure
  • 17.  
  • 18. As of Oct 19, 2008: 122 participants, 105 services (70 data, 35 analytical)
  • 19. Microarray clustering using Taverna
    • Query and retrieve microarray data from a caArray data service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/CaArrayScrub
    • Normalize microarray data using GenePattern analytical service node255.broad.mit.edu:6060/wsrf/services/cagrid/PreprocessDatasetMAGEService
    • Hierarchical clustering using geWorkbench analytical service: cagridnode.c2b2.columbia.edu:8080/wsrf/services/cagrid/HierarchicalClusteringMage
    (Legend: workflow in/output, caGrid services, “shim” services, others. Credit: Wei Tan)
  • 20. Infrastructure Applications
  • 21. [Chart: energy barrier vs. progress of adoption]
  • 22. [Chart: energy barrier vs. progress of adoption, with $$ applied to lower the barrier]
  • 23. [Chart: energy barrier vs. progress of adoption, with $$ applied to lower the barrier]
  • 24. [Chart: connectivity (log scale) vs. time, with science (“Grid”) and enterprise (“Cloud”) eras.] “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder, 2001)
  • 25.  
  • 26.  
  • 27. US$3
  • 28. Credit: Werner Vogels
  • 29. Credit: Werner Vogels
  • 30. [Chart: Animoto EC2 image usage from Day 1 to Day 8, on an axis from 0 to 4000 instances]
  • 31. Software / Platform / Infrastructure. Software: Salesforce.com, Google, Animoto, …, caBIG, TeraGrid gateways
  • 32. Software / Platform / Infrastructure. Infrastructure: Amazon, GoGrid, Sun, Microsoft, … Software: Salesforce.com, Google, Animoto, …, caBIG, TeraGrid gateways
  • 33. Software / Platform / Infrastructure. Infrastructure: Amazon, GoGrid, Microsoft, Flexiscale, … Platform: Google, Microsoft, Amazon, … Software: Salesforce.com, Google, Animoto, …, caBIG, TeraGrid gateways
  • 34.  
  • 35. Dynamo: Amazon’s highly available key-value store (DeCandia et al., SOSP’07)
    • Simple query model
    • Weak consistency, no isolation
    • Stringent SLAs (e.g., 300ms for 99.9% of requests; peak 500 requests/sec)
    • Incremental scalability
    • Symmetry
    • Decentralization
    • Heterogeneity
  • 36. Technologies used in Dynamo (problem, technique, advantage; a sketch of consistent hashing appears below):
    • Partitioning: consistent hashing, giving incremental scalability
    • High availability for writes: vector clocks with reconciliation during reads, so version size is decoupled from update rates
    • Handling temporary failures: sloppy quorum and hinted handoff, providing high availability and a durability guarantee when some replicas are not available
    • Recovering from permanent failures: anti-entropy using Merkle trees, which synchronizes divergent replicas in the background
    • Membership and failure detection: a gossip-based membership protocol and failure detection, preserving symmetry and avoiding a centralized registry for membership and node liveness information
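A minimal Python sketch of consistent hashing, the partitioning technique in the table above (a generic textbook version, not Amazon's implementation; node names and key are made up):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent hashing with virtual nodes."""
    def __init__(self, nodes, vnodes_per_node=64):
        points = []
        for node in nodes:
            for v in range(vnodes_per_node):
                points.append((self._hash(f"{node}#{v}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # Walk clockwise from the key's hash to the next virtual node.
        i = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[i]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("customer:12345"))
# Adding a fourth node later remaps only the keys on its new ring segments,
# which is the "incremental scalability" advantage listed above.
```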
  • 37. Using IaaS for elastic capacity: Nimbus extends a local cluster of STAR worker nodes with STAR nodes on Amazon EC2 (Kate Keahey et al.)
  • 38. Application Service oriented applications Infrastructure Service oriented infrastructure
  • 39. The Globus-based LIGO data grid (LIGO Gravitational Wave Observatory): replicating >1 terabyte/day to 8 sites, including Birmingham, Cardiff, and AEI/Golm; >100 million replicas so far; MTBF = 1 month
  • 40.
    • Pull “missing” files to a storage system
    [Diagram: a Data Replication Service built from a Reliable File Transfer Service, GridFTP servers, Local Replica Catalogs, and Replica Location Indexes, layered as Data Movement, Data Location, and Data Replication.] “Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005. (The pull logic is sketched below.)
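A minimal sketch of the “pull missing files to a storage system” pattern that the Data Replication Service implements; the function names and paths are hypothetical stand-ins for the RLS lookup and GridFTP/RFT transfer calls:

```python
from typing import Iterable

def pull_missing_files(required: Iterable[str],
                       local_catalog: set,
                       locate_replica,   # e.g., a Replica Location Index query
                       transfer):        # e.g., a GridFTP/RFT transfer request
    """Fetch every required logical file not already registered locally."""
    for lfn in required:
        if lfn in local_catalog:
            continue                          # already present at this site
        source_url = locate_replica(lfn)      # find a remote physical copy
        transfer(source_url, f"/data/{lfn}")  # reliable third-party transfer
        local_catalog.add(lfn)                # register the new local replica

# Usage with stand-in callables (the real system calls RLS and RFT/GridFTP):
catalog = {"H-frame-0001.gwf"}
pull_missing_files(["H-frame-0001.gwf", "H-frame-0002.gwf"],
                   catalog,
                   locate_replica=lambda lfn: f"gsiftp://remote.site/{lfn}",
                   transfer=lambda src, dst: print(f"copy {src} -> {dst}"))
```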
  • 41. Specializing further: the user asks a service provider to “provide access to data D at S1, S2, S3 with performance P”; the service provider asks a resource provider to “provide storage with performance P1, network with P2, …” and composes mechanisms such as replica catalogs and user-level multicast to deliver D at S1, S2, and S3.
  • 42. Using IaaS in biomedical informatics. [Diagram: my servers in Chicago, handle.net, BIRN, and an IaaS provider.]
  • 43. Clouds and supercomputers: conventional wisdom? Loosely coupled applications: clouds/clusters ✔, supercomputers too expensive. Tightly coupled applications: clouds/clusters too slow, supercomputers ✔.
  • 44. Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;Login, October 2008.
  • 45. Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;Login, October 2008.
  • 46. Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;Login, October 2008.
  • 47. Ed Walker, Benchmarking Amazon EC2 for high-performance scientific computing, ;Login, October 2008.
  • 48. D. Nurmi, J. Brevik, R. Wolski: QBETS: queue bounds estimation from time series. SIGMETRICS 2007: 379-380
  • 49. D. Nurmi, J. Brevik, R. Wolski: QBETS: queue bounds estimation from time series. SIGMETRICS 2007: 379-380
  • 50.  
  • 51. Clouds and supercomputers: conventional wisdom? Loosely coupled applications: clouds/clusters ✔, supercomputers too expensive. Tightly coupled applications: clouds/clusters good for rapid response, supercomputers ✔.
  • 52. Loosely coupled problems
    • Ensemble runs to quantify climate model uncertainty
    • Identify potential drug targets by screening a database of ligand structures against target proteins
    • Study economic model sensitivity to parameters
    • Analyze turbulence dataset from many perspectives
    • Perform numerical optimization to determine optimal resource assignment in energy problems
    • Mine collection of data from advanced light sources
    • Construct databases of computed properties of chemical compounds
    • Analyze data from the Large Hadron Collider
    • Analyze log data from 100,000-node parallel computations
  • 53. Many many tasks: Identifying potential drug targets 2M+ ligands Protein x target(s) (Mike Kubal, Benoit Roux, and others)
  • 54. [Workflow diagram] For one target protein (a 1 MB PDB description defining the pocket to bind to) and 2M ZINC 3-D ligand structures (6 GB): DOCK6 and FRED each take a manually prepared receptor file plus the ligand structures and produce complexes (~4M tasks × 60 s × 1 CPU ≈ 60K CPU-hours). Select the best ~5K for Amber scoring, driven by a NAB script built from a template and parameters (defining flexible residues and number of MD steps): 1. AmberizeLigand, 2. AmberizeReceptor, 3. AmberizeComplex, 4. perl: generate NAB script, 5. RunNABScript (~10K × 20 min × 1 CPU ≈ 3K CPU-hours). Select the best ~500 for GCMC (~500 × 10 hr × 100 CPUs ≈ 500K CPU-hours). For 1 target: 4 million tasks, 500,000 CPU-hours (50 CPU-years). (The arithmetic is checked below.)
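A quick check of the CPU-hour figures on that slide (the slide rounds more aggressively than the arithmetic below):

```python
# Rough check of the per-target cost breakdown quoted on the slide above.
dock_hours  = 4_000_000 * 60 / 3600     # ~4M DOCK6/FRED tasks at ~60 s each
amber_hours = 10_000 * 20 / 60          # ~10K Amber tasks at ~20 min each
gcmc_hours  = 500 * 10 * 100            # ~500 GCMC runs, 10 h on 100 CPUs each

total = dock_hours + amber_hours + gcmc_hours
print(f"DOCK/FRED: ~{dock_hours:,.0f} CPU-h")   # ~66,667  (slide: ~60K)
print(f"Amber:     ~{amber_hours:,.0f} CPU-h")  # ~3,333   (slide: ~3K)
print(f"GCMC:      ~{gcmc_hours:,.0f} CPU-h")   # 500,000  (slide: ~500K)
print(f"Total:     ~{total:,.0f} CPU-h")        # ~570,000; the slide rounds to
                                                # 500,000 CPU-h (50 CPU-years)
```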
  • 55.  
  • 56. DOCK on BG/P: ~1M tasks on 118,000 CPUs
    • CPU cores: 118784
    • Tasks: 934803
    • Elapsed time: 7257 sec
    • Compute time: 21.43 CPU years
    • Average task time: 667 sec
    • Relative Efficiency: 99.7%
    • (from 16 to 32 racks)
    • Utilization:
      • Sustained: 99.6%
      • Overall: 78.3%
    • GPFS
      • 1 script (~5KB)
      • 2 file read (~10KB)
      • 1 file write (~10KB)
    • RAM (cached from GPFS on first task per node)
      • 1 binary (~7MB)
      • Static input data (~45MB)
    (Plot of the run over time in seconds. Credit: Ioan Raicu, Zhao Zhang, Mike Wilde)
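The overall-utilization figure can be reproduced from the numbers above (taking 1 CPU-year ≈ 8,760 CPU-hours; small rounding differences are expected):

```python
# Reproduce the utilization reported above for the 118,784-core BG/P run.
cores = 118_784
elapsed_s = 7_257
compute_cpu_years = 21.43

allocated_cpu_h = cores * elapsed_s / 3600
used_cpu_h = compute_cpu_years * 8_760          # hours in a (non-leap) year

print(f"Allocated: {allocated_cpu_h:,.0f} CPU-h")   # ~239,449 CPU-h
print(f"Used:      {used_cpu_h:,.0f} CPU-h")        # ~187,727 CPU-h
print(f"Overall utilization: {used_cpu_h / allocated_cpu_h:.1%}")  # ~78%
```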
  • 57. Managing 160,000 cores with Falkon: slower shared storage plus high-speed local “disk”
  • 58. Scaling POSIX to petascale: a large dataset on the global file system is staged over the torus and tree interconnects to a CN-striped intermediate file system (Chirp multicast, MosaStore striping) and then to local file systems on compute nodes (local datasets)
  • 59. Efficiency for 4 second tasks and varying data size (1KB to 1MB) for CIO and GPFS up to 32K processors
  • 60. “Sine” workload, 2M tasks, 10MB:10ms ratio, 100 nodes, GCC policy, 50GB caches/node (Ioan Raicu)
  • 61. “Sine” workload, 2M tasks, 10MB:10ms ratio, 100 nodes, GCC policy, 50GB caches/node (Ioan Raicu)
  • 62. Same scenario, but with dynamic resource provisioning
  • 63. Same scenario, but with dynamic resource provisioning
  • 64. Data diffusion sine-wave workload: Summary
    • GPFS: 5.70 hrs, ~8 Gb/s, 1138 CPU-hrs
    • DD+SRP: 1.80 hrs, ~25 Gb/s, 361 CPU-hrs
    • DD+DRP: 1.86 hrs, ~24 Gb/s, 253 CPU-hrs
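The relative gains implied by those three lines (DD = data diffusion, SRP/DRP = static/dynamic resource provisioning), computed directly:

```python
# Speedup and CPU-hour savings of data diffusion vs. plain GPFS, taken from the
# sine-wave workload summary above: (elapsed hours, CPU-hours consumed).
runs = {"GPFS": (5.70, 1138), "DD+SRP": (1.80, 361), "DD+DRP": (1.86, 253)}
base_time, base_cpu = runs["GPFS"]
for name, (hours, cpu_hours) in runs.items():
    print(f"{name:7s} speedup x{base_time / hours:4.1f}, "
          f"CPU-hours {cpu_hours} ({cpu_hours / base_cpu:.0%} of GPFS)")
# DD+SRP is ~3.2x faster using ~32% of the CPU-hours; DD+DRP is ~3.1x faster
# using ~22% of the CPU-hours.
```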
  • 65. Clouds and supercomputers: conventional wisdom? Loosely coupled applications: clouds/clusters ✔, supercomputers excellent. Tightly coupled applications: clouds/clusters good for rapid response, supercomputers ✔.
  • 66. “The computer revolution hasn’t happened yet.” Alan Kay, 1997
  • 67. [Chart: connectivity (log scale) vs. time, with science (“Grid”), enterprise (“Cloud”), and consumer (“????”) eras.] “When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder, 2001)
  • 68. Energy Internet: “The Shape of Grids to Come?”
  • 69. Thank you! Computation Institute www.ci.uchicago.edu