Cloud Computing for Developers and Architects - QCon 2008 Tutorial


Stuart Charlton's tutorial on Cloud Computing at QCon SF 2008.

Published in: Technology, Business

Cloud Computing for Developers and Architects - QCon 2008 Tutorial

  1. Cloud Computing for Developers & Architects. Stuart Charlton, Chief Software Architect, Elastra. San Francisco 2008
  2. Tutorial Objectives: (1) Provide an overview of the emerging cloud industry, the jargon, the trends, and a model to help sort through the mess. (2) Dig into a couple of specific examples of how to provision and operate a cloud environment, conveying practical insight. (3) Explore cloud computing architectures, looking at whether they change traditional system architectures.
  3. About Your Presenter. Stuart Charlton • Canadian, now in San Francisco. Chief Architect, Elastra • Responsible for technical direction & long-term product strategy. In prior lives... • BEA Systems, Rogers Communications, Financial Services, global training & consulting. Stu Says Stuff
  4. Agenda - Part 1. A Look at the Clouds • (Good Luck) Defining Cloud Computing • Qualities of a Cloud • The Cloud Computing Industry - Late 2008 • A Cloud Reference Model. Amazon Web Services Tutorial • Simple Storage Service (S3) • Elastic Compute Cloud (EC2) • Elastic Block Storage (EBS) • Covering APIs, Tools, and Experiences
  5. Agenda - Part 2. Managing & Operating Cloud Systems • Whither IT Service Management? • The Hope for Cloud Standards • The Puppet Administrative System • A Preview of Elastra Cloud Services. Cloud Architecture • Common Patterns • Integrating applications, networks, and data • Scalability and Monitoring. Q&A and Open Discussion
  6. Caveats. The technology is a (very) moving target • Expect this to increase as the industry tries to drive a new round of retooling & spending • Lots to cover; we’ll try to scratch a reasonable amount of surface. Much cloud technology is quite proprietary • Too early to dive into committee-land • Even if it’s open source, only one distribution may eventually be problematic. The “definition game” is only fun for so long • Fondly recall the crisp and concise industry definitions such as SOA, OO, Components, etc...
  7. A Look at the Clouds
  9. (Good Luck) Defining Cloud Computing. Software-as-a-Service • “My customer relationship management (CRM) system is out on the Internet!” Grids vs. Clouds • Shared Virtual Resources • Batch Jobs vs. Online Applications • Different Approaches to State Management. Network Diagrams • A service is “on a cloud somewhere”. Virtualization Platforms & APIs • Hardware can be manipulated with software
  10. Qualities of a Cloud. On-Demand • Reduced need for call-ahead forecasts • Demand trends are predicted by the provider. Usage-metered (i.e. an operating expense) • Pay-by-the-drink or over time, not up front. Self-service • Resources directly/indirectly reserved with a GUI or API. Elastic Scalability • Grow or shrink resources as required. Mandatory Network • The network is essential to consume the service
  11. A Subset of the Cloud Landscape (figure). Groups shown: Software Vendors, Mid-Size Providers, Large Providers
  12. The Cloud Provider Continuum (figure). “Retail Ecosystem”: closer to the Developer/User, Platform-as-a-Service. “Supplier Ecosystem”: closer to the SysAdmin/Ops, Infrastructure-as-a-Service
  13. A Cloud Technology Reference Model. Begin with the Basic Data Center. Diagram layers: Facilities & Logistics; Software & Hardware Infrastructure; Testing, Monitoring, Diagnostics, and Verification
  14. A Cloud Technology Reference Model. Add easy software access to: Elements - HW/SW/Network/Storage Settings, Installations, and Configurations. Resources - Reservations from a pool of excess capacity in storage, computing, and network. Diagram layers: Element Management; Resource Management; Testing, Monitoring, Diagnostics, and Verification; Facilities & Logistics; Software & Hardware Infrastructure
  15. A Cloud Technology Reference Model. Add some visibility: A Web of Metadata (What uses or contains what other things?). Lifecycle (When and how can things change?). Diagram layers: Lifecycle (Birth, Growth, Failure, Recovery, Death); Web of Metadata (Categories, Capabilities, Configurations & Dependencies); Testing, Monitoring, Diagnostics, and Verification; Element Management; Resource Management; Facilities & Logistics; Software & Hardware Infrastructure
  16. A Cloud Technology Reference Model. Add some real-world context: Governance (Who has authority / responsibility to change, and how?). Architecture Views (How are my concerns addressed?). Diagram layers: Governance; Architectural Views (e.g. scalability, availability, recovery, data quality, security); Testing, Monitoring, Diagnostics, and Verification; Lifecycle (Birth, Growth, Failure, Recovery, Death); Web of Metadata (Categories, Capabilities, Configurations & Dependencies)
  17. A Cloud Technology Reference Model. Diagram layers: Your Application; Governance; Architectural Views; Lifecycle (Birth, Growth, Failure, Recovery, Death); Web of Metadata (Categories, Capabilities, Configurations & Dependencies); Testing, Monitoring, Diagnostics, and Verification; Element Management; Resource Management; Facilities & Logistics; Software & Hardware Infrastructure
  18. Infrastructure Clouds Start Here. Diagram layers: Your Application; Governance; Architectural Views; Testing, Monitoring, Diagnostics, and Verification; Lifecycle (Birth, Growth, Failure, Recovery, Death); Web of Metadata (Categories, Capabilities, Configurations & Dependencies); Element Management; Operating System Images; Resource Management; Basic Monitoring; Facilities & Logistics; Software & Hardware Infrastructure. Annotations: “Your Problem” beside the upper layers, “Their Problem” beside the lower layers
  19. “Cloud Servers” Try to Extend Infra. Diagram layers: Your Application; Governance; Architectural Views; Testing, Monitoring, Diagnostics, and Verification; Lifecycle (Birth, Growth, Failure, Recovery, Death); Web of Metadata (Categories, Capabilities, Configurations & Dependencies); Element Management (Split Responsibility); Resource Management; Basic Monitoring; Facilities & Logistics; Software & Hardware Infrastructure. Annotations: “Your problem”, “Cloud servers”, “Cloud Infra (private or public)”
  20. Cloud Platforms, As Perceived Today. Diagram boxes: Your Application (Insert Code Here); Application-Level Monitoring; lol, Governance; DON’T WORRY YOUR PRETTY HEAD, WE HAVE THE REST UNDER CONTROL
  21. How Cloud Platforms Likely Will Evolve. Diagram boxes: Your Application; App-Level Governance; Scalability, Integration, Backup & Recovery, Security Views; Testing, Monitoring, Diagnostics, and Verification; Application Lifecycle (Birth, Growth, Failure, Recovery, Death); BLACK BOX OF INTRIGUE
  22. Amazon Web Services Tutorial
  23. AWS Registration and Security. Create an AWS account • Attachable to your existing account. Create an Access Key ID and Secret Key
  24. Simple Storage Service (S3). Web-Based Media Storage • Scalable, Redundant, Reliable, and Fast • XML-Based Metadata over RESTful Web Interface • Available over HTTP, HTTPS, and BitTorrent. Official 99.9% availability SLA (per month) • 10% service credit when between 99% and 99.9% • 25% service credit when less than 99%. Available in United States and Europe. Pricing (U.S.) - November 2008 • Storage Rates: starting at $0.15 per GB monthly • Usage Rates: $0.10 per GB inbound, $0.17 per GB outbound • Request Rates: $0.01 per 10,000 GET requests, or per 1,000 POST, PUT, etc. requests • Rates are reduced as volume increases (multi-TB)
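The pricing and SLA figures on this slide lend themselves to a quick back-of-the-envelope calculation. The sketch below is only an illustration: the function names are made up for this example, it uses the first-tier November 2008 rates quoted on the slide, it ignores the volume discounts the slide mentions, and it treats exactly 99% availability as falling in the 10% credit band.

```python
# Rates copied from the slide (U.S., November 2008, first pricing tier only).
STORAGE_PER_GB_MONTH = 0.15
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.17
COST_PER_10K_GETS = 0.01
COST_PER_1K_WRITES = 0.01  # PUT, POST, etc.


def monthly_s3_cost(stored_gb, gb_in, gb_out, get_requests, write_requests):
    """Estimate one month's S3 bill in dollars, ignoring volume discounts."""
    storage = stored_gb * STORAGE_PER_GB_MONTH
    transfer = gb_in * TRANSFER_IN_PER_GB + gb_out * TRANSFER_OUT_PER_GB
    requests = (get_requests / 10_000) * COST_PER_10K_GETS
    requests += (write_requests / 1_000) * COST_PER_1K_WRITES
    return round(storage + transfer + requests, 2)


def sla_credit_percent(monthly_availability_percent):
    """Service credit owed under the SLA bands quoted on the slide."""
    if monthly_availability_percent >= 99.9:
        return 0
    if monthly_availability_percent >= 99.0:
        return 10
    return 25


if __name__ == "__main__":
    # 100 GB stored, 10 GB in, 50 GB out, 1M GETs, 10k writes
    print(monthly_s3_cost(100, 10, 50, 1_000_000, 10_000))
    print(sla_credit_percent(99.5))
```

For that workload, storage dominates the bill ($15.00 of the total), with transfer at $9.50 and requests at $1.10.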
  25. S3 Conceptual Model (figure). S3 Bucket: QConPages. S3 Key: /2008-11-08/QCon.html. S3 Objects, protected by ACL. Mapped into:
  26. S3 RESTful Interactions: Creating Buckets as Resources
        PUT /qconpages HTTP/1.1
        Host:
        Date: Mon, 17 Nov 2008 09:15:00 PST
        Authorization: AWS <AccessKeyID:signature>
        Content-Length: 0
      Response
        HTTP/1.1 200 OK
        Location: /qconpages
        Date: Mon, 17 Nov 2008 09:15:01 PST
        Content-Length: 0
  27. S3 RESTful Interactions: Writing objects in buckets
        PUT /qconpages/QCon.html HTTP/1.1
        Host:
        Date: Mon, 17 Nov 2008 09:15:16 PST
        Authorization: AWS <AccessKeyID:signature>
        Content-Length: 104
        Content-Type: text/html

        <html>
        <head> <title>QCon San Francisco 2008</title> </head>
        <body><p>Welcome!</p></body>
        </html>
  28. S3 RESTful Interactions: Retrieving Objects
        GET /HugeFile HTTP/1.1
        Host:
        Date: Mon, 17 Nov 2008 09:15:16 PST
        Accept: */*
        Range: bytes=0-1048579
      (Range is an optional, standard HTTP way to retrieve subsets and/or to resume broken transfers)
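The Range header shown on this slide can be generated mechanically when a large object is fetched in pieces or a broken transfer is resumed. The helper names below are hypothetical and the sketch only builds header values; it makes no network calls and assumes the total object size is already known (for example from an earlier response's Content-Length).

```python
def byte_ranges(total_size, chunk_size):
    """Yield HTTP Range header values that together cover total_size bytes."""
    if total_size < 0 or chunk_size <= 0:
        raise ValueError("total_size must be >= 0 and chunk_size must be > 0")
    start = 0
    while start < total_size:
        # Range end offsets are inclusive, hence the "- 1".
        end = min(start + chunk_size, total_size) - 1
        yield f"bytes={start}-{end}"
        start = end + 1


def resume_range(bytes_already_received):
    """Range value asking for everything after the bytes already on disk."""
    return f"bytes={bytes_already_received}-"


if __name__ == "__main__":
    for header in byte_ranges(10, 4):
        print("Range:", header)
    print("Range:", resume_range(1_048_576))
```

Because the end offset is inclusive, a request for the first mebibyte of an object is `bytes=0-1048575`.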
  29. Transfer Considerations. HTML Form Uploads • Content type is multipart/form-data • Hidden form fields can pass other parameters (Object Key, Authorization Signature, etc.). BitTorrent Access • Request /bucket/key?torrent for the .torrent file • Object needs to be readable by anonymous users • Other downloaders will contribute to the torrent; S3 will act as a seeder
  30. AWS Authorization Format. Ensures that a request was not tampered with and was authorized by the AWS account holder • An HMAC-SHA1 algorithm applied to several canonicalized HTTP headers and content. Passed as an Authorization header. Optionally can be passed as URI parameters for pre-signed, expiry-based signatures
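To make the HMAC-SHA1 step concrete, here is a minimal sketch of how the `Authorization: AWS <AccessKeyID:signature>` value from the earlier request slides could be computed. It follows the S3 REST signing layout of that era as best understood (verb, Content-MD5, Content-Type, Date, canonicalized `x-amz-` headers, then the resource path) and deliberately leaves out details such as sub-resources and multi-value header folding. The function names and the key values in the usage lines are placeholders, not real credentials.

```python
import base64
import hashlib
import hmac


def string_to_sign(verb, date, resource, content_md5="", content_type="",
                   amz_headers=None):
    """Build the canonical string that gets signed (simplified)."""
    # x-amz-* headers: lower-case the names, sort them, one "name:value" line each.
    canonical_amz = "".join(
        f"{name}:{value.strip()}\n"
        for name, value in sorted(
            (k.lower(), v) for k, v in (amz_headers or {}).items()
        )
    )
    return f"{verb}\n{content_md5}\n{content_type}\n{date}\n{canonical_amz}{resource}"


def authorization_header(access_key_id, secret_key, to_sign):
    """Return the Authorization header value: 'AWS <AccessKeyID>:<signature>'."""
    digest = hmac.new(
        secret_key.encode("utf-8"), to_sign.encode("utf-8"), hashlib.sha1
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"AWS {access_key_id}:{signature}"


if __name__ == "__main__":
    canonical = string_to_sign(
        "PUT", "Mon, 17 Nov 2008 09:15:16 PST", "/qconpages/QCon.html",
        content_type="text/html",
    )
    # Placeholder credentials for illustration only.
    print(authorization_header("EXAMPLEKEYID", "example-secret", canonical))
```

Since the Date header is part of the signed string, a captured request cannot simply be replayed with a different timestamp; any change to the signed fields invalidates the signature.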
  31. Elastic Compute Cloud (EC2). Resizable Compute Capacity in the Cloud: CPU, Memory, Storage, and Network • Storage is “ephemeral”; it is lost on termination. Supports Linux, OpenSolaris, and Windows Server 2003. Free data transfer • Between S3 and EC2 • Among EC2 instances. Inbound/outbound data transfer is priced similarly to S3. Baseline CPU speed is a 1.0-1.2 GHz AMD Opteron • a.k.a. Elastic Compute Unit (ECU)
  32. EC2 Sizes (columns: Size; Cores / Speed; Storage; Memory; Cost)
        Small: 1 Core, 1 ECU (32-bit); 160 GB; 1.7 GB; $0.10/hr (*NIX), $0.125/hr (Windows)
        Large: 2 Core, 2 ECU (64-bit); 850 GB; 7.5 GB; $0.40/hr (*NIX), $0.50/hr (Windows)
        X-Large: 4 Core, 2 ECU (64-bit); 1690 GB; 15 GB; $0.80/hr (*NIX), $1.00/hr (Windows)
        High CPU Medium: 2 Core, 2.5 ECU (32-bit); 350 GB; 1.7 GB; $0.20/hr (*NIX), $0.30/hr (Windows)
        High CPU X-Large: 8 Core, 2.5 ECU (64-bit); 1690 GB; 7 GB; $0.80/hr (*NIX), $1.20/hr (Windows)
  33. The Lazy Developer’s Tool: Elasticfox
  34. EC2 Authorization Keypairs. Amazon EC2 uses an X.509 Certificate and Private Key pair to enable authorization. On Linux & UNIX: • Passwordless SSH. On Windows: • Keypair is used to access the administrator password. Generate your own (e.g. Elasticfox), or use Amazon’s web interface
  35. Image Management. Amazon Machine Images (AMIs) • A copy of the OS filesystem, minus the kernel • Chunked up into smaller pieces, uploaded to S3 • After uploading, can be registered with EC2. Library of AMIs available through EC2 API • Amazon-provided AMIs, e.g. Fedora 8, Windows Server 2003 • Publicly available 3rd-party AMIs, e.g. OpenSolaris, various Linux distros • Paid AMIs • Private (your own) AMIs
  36. Instance Management. Launching an AMI • Select the min/max number of instances desired • Choose security groups • Choose instance size • Ensure OS fits the size (i.e. 32 vs 64-bit) • Choose the registered keypair for authentication
  37. Availability Zones. A grouping of the data centre infrastructure that’s isolated from other infrastructure • Could be in the same data centre, just with redundant power, HVAC, etc. Generally, failures in one zone will not impact the other zones (except for catastrophic failure). In future, regions will also be available for planned disaster recovery.
  38. EC2 Query API. Intuitive Functions • Describe* (AvailabilityZones, Images, Instances, KeyPairs, SecurityGroups) • RunInstances • TerminateInstances. Constructed via URI (not RESTful, though) • Action=RunInstances&ImageId=ami-60a54009..
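As a rough illustration of what "constructed via URI" means, the sketch below assembles the parameter portion of a RunInstances call. The helper name is invented for this example. It covers only the action-specific parameters (`Action`, `ImageId`, `MinCount`, `MaxCount`, `KeyName`); a real request would also need the endpoint and the authentication parameters (access key, timestamp, signature, and so on), which are omitted here.

```python
from urllib.parse import urlencode


def run_instances_query(image_id, min_count=1, max_count=1, key_name=None):
    """Build the action-specific part of an EC2 Query API request string."""
    params = {
        "Action": "RunInstances",
        "ImageId": image_id,
        "MinCount": min_count,
        "MaxCount": max_count,
    }
    if key_name:
        params["KeyName"] = key_name
    # Sorting gives a stable, predictable parameter order.
    return urlencode(sorted(params.items()))


if __name__ == "__main__":
    print(run_instances_query("ami-60a54009", key_name="mykeypair"))
```

Every operation is selected by the `Action` parameter rather than by the HTTP method and resource path, which is why the slide calls the API "not RESTful".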
  39. Image Bundling. Bundling Images on Linux & UNIX • ec2-bundle-vol utility run on the instance • ec2-upload-bundle utility to send to S3. Bundling Images on Windows • ec2-bundle-instance API wrapper command
  40. Example of Launching an Instance Using the Typica Toolkit (Java Wrapper)
        // Assumes ec2 is an already-configured Typica client and
        // securityGroups is a List<String> of security group names.
        List<String> params = new ArrayList<String>();
        List<ImageDescription> images = ec2.describeImages(params);
        for (ImageDescription img : images) {
            if (img.getImageId().equals("ami-2a5fba43")) {
                ReservationDescription reservation = ec2.runInstances(
                    img.getImageId(), 1 /*min*/, 1 /*max*/, securityGroups, "", "mykeypair");
            }
        }
  41. EC2 Security Groups. Virtual Group-Based Firewalls in the EC2 Data Center. CIDR-based group firewall for external clients. Diagram: Load Balancer; Web Security Group; App Server; Database; Data Security Group
  42. EC2 Networking. Each instance is given a Public Dynamic Host: • e.g. And a Private Host for use within EC2: • e.g. domU-10-21-18-00-69-D5.compute-1.internal. Cross-Instance Traffic should almost always use the Private Host. No UDP Broadcast or IP Multicast is allowed. Elastic IP • Static public IP address, allocated within 24 hours • Attaching an Elastic IP may take ~15 minutes • Note that it asynchronously replaces your public dynamic host name & IP address without warning
  43. Elastic Block Storage. Persistent, highly-available block storage • (Similar experience to a SAN). Released August 2008. Volumes from 1 GB to 1 TB • Multiple volumes allowed. RAID striping allowed (bandwidth constrained at ~100+ MB/sec). Supports snapshots to S3 for later restore • Snapshots are asynchronous and take a long time • Restores, on the other hand, are relatively quick
  44. Elastic Block Storage API: Create/DeleteVolume, Attach/DetachVolume, Create/DeleteSnapshot, DescribeVolumes. EBS storage is normally provisioned very quickly (seconds). Initial writes will be slow, as with ephemeral stores. All EBS volumes must be formatted with a file system prior to use
  45. End of Part 1
  46. Agenda - Part 2. Managing & Operating Cloud Systems • Whither IT Service Management? • The Hope for Cloud Standards • Tutorial - The Puppet Administrative System • A Preview of Elastra Cloud Services. Cloud Architecture Topics • Common Patterns • Integrating applications, networks, and data • Security (Identity, Privacy, etc.) • Scalability and Monitoring. Q&A and Open Discussion
  47. Managing & Operating Cloud Systems
  48. How have we managed our IT? Developer-led • Concurrent Versioning, Unit Testing, Maven, Ant, Capistrano • Focused on code promotion; sometimes database transforms. Manager-led • One extreme: firefighting • The other extreme: bureaucracy. Architect-led • Round-trip modeling tools (e.g. Rational UML, Together, etc.) • Gated reviews (i.e. “The technology cops”). Operations-led • Management suites (OpenView, Tivoli, etc.) • Runbook Automation (e.g. HP/OpsWare, BMC/BladeLogic, Opalis)
  49. IT Infrastructure Library (ITIL) v3: The Current Best Practice?
  50. Dependency Management vs. Uniformity. The “Google Secret Sauce” Theory: • Always available, scalable, fast • Computing as fungible commodity • Reliability is enabled by architecture • But you have to rewrite your software. Does a seemingly magical architecture reduce or eliminate the need for configuration & dependency management? Does this architecture match classic enterprise requirements? (Image caption: If I spill this on a server, who is affected, and by how much?)
  51. EC2 is great, but... That’s a lot of images! That’s a heckuva lot of instances! How do I change many machines at once? • Scripts that wrap SSH? Do I need to re-image every time I add/update software? How do I detect configuration drift?
  52. The Puppet Administrative System. An Open Source Runtime System and Domain-Specific Language (DSL) for managing Linux, BSD, & UNIX servers • Maintained by Reductive Labs since 2005 • Founded by Luke Kanies, ex-BladeLogic. Encapsulates cross-package installation, configuration setting, permissions, etc. in a transactional runtime
  53. Puppet Architecture. Puppetmasterd • Maintains central configuration repository. Puppetd • Agent on each client, polls the puppetmaster every 30 minutes (adjustable)
  54. Puppet Manifest Example
        class mysql::server {
          $mountpath = $mydc::constants::ebs_mount
          $datadir = "${mountpath}/mysql"
          package { "mysql-server":
            ensure => installed,
          }
          include amazon::ebs
          file { $datadir:
            ensure => directory,
            require => Exec["Mount Device"]
          }
        }
  55. Puppet Sites & Nodes
        node "" {
          include apachewebserver
        }
        node "mysql1", "mysql2" {
          include mysql::server
        }
      Declaratively Adds Infrastructure to Nodes
  56. Security. Puppetmasterd provides a form of PKI for deployments • Clients are authenticated via keypairs • Can act as a Self-Signed Certificate Authority or use a registered certificate. Current encrypted XML-RPC is being transitioned to RESTful HTTP in a future release
  57. Inventory and Drift Control. Puppet includes Facter, a system inventory tool • Returns facts about nodes, e.g. hostnames, kernel, IP addresses, etc. • Facts can then be used in Puppet configurations • Detects changes and updates information
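The idea of drift control can be shown with a few lines of code: compare a declared, desired state against facts collected from a node and report the differences. This is a generic sketch and not how Puppet or Facter are implemented; the function name, the dictionary layout, and the sample values are all assumptions made for illustration.

```python
def detect_drift(desired, actual):
    """Return {name: (desired_value, actual_value)} for every managed setting
    whose observed value differs from the declared one.

    Settings present in `actual` but not declared in `desired` are treated as
    unmanaged and ignored.
    """
    return {
        name: (want, actual.get(name))
        for name, want in desired.items()
        if actual.get(name) != want
    }


if __name__ == "__main__":
    desired = {"mysql-server": "installed", "datadir_mode": "0750"}
    observed = {"mysql-server": "installed", "datadir_mode": "0777",
                "kernel": "2.6.21"}
    for name, (want, got) in detect_drift(desired, observed).items():
        print(f"{name}: expected {want}, found {got}")
```

A declarative tool can run a comparison like this on every polling cycle and then converge the node back to the desired state, which avoids re-imaging for each change.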
  58. Elastra Cloud Services. Today • Load Balanced, Clustered & Recoverable MySQL, PgCluster, and Apache Tomcat 5.5 • Turn-Key Deployment on Amazon EC2 • Private beta support for VMware or Eucalyptus. In Early 2009 • Elastra Cloud Suite v2.0: Enterprise cloud server for IT services management • Open Cloud Services: Resource Provisioning API, Configuration Management API, Administrative Tools & Utilities
  59. Elastra Design & Deploy Lifecycle (figure). Example application components: Wire Funds Web App (Tomcat V 5.5); Msg Bus (Mule ESB 1.6); Acct Svc (WLS 10.1); Wire Process (Lombardi 6); DB (MySQL). Other labels: Application Design (ECML); Deployment Design (EDML); Markup For Each Role; Iterative Design GUI; Desired-State management interface
  60. Helping Drive a Collaborative IT Process (figure). Labels: Business Unit A; IT; Business Unit B; ECML; EDML; Application Architect (Mortgage Application); Application Architect (Private Banking Application); Reuse Mechanisms, Standards, Best Practices; Systems Architects; System Admins; Standard Infrastructure Images; Configured Infrastructure App A; Configured Infrastructure App B; Infrastructure Parameters; Business Unit Parameters; Instance; Architects Focus On Business Logic
  61. Lifecycle-Managed Architectures (figure). Labels: PgCluster Load Balancer; Scalability Policy; Load Balancing Connector; Resource Allocation Strategy; PgCluster Data Component (two shown); Replication Connector; Monitoring Policy; PgCluster Replication Component
  62. Cloud Architecture Topics
  63. Recurring Topics and Patterns. Some design decisions and tradeoffs are continually associated with the cloud. Some designs are due to fundamentals • e.g. CAP Tradeoffs (Consistency, Availability, Partition tolerance). Others are due to out-of-date software • Assuming a single machine • ...On a local area network • ...With reliable nodes
  64. Availability > Consistency. Increasingly common way of handling higher loads. Locks & distributed transactions reduce availability • If my data is locked, it’s not available! A variety of techniques enable this • Caching everywhere (e.g. Memcached, Gigaspaces) • Distributed Replication (e.g. MySQL slaves) • Compensating transactions
  65. Stateless Web / Application Servers. What? Servers do not maintain state between requests (pushed to database or client). Why? • Scalability - smaller working set to manage; session replication becomes hard at scale • Reliability - easier to recover when there is no conversational state • Support - EC2 doesn’t support multicast for session replication. Danger: Most enterprise web application development still makes heavy use of sessions
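One common way to push conversational state to the client is to hand it back as a signed token, so that any server in the pool can verify and read it without shared session storage. The sketch below is an assumption-laden illustration of that idea, not something taken from the slides: the function names and token layout are invented, the state is signed but not encrypted, and there is no expiry handling.

```python
import base64
import hashlib
import hmac
import json


def encode_session(state, secret):
    """Serialize state and append an HMAC so the client cannot alter it."""
    payload = base64.urlsafe_b64encode(
        json.dumps(state, sort_keys=True).encode("utf-8")
    )
    tag = hmac.new(secret, payload, hashlib.sha256).hexdigest().encode("ascii")
    return (payload + b"." + tag).decode("ascii")


def decode_session(token, secret):
    """Verify the HMAC and return the state; raise ValueError if tampered."""
    payload, _, tag = token.encode("ascii").rpartition(b".")
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(tag, expected):
        raise ValueError("session token failed verification")
    return json.loads(base64.urlsafe_b64decode(payload))


if __name__ == "__main__":
    secret = b"shared-by-all-app-servers"  # placeholder secret
    token = encode_session({"user": "stu", "cart": [42]}, secret)
    print(decode_session(token, secret))
```

Any server holding the shared secret can handle the next request, so instances can be added or terminated without replicating sessions between them.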
  66. Partitioned Databases. What? • Partitioning, also known as Sharding, or (loosely) Shared-Nothing, spreads the load across multiple instances by having each manage a subset of data. Why? • Scale-up breaks down fairly quickly when dealing with spikes; scale-out becomes the viable option • Shared-disk databases tend to be commercial and require high-end SANs. Danger: • Cross-partition communication is very slow - must have good data locality or heavily denormalize • Doesn’t help scale “hot” write-intensive data! • Quite unfamiliar to enterprises used to large-SMP Oracle databases
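A minimal sketch of the routing side of partitioning: hash a record's key to decide which database instance owns it. The function names are hypothetical, and hash-modulo is only one of several possible schemes (range-based and directory-based partitioning are alternatives); it is used here because it is the shortest to show.

```python
import hashlib
from collections import defaultdict


def shard_for(key, shard_count):
    """Map a key to a shard index in [0, shard_count) using a stable hash."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def group_by_shard(keys, shard_count):
    """Group keys by owning shard, e.g. to batch one query per instance."""
    groups = defaultdict(list)
    for key in keys:
        groups[shard_for(key, shard_count)].append(key)
    return dict(groups)


if __name__ == "__main__":
    customers = ["alice", "bob", "carol", "dave"]
    print(group_by_shard(customers, 4))
```

A query that touches keys on several shards has to fan out and merge results, which is the cross-partition cost the slide warns about. Changing `shard_count` also remaps most keys, so growing the cluster usually needs a migration plan or a scheme such as consistent hashing.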
  67. Stateless Workers. The most common case for elastic scalability • e.g. Animoto’s 50 -> 3600 -> 100 servers. Appropriate for computationally intensive processing. Though much of Enterprise IT’s processing needs are I/O-bound, not CPU-bound
  68. Federated Identity. From Lookup to Assertions: SAML, WS-Federation, OAuth. Diagram: Public Cloud; Private Cloud; Other Cloud; Identity; Delegated Identity. Major Feature of Windows Azure
  69. Auto-Scale, Monitoring and Diagnosis. The Journey of Monitoring • From Log Management & Search • ... to Aggregation and Statistics • ... to Event Correlation • ... to Complex Event Analysis. How in-depth is necessary depends on how predictable or unique your application design is!
  70. Auto-Scale, Monitoring and Diagnosis (figure). Layers: Application Design; Deployment Design; Configured Software Infrastructure; Cloud Deployment; Virtualization Layer; Monitoring Service (two shown). Steps: 1. Aggregating Monitoring Data; 2. Log Mining; 3. Correlating Events for Diagnosis. Example: Failed Wire Transfers (effect) traced to Out of Memory Errors (cause)
  71. Conclusion. Cloud Computing comes in many shapes and sizes • From Infrastructure • Middleware • Entire Platforms. Reduces Lead Time to Deploy Systems • With varying degrees of visibility. The full impact on IT Management & Operations is still unknown • Chances are it won’t eliminate what we do today. Cloud architectures promote what were secondary problems to a higher status (e.g. integration, security)
  72. Thank You. Stuart Charlton, Chief Software Architect, Elastra. San Francisco 2008