Architecting for the Cloud: Intro, Virtualization, IaaS


These are the first two lectures for a course titled Architecting for the Cloud, given at the Universidad de los Andes in July 2014.

Published in: Engineering

  1. 1. Architecting for the Cloud Len and Matt Bass Introduction
  2. 2. Personal Introduction Matthew Bass • Currently – Teaching Professor for Carnegie Mellon in SE Department – Associate Director of Professional Software Engineering Programs for Alumni and Corporate Relations – Consultant • Previously – Siemens: Member of the Software Architecture Group at Siemens Corporate Research – SEI: Resident affiliate at Software Engineering Institute – SEI: Member of the technical staff – 15+ years experience as an architect and software engineer – Domains including Medical, Building Automation, Automotive, …
  3. 3. Personal Introduction II • Len Bass • My first computer (1964) IBM 7094 • 25 years at the SEI – Working on Software Architecture since ~1990 – Wide variety of domains • 2 ½ years at NICTA – Working on problems associated with operations
  4. 4. Introductions • Who are you? – Current position? – Background – Expectations for the course?
  5. 5. • Does anyone remember the original – It has been recently resurrected by ToysRUs • What kind of company was this? • What did it take to get the company off the ground? • What were some of the issues?
  6. 6. Launch • Founded in November 1996 by Toby Lenk – He was an employee of Disney at the time • He raised $15 million to found the company – Remember this was during the boom • He used this money to secure advertising deals and create the initial infrastructure • The company launched in Oct 1997 – It spent the upfront time building the infrastructure
  7. 7. Growth • eToys had roughly $700,000 in sales in 1997 • By 1998 they had about 100 employees • In 1998 they had about $30 million in sales – They were, however, operating at a loss • In 1999 they had about $150 million in sales – With about 1000 employees • Their break even point was about $900 million
  8. 8. Demise • eToys filed for bankruptcy in 2001 • They had $257 million in debt • One reason for their failure was the high cost of operations – The supply chain infrastructure was significant – A large part of the cost, however, was the technological infrastructure
  9. 9. A More Recent Example • How many people have heard of Pinterest? • How about Instagram? – Instagram was founded in 2010 – The initial application was developed and launched by the two founders – It was purchased 2 years later for $1 Billion
  10. 10. Instagram: Growth • Instagram had 1 million users within 2 months of launching • Within one year they had 15 million users • By April of 2012 they had 30 million users – 1 Billion photos uploaded – 5 million photos per day – 81 comments per second • Instagram had 13 employees in September of 2012
  11. 11. Pinterest: Growth • Launched in March 2010 • Had 10,000 users by December 2010 • By December 2011 it had 11 million visits a week • By March of 2012 it was the 3rd largest social networking site – Behind Facebook and Twitter • It had 10 employees at the time
  12. 12. Differences? • What are the key differences between these examples? – What kind of upfront investment was required? – What technical knowledge was required? – What resources were needed?
  13. 13. What Enabled This? • eToys had to build their own infrastructure – Required a data center – Built their own order processing capability • Pinterest and Instagram utilized an existing infrastructure – This infrastructure had all of the capabilities to support growth • This allowed Pinterest and Instagram to focus exclusively on their primary applications
  14. 14. Existing Infrastructure • The Pinterest and Instagram teams could focus exclusively on their applications • The existing infrastructure supported effectively unlimited: – Computation – Network capacity – Storage
  15. 15. Reduced Costs • Not only does it require less upfront effort, but it’s also less of an upfront investment • There is essentially no capital investment • There are operational expenses only when the system is deployed – The operational expenses are in line with the use
  16. 16. Cloud Computing • This situation exemplifies much of the promise of cloud computing – We will define cloud computing a bit later • “Cloud Computing” promises things like: – Economies of scale – Reduced capital investment – Reduced time to market – Lower operational costs – …
  17. 17. The Benefits are Real • Organizations have in fact seen: – Increased productivity – Reduced labor costs – Reduced infrastructure costs – Improved agility – …
  18. 18. It’s Here To Stay… • Today 8 of 10 companies use some form of cloud computing* • Estimates for annual revenue from cloud services range from $20 billion to $100 billion [1] • In 2011 cloud budgets represented 15% of worldwide IT spending [1] • *CompTIA’s third annual trends cloud computing study
  19. 19. What Is Cloud Computing? “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”* * National Institute of Standards and Technology - Publication 800-145
  20. 20. Five Characteristics – NIST Definition • On-demand Self-Service – A consumer can provision computing capabilities without human interaction • Broad network access – Computing capabilities are available over the network and accessed through standard mechanisms • Resource pooling – Provider’s computing resources are pooled to serve multiple consumers with different resources dynamically assigned according to consumers’ demands • Rapid elasticity – Computing capabilities can be rapidly and elastically provisioned to quickly scale out and rapidly released to scale in • Measured service – Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer
  21. 21. Measured Service • With the cloud you pay only for what you use • This is much like a utility – Think of electricity, natural gas, or water • The roots of this notion go back to the 1960’s – When computers were large and expensive
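As a rough illustration of measured service, the sketch below computes a bill purely from metered usage. The instance types, the rates, and the names `RATE_CENTS_PER_HOUR`, `UsageRecord`, and `bill_cents` are all hypothetical and do not come from any provider's price list.

```python
from dataclasses import dataclass

# Hypothetical prices in US cents per instance-hour (illustrative only).
RATE_CENTS_PER_HOUR = {"small": 5, "large": 20}


@dataclass
class UsageRecord:
    instance_type: str  # key into RATE_CENTS_PER_HOUR
    hours: int          # metered hours the instance actually ran


def bill_cents(records):
    """Total charge in cents: each record costs rate * hours used."""
    return sum(RATE_CENTS_PER_HOUR[r.instance_type] * r.hours for r in records)


usage = [UsageRecord("small", 100), UsageRecord("large", 10)]
print(bill_cents(usage))  # 5*100 + 20*10 = 700 cents
```

With no usage records the bill is zero, which is the utility-style property the slide describes.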
  22. 22. Resource Pooling • Resources are “pooled” to serve multiple customers • This means you likely don’t have dedicated hardware – You are sharing physical resources with others • Gives rise to things like “virtualization” and “multi-tenancy” – As we will see later on, this results in some of the tradeoffs that need to be managed
  23. 23. Rapid Elasticity • As demand grows and shrinks computing capability grows and shrinks • Infrastructure providers have to be able to provision resources • The ability to provision these resources rapidly is very important • Automation is a key component of elasticity
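One way to picture the automation behind elasticity is a scaling rule that maps current demand to an instance count. This is a minimal sketch under assumed numbers; the function name `desired_instances`, the per-instance capacity, and the bounds are hypothetical and not taken from any provider.

```python
import math


def desired_instances(requests_per_second, capacity_per_instance,
                      min_instances=1, max_instances=20):
    """Return the instance count needed for the load, clamped to the allowed range."""
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Demand grows and shrinks; the instance count follows it.
for load in (0, 450, 10_000):
    print(load, desired_instances(load, capacity_per_instance=100))
```

The lower bound keeps at least one instance running when demand is zero, and the upper bound caps spending when demand spikes.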
  24. 24. Cloud Service Models • There are three primary cloud service models – Software as a Service (SaaS) – Platform as a Service (PaaS) – Infrastructure as a Service (IaaS)
  25. 25. Software as a Service • Software delivery service where software and associated data is hosted in the cloud • Most of us use some form of SaaS – Dropbox – Gmail – Google Calendar – Facebook
  26. 26. SaaS Uses • The primary “customer” for SaaS is the application consumer • These could be individuals who use email, file sharing, or social media applications • They could also be businesses that use customer relationship management, accounting, or supply chain management solutions • The consumer doesn’t worry about the installation, deployment, or management of the service
  27. 27. Platform as a Service • The vendor provides the computing platform and solution stack as a service • This computing platform is hosted in the cloud • The consumer creates the software using tools and/or libraries from the provider • These include things like: – Application design and development – Testing and deployment – Team collaboration – …
  28. 28. PaaS Uses • Developers and administrators are the consumers of PaaS • PaaS is used to develop, deploy, and monitor cloud-based applications • These services typically include things like: – Billing – Monitoring – Storage – Integration – … • These services are again not installed or administered by the consumers
  29. 29. Infrastructure as a Service • Providers offer computing resources to the consumers • Consumers can then deploy their applications on these resources • Examples include – Amazon AWS – Rackspace – Microsoft Azure
  30. 30. Uses of IaaS • Infrastructure as a service is used by developers, administrators, and organizations • IaaS includes basic computing resources – CPU – Storage – Network • The consumers don’t have direct access to the physical resources – Instead they get access to virtualized resources (more about this in the next section)
  31. 31. Others • Today you’ll see many other “service models”, e.g.: – Data as a Service (DaaS) – Business Process as a Service (BPaaS) – … • Not sure what they all mean • This is often more marketing terminology than anything else
  32. 32. Cloud Deployment Models • Cloud services can be deployed in three primary ways – Public deployment – Private deployment – Hybrid • We’ll look briefly at these in turn
  33. 33. Public Cloud • The “public cloud” is infrastructure that is publicly available – This includes both the service and the network • The resources are from a large resource pool that is shared by many • Amazon AWS, Windows Azure, and Google Compute Engine are all examples of public cloud infrastructures
  34. 34. Private Cloud • As the name implies a “private cloud” is a cloud infrastructure that’s available only to a dedicated organization • The infrastructure lives within the boundaries of the organization – It utilizes the same technical approach as the public cloud • Allows organizations to maintain control over data and infrastructure
  35. 35. Hybrid Clouds • A hybrid cloud is a set of cloud services that live on both a public and private cloud • It’s not uncommon for organizations to keep some portion of their system and data in-house • They might, however, deploy another portion in the public cloud
  36. 36. Community Clouds • Today there are many vertical segments with similar specialized needs e.g. – Government entities – Banking – Utilities – … • Providers have started to establish infrastructures to meet the needs of given communities – For example Amazon now has a region only for government entities
  37. 37. Why aren’t all Systems in the Cloud? • Given all the options and benefits, why aren’t all systems in the cloud? • Issues do exist, such as: – Loss of control – Privacy concerns – Difficulty complying with regulations – Security concerns – Performance problems – Availability issues – Licensing issues – …
  38. 38. Performance? Availability? • Isn’t this taken care of in the cloud? – In short, no … • Many think the cloud will automatically provide the scalability & availability needed – While the cloud infrastructure can scale that doesn’t mean your application will scale • The cloud is built from faulty components – At any point in time some parts of the cloud are not working
  39. 39. Recent Outages • This morning (July 14th, 2014) I looked up recent outages • Google: – Outages in China over the last few weeks – Outages in the US last week in search – Outages currently in Canada and Australia • Amazon and Microsoft have similar reports
  40. 40. Serving Many • The performance of the cloud is notoriously volatile • The infrastructure is shared – This means that the demand imposed is beyond your control • As a result you might not always get the resources required
  41. 41. Designing for the Cloud • Does this mean you can’t achieve properties such as availability and scalability in the cloud? – Again, no • It does mean, however, that they need to be designed into the system – In order to achieve the objectives of the organization you need to be explicit about designing the system to promote required properties
  42. 42. Course Goals • Know what the cloud is • Understand the basic structure of the cloud infrastructure • Understand the implications of the decisions taken in the cloud • Know what options exist when designing a system for the cloud • Know how to evaluate the impact of specific decisions
  43. 43. What This Course Is NOT • A course that focuses on the architecture process – We have another course that focuses on this • A course that teaches you how to use specific solutions – We are agnostic with respect to technologies, service hosts, and so forth – We don’t talk about implementation level details
  44. 44. Focus of the Course • Architectural concepts – Availability – Performance – Security • Structure of the cloud – Major components – Design decisions • Options for achieving desired properties
  45. 45. Course Overview • The course is split into three sections – Fundamentals – Infrastructure – Architecting for the Cloud
  46. 46. Fundamentals • In order to understand the decisions and related impact we need to understand some basic concepts • In order to make sure we all have the same understanding we’ll be talking about: – Availability – Scalability/Performance – Security
  47. 47. Infrastructure • In this section we will describe the infrastructure of the cloud itself • We’ll discuss: – The key concepts – The key design decisions – The benefits of these decisions – The tradeoffs associated with these decisions
  48. 48. Architecting For The Cloud • We will then talk about architecting for the cloud • We will discuss what options exist for achieving architectural concerns • We will talk about the tradeoffs and considerations when selecting these options • We’ll also look at operational concerns
  49. 49. Questions?
  50. 50. References • [1] Cloud Computing Issues and Impacts, Ernst and Young, GTD series
  51. 51. Architecting for the Cloud Len and Matt Bass Virtualization and IaaS
  52. 52. Overview • Virtual machine • Virtual network • IaaS
  53. 53. Overview • Virtual machine • Virtual network • Other related topics
  54. 54. What does a virtual machine look like from a user’s perspective? • We explore this question using VirtualBox – an open source tool from Oracle.
  55. 55. Getting Started • Download VirtualBox • Create new image • “Boot it” • Brings up terminal but does not proceed further since there is no boot disk.
  56. 56. Going further • Download machine image with desired software – I used Ubuntu 12.04.4 • Import machine image into VirtualBox
  57. 57. VirtualBox with imported image
  58. 58. Boot it • Double click on machine description • Brings up Ubuntu with Firefox and LibreOffice (office suite) • Point Firefox to the NICTA web site.
  59. 59. Executing VM • Middle of the screen is the Ubuntu image in the virtual machine. The rest is Windows
  60. 60. What do we have? • A fully functional Ubuntu system is running within VirtualBox. • Downloading a different machine image would result in a different OS or different software. • Addresses: – Windows (IPv6 address): 2402:1800:1:2801:4492:2f34:dd2e:1079 – Ubuntu (MAC address of the virtual network adapter): 08:00:27:51:7f:09 • Ubuntu system is “sandboxed” from Windows – Cannot import or export files or data directly. – Could probably import/export through file sharing, e.g. Google Drive.
  61. 61. Key concepts • Computer – “virtualized” • Machine image – set of bits that are loaded into the virtualized computer • Result gets an IP address that is distinct from the IP address of the host.
  62. 62. How is this different from a VM in the cloud? • Not much. • The “cloud” is a publicly accessible platform with hundreds of thousands of computers. • My Windows host is 1 computer. • You interact with VirtualBox through desktop peripherals – Keyboard – VirtualBox paints directly on the screen • You interact with the cloud through HTTP. – Could be through a browser – Could be through an app on your device (desktop, laptop, mobile)
  63. 63. Virtual Machine instances • Virtual Memory • Hypervisor • Virtual machines • Virtual machine images
  64. 64. Virtual Memory address translation • Hardware enables trapping instructions that are outside of the current address space. • The CPU issues the target address of the next instruction. • A page table is used to convert the target address to a physical address. – Physical address inside current address space: fetch next instruction from physical address – Physical address outside current address space: fetch next instruction from interrupt handler
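The translation step can be sketched in a few lines. This is a simplified model, not how any particular CPU implements it: the 4096-byte page size, the dictionary used as a page table, and the names `translate` and `PageFault` are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # bytes per page; a common size, assumed here


class PageFault(Exception):
    """Models the trap taken when an address is outside the current address space."""


def translate(page_table, target_address):
    """Convert a target (virtual) address to a physical address via the page table."""
    page, offset = divmod(target_address, PAGE_SIZE)
    if page not in page_table:
        # On real hardware, control would pass to an interrupt handler here.
        raise PageFault(f"no mapping for virtual page {page}")
    return page_table[page] * PAGE_SIZE + offset


page_table = {0: 7, 1: 3}  # virtual page number -> physical frame number

print(translate(page_table, 10))    # page 0 maps to frame 7
print(translate(page_table, 4100))  # page 1 maps to frame 3
```

An address on an unmapped page, such as `3 * PAGE_SIZE`, raises `PageFault`, which stands in for the trap described on the slide.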
  65. 65. Hypervisor and Virtual Machine • Target address goes through two different page tables to fetch the next instruction. • First points to virtual machine page table • Second points to address of next instruction • Hardware is set up to support this process • Hypervisor is the supervisory program that manages page tables and scheduling of virtual machines • [Figure: the CPU issues the target address of the next instruction for one of VM1 … VMn; the host page table points to the VM page table, which yields the next instruction.]
  66. 66. Virtual Machine • Computer with bare hardware • Instruction set is the same as the host computer • Address space is guaranteed private from other virtual machines (through the addressing mechanism) • Available memory may be less than that in the host machine • Processor is shared across all virtual machines on a single host machine.
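The two-step lookup can be modelled as a host table that selects a per-VM page table, followed by a lookup in that table. This is a deliberately simplified sketch: real hardware support for virtualization is more intricate, and the names `host_table` and `fetch_address`, the VM identifiers, and the frame numbers are hypothetical.

```python
PAGE_SIZE = 4096  # bytes per page, assumed for illustration

# Host-level table maintained by the hypervisor: VM id -> that VM's page table.
# Each VM page table maps a virtual page number to a physical frame number.
host_table = {
    "vm1": {0: 7, 1: 3},
    "vmn": {0: 9},
}


def fetch_address(host_table, vm_id, target_address):
    """Resolve a target address for one VM using two table lookups."""
    vm_page_table = host_table[vm_id]        # first lookup: find the VM's page table
    page, offset = divmod(target_address, PAGE_SIZE)
    frame = vm_page_table[page]              # second lookup: find the physical frame
    return frame * PAGE_SIZE + offset


# The same target address resolves to different physical memory in each VM.
print(fetch_address(host_table, "vm1", 5))
print(fetch_address(host_table, "vmn", 5))
```

Because each VM only ever reaches memory through its own table, the address spaces stay private from one another, which is the isolation property the deck attributes to the addressing mechanism.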
  67. 67. Virtual machine images • Bare (virtual) hardware may be all that is necessary for some uses, e.g. operating system revisions. • For other uses it is useful to have an operating system and possibly some applications. Application licensing is, typically, by virtual machine. • The cloud infrastructure provides the capability to preload a virtual machine with an image. This image can be from a library or from something created by the user on a previous visit to the cloud. A sample image might be LAMP – Linux, Apache Server, MySQL, PHP • Furthermore, it might be that a memory image is saved by an application to allow for restart in the case of failure.
  68. 68. Overview • Virtual machine • Virtual network • Other related topics
  69. 69. Virtual networks • DNS server • IP addresses • IP messages • IP management
  70. 70. Domain Name Server (DNS) • Client sends URL to DNS • DNS takes as input a URL and returns an IP address • Client uses IP address to send message to a site
  71. 71. Complications • In reality, transmitting messages from one computer to another is more complicated. • The picture showed a single DNS server. – There are multiple DNS servers – There is a hierarchy of DNS servers. • The picture showed a single line from client to server. – There is a network of routers that transmits messages – The network shares load – It is organized as a hierarchy based on IP number.
  72. 72. DNS Hierarchy • Consider URL – If one server held all name -> IP mappings, it would both get overloaded and hold over 200 million mappings. • DNS is arranged as a hierarchy. • There is an “authoritative” name server that holds all of the final suffixes (e.g. .au, .edu, .com, .co) • It is replicated for performance reasons
  73. 73. Finding • The final suffix DNS has the IP of the .au DNS. • The .au DNS has the IP of the DNS • The DNS has the IP of • The DNS, in turn, has IP for various local DNSs that are under NICTA’s control. • This allows NICTA to change the IP of the various local DNSs without changing anything up the hierarchy. • This becomes important when we discuss business continuity options.
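The walk down the hierarchy can be sketched with nested dictionaries, one per level. The host name `www.example.org.au` and the addresses (taken from the documentation range 192.0.2.0/24) are hypothetical stand-ins, and the function `resolve` is an illustrative model rather than a real resolver.

```python
# Each level maps the next label to either a lower-level zone (dict)
# or, at the bottom, a final IP address (str). All names are hypothetical.
ROOT = {
    "au": {
        "org": {
            "example": {"www": "192.0.2.10"},
        },
    },
}


def resolve(name, root=ROOT):
    """Resolve a host name by walking the hierarchy from the final suffix down."""
    node = root
    for label in reversed(name.split(".")):
        node = node[label]  # ask the current level where the next level lives
    if not isinstance(node, str):
        raise KeyError(f"{name} names a zone, not a host")
    return node


print(resolve("www.example.org.au"))

# The lowest level can change its own records without touching the levels above.
ROOT["au"]["org"]["example"]["www"] = "192.0.2.20"
print(resolve("www.example.org.au"))
```

Only the bottom dictionary was modified in the second step, which mirrors the point that an organization can change local addresses without changing anything further up the hierarchy.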
  74. 74. IPv4 and IPv6 • An IP (Internet Protocol) address is a numerical label that identifies a “device” on the internet. • An IPv4 address is 32 bits long and is written as a sequence of four numbers - • 32 bits proved insufficient, so IPv6, with 128-bit addresses, was created in 1995. • For legacy reasons, IPv6 has had a very slow adoption. IPv4 numbers have been exhausted. This is causing more conversion to IPv6. June 8, 2011 was designated as World IPv6 Day, when top websites and internet providers ran a 24-hour test of IPv6 infrastructure. This test was successful. • Google publishes statistics for the percentage of users that access Google over IPv6. It is now around 3.25%
  75. 75. Assigning IP addresses • “Devices” on the internet include virtual machines in a cloud. • Every VM gets an IP address when it is created. This IP address can be • Private and not seen outside of the cloud. • Public and directly addressable from outside of the cloud. • An IP message has a header and a payload. The header includes – IP address of the source – IP address of the destination
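The difference in address width can be checked with Python's standard `ipaddress` module. The two addresses below come from ranges reserved for documentation and are used only as examples.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")    # IPv4 documentation address
v6 = ipaddress.ip_address("2001:db8::1")  # IPv6 documentation address

# max_prefixlen is the total number of bits in an address of that version.
print(v4.version, v4.max_prefixlen)
print(v6.version, v6.max_prefixlen)

# Size of each address space.
print(2 ** v4.max_prefixlen)
print(2 ** v6.max_prefixlen)
```

The IPv4 space holds 2^32 (about 4.3 billion) addresses, which is why exhaustion became a practical concern.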
  76. 76. Private and Public IP addresses • Private IP addresses: – If IPA sends a message to IPB, i.e., IPA+payload -> IPB, a gateway can make it look like the message comes from the gateway, i.e., IPgateway+payload -> IPB – In this case the gateway must maintain a table so that it can manage the response from IPB • Public IP addresses: – The VM manager is given a range of IP addresses that it can assign to VM instances. – An assignment only lasts as long as the instance does; then it can be reassigned. – Messages from the instances come from the assigned IP address, and the recipient can respond directly to the instance. • What does this have to do with DNS servers?
  77. 77. Getting a message to VM inside the cloud • [Figure: a message from the Internet reaches the cloud’s gateway, which asks “Public IP?” (Yes / No) and translates the address to a private IP where required.]
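The gateway table for private addresses can be sketched as follows. The slide does not say how the table is keyed; using a per-connection port number, as typical network address translation does, is an assumption here, as are the class name `Gateway`, the message fields, and all addresses.

```python
class Gateway:
    """Rewrites outbound messages so they appear to come from the gateway."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}         # gateway port -> private source IP (assumed keying)
        self.next_port = 40000  # arbitrary starting port for illustration

    def outbound(self, src_private_ip, dst_ip, payload):
        """IPA+payload -> IPB becomes IPgateway+payload -> IPB."""
        port = self.next_port
        self.next_port += 1
        self.table[port] = src_private_ip  # remembered so the reply can be routed
        return {"src": self.public_ip, "src_port": port,
                "dst": dst_ip, "payload": payload}

    def inbound(self, reply):
        """Use the table to forward a response to the original private sender."""
        private_ip = self.table[reply["dst_port"]]
        return {"src": reply["src"], "dst": private_ip, "payload": reply["payload"]}


gw = Gateway("203.0.113.1")
out = gw.outbound("10.0.0.5", "198.51.100.7", "hello")
reply = {"src": "198.51.100.7", "dst": gw.public_ip,
         "dst_port": out["src_port"], "payload": "hi"}
print(gw.inbound(reply)["dst"])
```

The recipient only ever sees the gateway's address, so without the table the gateway would have no way to decide which private instance should receive the response.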
  78. 78. Overview • Virtual machine • Virtual network • IaaS
  79. 79. File space • Virtual machine is allocated space on host computer’s disk. • This is local disk available to the VM. • More extensive disk space is available through other features. We return to this when we discuss various file options.
  80. 80. Multi-tenancy • One physical computer hosts multiple virtual machines. • Messages to/from virtual machine go through hypervisor. • Host machine’s disk is shared among hypervisor and VMs hosted on that machine. • Multi-tenancy has implications with respect to performance and security.
  81. 81. Machine image • Set of bits that constitute the execution environment. • As with VirtualBox, could be – Empty – Operating system – Operating system + middleware – Operating system + middleware + application – Operating system + middleware + application + data • Why put data in a machine image? – Configuration values such as image ID – History such as where image came from – Location of other configuration parameters
  82. 82. Allocating VM • Cloud management system – Chooses which physical host has capacity for new VM – Assigns IP address and keeps mapping of IP address to physical host in internal routing table. – Tells hypervisor to allocate new VM and sends hypervisor IP address and pointer to machine image. • Hypervisor on chosen physical host – Creates page table for new VM – Allocates disk space – Keeps internal mapping from IP address to VM – Loads machine image into allocated VM.
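The allocation steps above can be sketched as two cooperating objects. This is an illustrative model only: the class names `CloudManager` and `Hypervisor`, the first-fit host choice, the fixed disk size, and the addresses are assumptions, not a description of any real cloud management system.

```python
from dataclasses import dataclass, field


@dataclass
class Hypervisor:
    name: str
    capacity: int                            # maximum number of VMs on this host
    vms: dict = field(default_factory=dict)  # internal mapping: IP address -> VM

    def allocate(self, ip, image):
        """Create the page table, reserve disk, and load the image for a new VM."""
        self.vms[ip] = {"page_table": {}, "disk_gb": 10, "image": image}


class CloudManager:
    def __init__(self, hosts, ip_pool):
        self.hosts = hosts
        self.free_ips = list(ip_pool)
        self.routing = {}                    # IP address -> physical host name

    def launch(self, image):
        # Choose a physical host with spare capacity (first fit, for simplicity).
        host = next((h for h in self.hosts if len(h.vms) < h.capacity), None)
        if host is None:
            raise RuntimeError("no host has capacity for a new VM")
        ip = self.free_ips.pop(0)            # assign an IP address
        self.routing[ip] = host.name         # record it in the routing table
        host.allocate(ip, image)             # tell the hypervisor to create the VM
        return ip


hosts = [Hypervisor("host-a", capacity=1), Hypervisor("host-b", capacity=2)]
manager = CloudManager(hosts, ["10.0.0.1", "10.0.0.2"])
first = manager.launch("lamp")
second = manager.launch("lamp")
print(manager.routing)
```

The second launch lands on `host-b` because `host-a` is already full, which shows why the management system needs its own routing table: the placement is not predictable from the IP address alone.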
  83. 83. Removing instance • Instance removal is a matter of undoing the steps involved in allocating an instance. • Local disk should be cleared so that information stored on it is no longer available • Public IP addresses may cause a problem since IP address may be reallocated to a different VM. – Amazon allows you to map an IP address to different instances under program control. • Clients of the application that know the IP address may use it to send messages that will arrive at a different VM.
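The reallocation problem can be reproduced with a tiny address pool. The pool, the instance identifiers, and the helper names `allocate` and `remove` are hypothetical and exist only to show the effect.

```python
free_ips = ["198.51.100.10"]  # a one-address pool makes reuse immediate
assignments = {}              # IP address -> instance id


def allocate(instance_id):
    """Give the instance the next free address."""
    ip = free_ips.pop(0)
    assignments[ip] = instance_id
    return ip


def remove(instance_id):
    """Undo the allocation: release every address held by the instance."""
    for ip, owner in list(assignments.items()):
        if owner == instance_id:
            del assignments[ip]
            free_ips.append(ip)


cached_ip = allocate("vm-old")  # a client remembers this address
remove("vm-old")
allocate("vm-new")              # the same address is handed out again

# The client's cached address now reaches a different VM.
print(assignments[cached_ip])
```

Clients that look up the address each time, for example through DNS, avoid the stale mapping that a cached address produces here.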
  84. 84. Configuration parameters • Applications running in the cloud require hundreds or thousands of configuration parameters. – Hadoop has 206 options – HBase has 64 • Place configuration parameters in persistent storage • Build knowledge of location into application • We will return to the configuration parameter issue when we discuss the deployment pipeline.
  85. 85. Environments • An environment for a system consists of – The system + – its configuration parameters + – The external systems with which it interacts • Now suppose all external systems are defined through configuration parameters • Then the environment can be changed by changing the configuration parameters • These are architectural decisions that need to be made.
  86. 86. Defining Environments • Keep all configuration parameters in a database read at system initialization • Then moving from one environment to another is a matter of changing the database from which the configuration parameters are read.
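A minimal sketch of this idea uses an in-memory SQLite database per environment. The table layout, the parameter name `orders_service`, the URLs, and the helpers `make_config_db` and `load_config` are hypothetical; a real system would read from a persistent database rather than one created in memory.

```python
import sqlite3


def make_config_db(params):
    """Build a configuration database holding key/value parameters."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE config (key TEXT PRIMARY KEY, value TEXT)")
    db.executemany("INSERT INTO config VALUES (?, ?)", params.items())
    db.commit()
    return db


def load_config(db):
    """Read every configuration parameter once, at system initialization."""
    return dict(db.execute("SELECT key, value FROM config"))


test_db = make_config_db({"orders_service": "http://orders.test.example"})
prod_db = make_config_db({"orders_service": "http://orders.prod.example"})

# Switching environments means pointing initialization at a different database.
config = load_config(test_db)
print(config["orders_service"])
config = load_config(prod_db)
print(config["orders_service"])
```

The application code is identical in both cases; only the database handed to `load_config` changes.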
  87. 87. [Figure: a testing environment and a production environment, each with its own database (test database, production database), and a DNS server.] E.g., moving from the test environment to the production environment is a matter of changing a single database pointer.
  88. 88. Using the cloud • The environments in which a system lives include – Development (usually on your desktop) – Integration (in the cloud) – Staging for performance and user acceptance – Production • Keeping all configuration parameters in databases and reading the relevant database at initialization allows for easy movement from one environment to another.
  89. 89. IaaS Issues – 1 • Reliability – IaaS providers provide sophisticated reliability mechanisms. – Instances may still fail. – Consumers must perform risk analysis to determine the extent to which they wish to supplement the provider’s reliability mechanisms with additional mechanisms. • Performance – Multi-tenancy impacts performance compared to individual machines because of the sharing of the CPU and the overhead of the virtualization mechanisms. – Allocating additional instances for scaling is the responsibility of the consumer, either explicitly or through setting rules for allocation. – All access to the cloud is through the internet, which introduces latency compared with data stored locally.
  90. 90. IaaS issues - 2 • Security – Normal types of attacks through the internet are no different in the cloud. – Customers must trust the IaaS provider to respect the privacy of data and computations. – Multi-tenancy allows for other types of attacks based on information leakage, e.g. a side channel attack can use cache timing information to detect keys. • Interoperability – Each cloud provider has their own set of interfaces and standards. This introduces significant risk of vendor lock-in.
  91. 91. IaaS issues - 3 • Law/regulations with respect to data location. – Some jurisdictions require that data not leave their jurisdiction – e.g. EU has different privacy laws than the US. Following scenarios cause concern: • EU data stored in a US data center – disallowed by EU law? • The same data stored in two different locations may mean that one set of data is available to a government entity • Disclosure laws when someone accesses protected data differ in different jurisdictions. • Some jurisdictions require that the cloud provider make available keys and passwords.
  92. 92. Summary • Infrastructure as a service has a compelling economic justification. • The architecture for IaaS is based on having – virtual machines, – virtual networks, and – virtual file systems, managed by a cloud management system • The concept of an environment for a system simplifies moving a system from development to production • An IaaS platform has a different set of issues from local platforms, and the architect must be aware of these issues.
  93. 93. QUESTIONS?