From the NIST Cloud Computing On-demand self-service. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service’s provider.Broad network access. Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and PDAs).Resource pooling.The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter). Examples of resources include storage, processing, memory, network bandwidth, and virtual machines.This is different than virtual private hosting which is constrained to a single host or hosted Exchange server with fixed storage limits. Rapid elasticity.Capabilities can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out, and rapidly released to quickly scale in. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be purchased in any quantity at any time.Measured Service. Cloud systems automatically control and optimize resource use by leveraging a metering capability1 at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.
Cloud Software as a Service (SaaS) – The Application CloudThe capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure. The applications are accessible from various client devices through a thin client interface such as a web browser (e.g., web-based email). The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.Cloud Platform as a Service (PaaS) – The Development Cloud The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly application hosting environment configurations.Cloud Infrastructure as a Service (IaaS). – Systems CloudThe capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly limited control of select networking components (e.g., host firewalls).
Private cloudThe cloud infrastructure is operated solely for an organization. It may be managed by the organization or a third party and may exist on premise or off premise.Public cloudThe cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services.Hybrid cloudThe cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).
Derived from the NIST Diagram Physical Resources NetworkingComputeStorageBios/FirmwareSoftware KernelOperating Systems with Type II HypervisorsVM Manager (VMM) – Type 1 Hypervisors Virtualized Resources NetworkingComputeStorageVirtualized ResourcesMetadataVirtual Machine Images
OVFAn OVF package consists of several files, placed in one directory. A one-file alternative is the OVA package, which is a TAR file with the OVF directory inside.OVF is a packaging format for software appliances. From a technical point of view, an OVF is a transport mechanism for virtual machine templates. One OVF may contain a single VM, or many VMs (it is left to the software appliance developer to decide which arrangement best suits their application). OVFs must be installed before they can be run; a particular virtualization platform may run the VM from the OVF, but this is not required. If this is done, the OVF itself can no longer be viewed as a “golden image” version of the appliance, since run-time state for the virtual machine(s) will pervade the OVF. Moreover the digital signature that allows the platform to check the integrity of the OVF will be invalidAn Amazon Machine Image (AMI) is a special type of virtual appliance which is used to instantiate (create) a virtual machine within the Amazon Elastic Compute Cloud. It serves as the basic unit of deployment for services delivered using EC2..Amazon AMI An Amazon Machine Image (AMI) is a special type of virtual appliance which is used to instantiate (create) a virtual machine within the Amazon Elastic Compute Cloud. It serves as the basic unit of deployment for services delivered using EC2. Like all virtual appliances, the main component of an AMI is a read-only filesystem image which includes an operating system (e.g., Linux, UNIX, or Windows) and any additional software required to deliver a service or a portion of it.The AMI filesystem is compressed, encrypted, signed, split into a series of 10MB chunks and uploaded into Amazon S3 for storage. An XML manifest file stores information about the AMI, including name, version, architecture, default kernel id, decryption key and digests for all of the filesystem chunks.An AMI does not include a kernel image, only a pointer to the default kernel id, which can be chosen from an approved list of safe kernels maintained by Amazon and its partners (e.g., RedHat, Canonical, Microsoft). Users may choose kernels other than the default when booting an AMI.QCOW2 – QEMU “Copy on Write” Version 2qcow stands for "QEMU Copy On Write" and denotes a disk storage optimization strategy that delays allocation of storage until it is actually needed. QEMU is an emulator and virtual machine container, and it can use a variety of virtual disk images which are generally associated with specific guests operating systems.qcow2 is a newer version of the qcow format. QEMU can use a base image which is read-only, and store all writes to the qcow2 image. Among the QEMU supported formats, this is the most versatile format. Features include smaller images (useful if the filesystem does not support holes, for example on FAT32), optional AES encryption, zlib based compression and support of multiple VM snapshots. qemu and xen have retained the qcow format for backwards compatibility. Users can easily convert qcow disk images to the qcow2 format.VMDK - Virtual Machine Disk VMDK (Virtual Machine Disk) is a file format used for virtual appliances developed for VMware products. The format is a container for virtual hard disk drives to be used in virtual machines like VMware Workstation or Virtualbox. VMDK is an open format.IMGThe IMG file extension is used by files which are standardized raw dumps of a disk, and by files in various formats created by different imaging programs.Xen can use raw disk images and physical disks as filesystems for a Xen based domainU. Another option is to use the disk images used by QEMU. VHD – Virtual Hard Disk Virtual Hard Disk format started by Connectix (now part of Microsoft) made open through the Microsoft Open Specification Promise.VHDs are implemented as files that reside on the native host file system. The following types of VHD formats are supported by Microsoft Virtual PC and Virtual Server:Fixed hard disk image: a file that is allocated to the size of the virtual disk. Fixed VHDs consist of a raw disk image followed by a VHD footer (512 or formerly 511 bytes).Dynamic hard disk image: a file that at any given time is as large as the actual data written to it, plus the size of the header and footer. Dynamic and differencing VHDs begin with a copy of the VHD footer (padded to 512 bytes), and for dynamic or differencing VHDs created by Microsoft products this results in a VHD-cookie string conectix at the begin of the VHD file.Differencing hard disk image: a set of modified blocks (maintained in a separate file referred to as the "child image") in comparison to a parent image. The Differencing hard disk image format allows the concept of Undo Changes: when enabled, all changes to a hard drive contained within a VHD (the parent image) are stored in a separate file (the child image). Options are available to undo the changes to the VHD, or to merge them permanently into the VHD. Different child images based on the same parent image also allow "cloning" of VHDs; at least the globally unique identifier (GUID) must be different.Linked to a hard disk: a file which contains a link to a physical hard drive or partition of a physical hard drive
Software appliances are like toasters, they do one thing very well. BitnamiBitNami Cloud Images allow BitNami Stacks to run in a cloud computing environment. BitNami offers Amazon Machine Images (AMIs) for running BitNami Stacks on the Amazon Cloud, as well as BitNami Cloud Hosting, a service that simplifies the process of running open source applications on Amazon EC2.BoxGrinderBoxGrinder supports many virtualization and Cloud platforms like EC2, Xen, KVM, VMware. You can create an appliance based on Fedora, Red Hat Enterprise Linux or CentOS. You are of course free to write your own plugin to support any other virtualization platform or operating system.SUSE StudioSuSE Studio allows you to use a hosted build service and a on premise virtual build system. Has a RESTful API to make calls to SUSE Studio openSUSE, SUSE Enterprise Linux (SuSE) and JeOSIntegrates with SUSE Lifecycle Management Server and WebYASTCan Share Images in the SUSE Studio Gallery
Top choices for Cloud Computing are Xen and KVM.OpenVZ, container virtualization for Linux, is an interesting option as it has a very minimal overhead to scale application space similar to containers like BSD Jails. Advantage is that memory allocation is soft and unutilized memory can be used by other applications.
CloudStack – www.cloudstack.org - CloudStack is a sponsored by Citrix systems released under GPLv3 that provides a highly capable IaaS solution for service providers and enterprises. Robust Web Interface Comprehensive APISecure-Single Sign-OnDynamic Workload ManagementXenserver, Xen Cloud Platform, KVM, VMware, OracleVM supportSecure AJAX Console for VMsNetworking-as-a-Service (Create VLANs to segregate traffic)EC2 API Compatibility Usage MeteringEucalyptus– http://open.eucalyptus.com - IaaS platform originally targeted to provide migration path from Amazon EC2 to private cloud. Amazon AWS Interface CompatibilitySupports Amazon AMIHigh AvailabilityNetwork Management, Security Groups, Traffic IsolationSelf Service S3 compatible Storage Bucket-Based StorageXen and KVM Hypervisor Support (VMware in Enterprise Edition)User Group and Role-Based ManagementOpenStack– www.openstack.org - Sponsored by Rackspace, a hosting provider is made up by three primary projects. OpenStack Compute (Nova) – Nova is a cloud orchestration platform similar to Amazon EC2 Orchestration of popular hypervisors (Xen, Xenserver, KVM, Hyper-V, VMware, Linux Containers)Floating IP Addresses (keep IPs and DNS correct when restarting VMs)VNC proxy through the WebApache 2.0 License Android/iOS ClientsBlock Storage Support (AOE, iSCSI, Sheepdog)OpenStack Storage (Swift) – Is a EBS style solution used for long term storage not real time. Swift is used creating redundant, scalable object storage using clusters of standardized servers to store petabytes of accessible data.Features:Store and Manage files ProgrammaticallyCreate public and private folders Using Commodity HardwareFault tolerant (Nodes/HDD)Scale-out, Scale-UpOpenStack Image Service(Glance) - OpenStack Image Service (code-named Glance) provides discovery, registration, and delivery services for virtual disk images.Features:Provides images-as-a-serviceSupports Raw, VHD, VDI, qcow2, VMDK, OVF Restful APIBackend Options – Swift, Local, S3, HTTPVersion Control and LoggingOpenNebula – http://www.opennebula.org/ – Cloud Computing Toolkit Apache license
Scale Up Scale Out
GlusterFS is an open source scale-out NAS solution. The software is a powerful and flexible solution that simplifies the task of managing unstructured file data whether you have a few terabytes of storage or multiple petabytes.Ceph is a distributed network storage and file system designed to provide excellent performance, reliability, and scalability. Ceph is based on a reliable and scalable distributed object store, with a distributed metadata management cluster layered on top to provide a distributed file system with POSIX semantics. There are a variety of ways to interact with the systemOpenStack Object Storage (code-named Swift) is open source software for creating redundant, scalable object storage using clusters of standardized servers to store petabytes of accessible data. It is not a file system or real-time data storage system, but rather a long-term storage system for a more permanent type of static data that can be retrieved, leveraged, and then updated if necessary. Primary examples of data that best fit this type of storage model are virtual machine images, photo storage, email storage and backup archiving. Having no central "brain" or master point of control provides greater scalability, redundancy and permanence.Sheepdog is a distributed storage system for QEMU/KVM. It provides highly available block level storage volumes that can be attached to QEMU/KVM virtual machines. Sheepdog scales to several hundreds nodes, and supports advanced volume management features such as snapshot, cloning, and thin provisioning.
Types of Tasks Accomplished by an APIProvisioning (creating, re-creating, moving, or deleting components e.g. virtual machines, vlans)Configuration (assigning or changing attributes of the architecture such as security and network settings)Cloud ProvidersJclouds – java API Abstraction Libcloud – started by CloudKick (now Rackspace) to abstract clouds, Apache incubator projectDeltacloud – started by Red Hat to abstract clouds, Apache incubator projectFog - provider and abstraction level API across compute and storage, written in Ruby
CloudFoundryCloud Foundry, a VMware-led project, for building a Platform as a Service (PaaS) offering. Cloud Foundry provides a platform for building, deploying, and running cloud apps using Spring for Java developers, Rails and Sinatra for Ruby developers, Node.js and other JVM frameworks including Grails.OpenShiftA free Platform-as-a-Service that enables developers to deploy apps written in multiple frameworks and languages across clouds. Open source licensing is forthcoming. PHPFogThe PHP Fog application stack is designed to provide reliability, ease of use, scalability, and speed. From the incoming HTTP request to the delivery of your critical data and features, we’ve baked in redundancies and optimizations in every piece of the stack to deliver reliability and speed. We’ve talked to thousands of customers to understand the pain points and build an infrastructure that automates scalability and makes deployment and management of applications easy. Developers love us, and IT departments need us.StackatoStackato enables you to create a private PaaS hosted on the cloud of your choice (your own or with a hosting provider) to empower your developers to deploy, run, and manage their applications in the cloud. Stackato includes:Multi-choice cloud application platform with automatic provisioning:choice of language (Java, Python, PHP, Ruby, Perl, Node.js, Erlang, Scala, Clojure)choice of framework (popular frameworks for each of the languages above, such as Spring, Django, Pyramid, Rails, Mojolicious, Catalyst and more)choice of data service (MySQL, PostgreSQL, Redis, MongoDB) plus ability to connect to othersWSO2 The WSO2 middleware platform offers a full range of core services: application server, enterprise service bus (ESB), governance registry and repository, identity and access management, business process management (BPM), business activity monitor (BAM), portal server and more. WSO2 Stratos monitors CPU, memory and bandwidth utilization, and SLAs. Then it automatically scales up or down depending on the load. When new resources are needed, WSO2 Stratos transparently adds services and when load goes down, WSO2 Stratos automatically brings services down. Dynamic discovery enables services to automatically detect when resource allocations change; there is no need for manual monitoring or reconfiguration.
MeatCloud, Can’t Keep up with Cloud ComputingDevops & Agile IT PhilosophyScript Repetitive TasksAutomate, Automate, Automate
Other disciplines like back-up, log management, performance and security (virus,intrusion detection) are important but not core to the delivery of cloud computing systems
Ideally for the cloud you create management toolchains that automate the management of your cloud. So that the output of one tool informs the input of another.
These tools are all appropriate for Linux guest operating systems, Windows operating system provisioning is not well addressed in OSS. CobblerCobbler is a Linux installation server that allows for rapid setup of network installation environments. It glues together and automates many associated Linux tasks so you do not have to hop between lots of various commands and applications when rolling out new systems, and, in some cases, changing existing ones. With a simple series of commands, network installs can be configured for PXE, reinstallations, media-based net-installs, and virtualized installs (supporting Xen, qemu, KVM, and some variants of VMware). Cobbler uses a helper program called 'koan' (which interacts with Cobbler) for reinstallation and virtualization support. SpacewalkSpacewalk manages software content updates for Red Hat derived distributions such as Fedora, CentOS, and Scientific Linux, within your firewall. You can stage software content through different environments, managing the deployment of updates to systems and allowing you to view at which update level any given system is at across your deployment. A clean central web interface allows viewing of systems and their software update status, and initiating update actions.CrowbarBare metal provisioning for CloudStack developed by Dell using Opscode Chef.
CfengineCFEngine is a policy-based configuration management system written by Mark Burgess at Oslo University College. Its primary function is to provide automated configuration and maintenance of computers, from a policy specification. The CFEngine project was started in 1993 as a reaction to the complexity and non-portability of shell scripting for Unix configuration management, and continues today. The aim was to absorb frequently used coding paradigms into a declarative, domain-specific language that would offer self-documenting configuration.Cfengine 3.0 Nova latest version October 2011. Native Windows support, on the fly support for Hupervisor configuration KVM/Xen using libvirt (in commercial version)Opscode Chef With Chef, you write abstract definitions as source code to describe how you want each part of your infrastructure to be built, and then apply those descriptions to individual servers. The result is a fully automated infrastructure: when a new server comes on line, the only thing you have to do is tell Chef what role it should play in your architecture. Chef performs actions defined in recipes to configure systems. Recipes are written in Ruby with specific domain specific language (DSL) extensions to specify configuration resources. A Recipe describes a series of resources that should be in a particular state on a particular part of a server (such as Apache, MySQL, or Hadoop). This might include packages that should be installed, services that should be running, or files that should be written. When Recipes are run, Chef makes sure that each resource is properly configured, only taking corrective action when it's necessary. The result is a safe, flexible mechanism for making sure your servers are always running exactly how you want them to be.PuppetPuppet, an automated administrative engine for your *nix systems, performs administrative tasks (such as adding users, installing packages, and updating server configurations) based on a centralized specification.SaltStackSalt is a powerful remote execution manager that can be used to administer servers in a fast and efficient way.Salt allows commands to be executed across large groups of servers. This means systems can be easily managed, but data can also be easily gathered. Quick introspection into running systems becomes a reality.Remote execution is usually used to set up a certain state on a remote system. Salt addresses this problem as well, the salt state system uses salt state files to define the state a server needs to be in.Between the remote execution system, and state management Salt addresses the backbone of cloud and data center management.
CactiCacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices.RRDToolRRDtool is the OpenSource industry standard, high performance data logging and graphing system for time series data. RRDtool can be easily integrated in shell scripts, perl, python, ruby, lua or tcl applications.Graphite Graphite is a highly scalable real-time graphing system. As a user, you write an application that collects numeric time-series data that you are interested in graphing, and send it to Graphite's processing backend, carbon, which stores the data in Graphite's specialized database. The data can then be visualized through graphite's web interfaces.
CapistranoCapistrano is a developer tool for deploying web applications. It is typically installed on a workstation, and used to deploy code from your source code management (SCM) to one, or more servers.Capistrano recently added classes capabilities that match cobbler. RunDeckRunDeck is cross-platform open source software that helps you automate ad-hoc and routine procedures in data center or cloud environments. RunDeck allows you to run tasks on any number of nodes from a web-based or command-line interface. RunDeck also includes other features that make it easy to scale up your scripting efforts including: access control, workflow building, scheduling, logging, and integration with external sources for node and option data.FuncFunc allows for running commands on remote systems in a secure way, like SSH, but offers several improvements. Func allows you to manage an arbitrary group of machines all at once. Func automatically distributes certificates to all "slave" machines. There's almost nothing to configure. Func comes with a command line for sending remote commands and gathering data. There are lots of modules already provided for common tasks. Anyone can write their own modules using the simple Python module API. Everything that can be done with the command line can be done with the Python client API. The hack potential is unlimited. You'll never have to use "expect" or other ugly hacks to automate your workflow. It's really simple under the covers. Func works over XMLRPC and SSL. Since func uses certmaster, any program can use func certificates, latch on to them, and take advantage of secure master-to-slave communication. There are no databases or crazy stuff to install and configure. Again, certificate distribution is automatic too. McollectiveThe Marionette Collective AKA mcollective is a framework to build server orchestration or parallel job execution systems.Mcollective is used as a means of programmatic execution of Systems Administration actions on clusters of servers. MCollective use modern tools like Publish Subscribe Middleware and modern philosophies like real time discovery of network resources using meta data and not hostnames. Delivering a very scalable and very fast parallel execution environment.
Automated Toolchain(For Linux guests) Bootstrapped image is launched fro a template in the cloud provider, then searches for the Cobbler server.Post Install from Cobbler kicks off Puppet with defined management class to configure server using rolesAfter cobbler runs kicks off configuration management in Puppet. Then services can be started and stopped with RunDeck or post-install scriptsThen RunDeck can insert new hosts in Zenoss or NagiosFinally as the network conditions change Zenoss can remediate via other tools based on situational awareness
OpenCloudConf: It takes an (Open Source) Village to Build a Cloud
It Takes an (Open Source)Village to Build a Cloud Mark R. Hinkle Senior Director, Cloud Computing Community Citrix Systems Inc.
7Cloud Computing Service Models USER CLOUD a.k.a. SOFTWARE-AS-A-SERVICE Single application, multi-tenancy, network-based, one-to-many delivery of applications, all users have same access to features. DEVELOPMENT CLOUD a.k.a. PLATFORM-AS-A-SERVICE Application developer model, Application deployed to an elastic service that auto-scales, low administrative overhead. No concept of virtual machines or operating system. Code it and deploy it. SYSTEMS CLOUD a.k.a INFRASTRUCTURE-AS-A-SERVICE Servers and storage are made available in a scalable way over a network.
9Cloud Still RequiresArchitectural Design• Cloud Computing isn’t a magical solution apps need to be able to scale out• Design your architecture with the end in mind• Make your infrastructure easily replicable
12Why Open Source?• User-Driven Solutions to Real Problems• Lower barrier to participation• Larger user base, users helping users• Aggressive release cycles stay current with the state- of-the-art• Open data, Open standards, Open APIs
13Open Virtual Machine FormatsOpen Virtualization Format (OVF) is an openstandard for packaging and distributing virtualappliances or more generally software to be runin virtual machines. • • • • •
14Sourcing Open Source SoftwareVMs and Cloud Appliances
15HypervisorsOpen Source• Xen, Xen Cloud Platform (XCP)• KVM – Kernel-based Virtualization• VirtualBox* - Oracle supported Virtualization Solutions• OpenVZ* - Container-based, Similar to Solaris Containers or BSD Zones• LXC – User Space chrooted installsProprietary• VMware• Citrix Xenserver• Microsoft Hyper-V• OracleVM (Based on OS Xen)
17Scale-Up or Scale-OutVertical Scaling (Scale-Up) – Allocate additionalresources to VMs, requires areboot, no need fordistributed app logic, single-point of OS failureHorizontal Scaling(Scale-Out) – Applicationneeds logic to work indistributed fashion (e.g. HA-Proxy and Apache, Hadoop)
22Automation Unlocksthe Potential of the Cloud
234 Types of Management ToolsProvisioningInstallation of operating systems and othersoftwareConfiguration ManagementSets the parameters for servers, can specifyinstallation parametersOrchestration/AutomationAutomate tasks across systemsMonitoringRecords errors and health of IT infrastructure