From the NIST (http://collaborate.nist.gov/twiki-cloud-computing/bin/view/CloudComputing/) Cloud Computing Definition:
On-demand self-service. A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically without requiring human interaction with each service’s provider.
Broad network access. Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (e.g., mobile phones, laptops, and PDAs).
Resource pooling. The provider’s computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand. There is a sense of location independence in that the customer generally has no control or knowledge over the exact location of the provided resources but may be able to specify location at a higher level of abstraction (e.g., country, state, or datacenter). Examples of resources include storage, processing, memory, network bandwidth, and virtual machines. This is different from virtual private hosting, which is constrained to a single host, or a hosted Exchange server with fixed storage limits.
Rapid elasticity. Capabilities can be rapidly and elastically provisioned, in some cases automatically, to quickly scale out, and rapidly released to quickly scale in. To the consumer, the capabilities available for provisioning often appear to be unlimited and can be purchased in any quantity at any time.
Measured service. Cloud systems automatically control and optimize resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service (e.g., storage, processing, bandwidth, and active user accounts). Resource usage can be monitored, controlled, and reported, providing transparency for both the provider and consumer of the utilized service.
Cloud Software as a Service (SaaS) – The Application Cloud: The capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure. The applications are accessible from various client devices through a thin client interface such as a web browser (e.g., web-based email). The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings.
Cloud Platform as a Service (PaaS) – The Development Cloud: The capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages and tools supported by the provider. The consumer does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly application hosting environment configurations.
Cloud Infrastructure as a Service (IaaS) – The Systems Cloud: The capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer is able to deploy and run arbitrary software, which can include operating systems and applications. The consumer does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, deployed applications, and possibly limited control of select networking components (e.g., host firewalls).
Private cloud: The cloud infrastructure is operated solely for an organization. It may be managed by the organization or a third party and may exist on premise or off premise.
Public cloud: The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services.
Hybrid cloud: The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).
An OVF package consists of several files placed in one directory. A one-file alternative is the OVA package, which is a TAR file with the OVF directory inside. OVF is a packaging format for software appliances. From a technical point of view, an OVF is a transport mechanism for virtual machine templates. One OVF may contain a single VM or many VMs (it is left to the software appliance developer to decide which arrangement best suits their application). OVFs must be installed before they can be run; a particular virtualization platform may run the VM directly from the OVF, but this is not required. If this is done, the OVF itself can no longer be viewed as a “golden image” version of the appliance, since run-time state for the virtual machine(s) will pervade the OVF. Moreover, the digital signature that allows the platform to check the integrity of the OVF will be invalid.
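Because an OVA is just a TAR archive, its contents can be inspected with ordinary tooling. The following is a minimal sketch using Python's standard `tarfile` module. The function names and the demo file names (`appliance.ovf`, `appliance.mf`, `appliance-disk1.vmdk`) are assumptions made for illustration and are not taken from any particular appliance.

```python
import io
import tarfile


def list_ova_members(ova_file):
    """Return the names of the files packed inside an OVA (an uncompressed TAR)."""
    with tarfile.open(fileobj=ova_file, mode="r:") as archive:
        return archive.getnames()


def find_descriptor(member_names):
    """Return the first member whose name ends in .ovf, or None if there is none."""
    for name in member_names:
        if name.lower().endswith(".ovf"):
            return name
    return None


def build_demo_ova():
    """Build a tiny in-memory TAR that mimics an OVA layout (demo data only)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in [
            ("appliance.ovf", b"<Envelope/>"),       # hypothetical descriptor
            ("appliance.mf", b"demo manifest"),      # hypothetical manifest
            ("appliance-disk1.vmdk", b"\x00" * 16),  # hypothetical disk image
        ]:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    names = list_ova_members(build_demo_ova())
    print(names)
    print(find_descriptor(names))
```

To look at a real package, open the `.ova` file in binary mode and pass the file object to `list_ova_members`. Reading the archive this way does not modify it, so the golden image and its signature stay intact.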
Bitnami: BitNami Cloud Images allow BitNami Stacks to run in a cloud computing environment. BitNami offers Amazon Machine Images (AMIs) for running BitNami Stacks on the Amazon Cloud, as well as BitNami Cloud Hosting, a service that simplifies the process of running open source applications on Amazon EC2.
BoxGrinder: BoxGrinder supports many virtualization and cloud platforms such as EC2, Xen, KVM, and VMware. You can create an appliance based on Fedora, Red Hat Enterprise Linux or CentOS. You are of course free to write your own plugin to support any other virtualization platform or operating system.
SUSE Studio: SUSE Studio allows users to create customized virtual machines and cloud images. It enables developers to quickly create, test and deploy virtual applications for all major hypervisors, including VMware, KVM and Xen, as well as industry standards like OVF.
The most popular open source virtualization platforms for cloud computing are Xen and KVM.
Xen: The Xen Cloud Platform (XCP) is an open source, enterprise-ready server virtualization and cloud computing platform. It delivers the Xen Hypervisor with support for a range of guest operating systems including Windows® and Linux®, network and storage support, and management tools in a single, tested installable image. XCP was originally derived from Citrix XenServer. Today, the XCP code is licensed under the GNU General Public License (GPL2) and is available at no charge in both source and binary format. XCP is, and always will be, open source. It unites the industry and the Xen ecosystem to speed the adoption of virtualization and cloud technologies, and the project actively works with open source and open standards to help solve challenges in cloud mobility.
OpenVZ: OpenVZ, container virtualization for Linux, is an interesting option because it adds very little overhead when scaling application space, similar to containers like BSD Jails. An advantage is that memory allocation is soft, so unutilized memory can be used by other applications.
Two primary Compute Clouds I would recommend are CloudStack and OpenStack.
Project Home Page – www.cloudstack.org
Features:
Multiple Hypervisors Support
Mixed Hypervisors in the same cloud
Multi-Tenant cloud computing platform
Compatible with Commodity or Enterprise Components
Broad Hypervisor Support (XenServer, KVM, VMware vSphere)
Scalable Architecture (manage thousands of hosts and virtual machine guests)
High Availability configurations to provide automatic fail-over for virtual machines
Easy-to-Use AJAX-enabled web interface
Configurable to deploy public, private and hybrid clouds
Virtual Networking to segment network traffic into VLANs
Robust API
Amazon EC2 Compatibility layer
Written in Java for proven reliability
Ability to define service level definitions with specific resource footprints
Open Source, available under the GPL version 3
Project Home Page – www.openstack.org
OpenStack Features:
Manage virtualized commodity server resources (CPU, memory, disk, and network interfaces)
Manage Local Area Networks (Flat, Flat DHCP, VLAN DHCP, IPv6)
API with rate limiting and authentication
Distributed and asynchronous architecture
Cloud Foundry: Cloud Foundry is a VMware-led project for building a Platform as a Service (PaaS) offering. It provides a platform for building, deploying, and running cloud apps using Spring for Java developers, Rails and Sinatra for Ruby developers, Node.js, and other JVM frameworks including Grails.
OpenShift: A free Platform-as-a-Service that enables developers to deploy apps written in multiple frameworks and languages across clouds. Open source licensing is forthcoming.
WSO2 Java PaaS: WSO2 Stratos provides the core cloud services and essential building blocks required for developing SaaS and cloud applications, for example federated identity and single sign-on, data-as-a-service, and messaging-as-a-service.
GlusterFS is an open source scale-out NAS solution. The software is a powerful and flexible solution that simplifies the task of managing unstructured file data whether you have a few terabytes of storage or multiple petabytes.
Ceph is a distributed network storage and file system designed to provide excellent performance, reliability, and scalability. Ceph is based on a reliable and scalable distributed object store, with a distributed metadata management cluster layered on top to provide a distributed file system with POSIX semantics. There are a variety of ways to interact with the system.
OpenStack Object Storage (code-named Swift) is open source software for creating redundant, scalable object storage using clusters of standardized servers to store petabytes of accessible data. It is not a file system or real-time data storage system, but rather a long-term storage system for a more permanent type of static data that can be retrieved, leveraged, and then updated if necessary. Primary examples of data that best fit this type of storage model are virtual machine images, photo storage, email storage and backup archiving. Having no central "brain" or master point of control provides greater scalability, redundancy and permanence.
Sheepdog is a distributed storage system for QEMU/KVM. It provides highly available block-level storage volumes that can be attached to QEMU/KVM virtual machines. Sheepdog scales to several hundred nodes and supports advanced volume management features such as snapshots, cloning, and thin provisioning.
Types of Tasks Accomplished by an API:
Provisioning (creating, re-creating, moving, or deleting components, e.g. virtual machines, VLANs)
Configuration (assigning or changing attributes of the architecture such as security and network settings)
Cloud Providers:
Jclouds – Java API abstraction
Libcloud – started by CloudKick (now Rackspace) to abstract clouds, Apache incubator project
Deltacloud – started by Red Hat to abstract clouds, Apache incubator project
Fog – provider and abstraction level API across compute and storage, written in Ruby
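To make the idea of an abstraction API concrete, here is a rough sketch of a provider-neutral interface covering the two task types above. `CloudDriver`, `InMemoryDriver`, and `deploy_web_server` are hypothetical names invented for this illustration. This is not the API of Jclouds, Libcloud, Deltacloud, or Fog, and the "provider" is only a dictionary in memory.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    security_groups: list = field(default_factory=list)


class CloudDriver(ABC):
    """Hypothetical provider-neutral interface (not a real library's API)."""

    @abstractmethod
    def create_vm(self, name):
        """Provisioning: create a virtual machine."""

    @abstractmethod
    def delete_vm(self, name):
        """Provisioning: delete a virtual machine."""

    @abstractmethod
    def set_security_groups(self, name, groups):
        """Configuration: change security attributes of a machine."""


class InMemoryDriver(CloudDriver):
    """Stand-in provider that keeps its 'cloud' in a dictionary."""

    def __init__(self):
        self.vms = {}

    def create_vm(self, name):
        self.vms[name] = VirtualMachine(name)
        return self.vms[name]

    def delete_vm(self, name):
        del self.vms[name]

    def set_security_groups(self, name, groups):
        self.vms[name].security_groups = list(groups)


def deploy_web_server(driver):
    """Caller code depends only on the interface, not on a specific provider."""
    vm = driver.create_vm("web-1")                        # provisioning
    driver.set_security_groups("web-1", ["http", "ssh"])  # configuration
    return vm


if __name__ == "__main__":
    print(deploy_web_server(InMemoryDriver()))
```

The point of the design is that `deploy_web_server` would not change if a different driver were swapped in, which is the portability these abstraction libraries aim for.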
Derived from the NIST Diagram
Cloud computing promises highly available systems, but if you take a reactive approach you won’t achieve that goal. If you want a five nines service level, you have 5.26 minutes per year to find, fix and recover. Build redundant, highly available systems.
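The 5.26-minute figure follows from simple arithmetic, sketched below. The function names are made up for this example, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_budget_minutes(availability):
    """Minutes of downtime per year allowed at a given availability (0..1)."""
    return MINUTES_PER_YEAR * (1 - availability)


def availability_after_outage(outage_minutes):
    """Yearly availability if the only downtime is a single outage."""
    return 1 - outage_minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, level in [("three nines", 0.999),
                         ("four nines", 0.9999),
                         ("five nines", 0.99999)]:
        print(f"{label}: {downtime_budget_minutes(level):.2f} minutes per year")

    # A single one-hour outage already breaks a four nines target.
    print(f"after a 60-minute outage: {availability_after_outage(60):.4%}")
```

Seen this way, one slow manual recovery can use up the whole yearly budget, which is the argument for redundancy and automated remediation over reactive firefighting.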
Other disciplines like backup, log management, performance and security (virus and intrusion detection) are important but not core to the delivery of cloud computing systems.
Ideally, you create management toolchains that automate the management of your cloud.
Cobbler: Cobbler is a Linux installation server that allows for rapid setup of network installation environments. It glues together and automates many associated Linux tasks so you do not have to hop between lots of various commands and applications when rolling out new systems, and, in some cases, changing existing ones. With a simple series of commands, network installs can be configured for PXE, reinstallations, media-based net-installs, and virtualized installs (supporting Xen, qemu, KVM, and some variants of VMware). Cobbler uses a helper program called 'koan' (which interacts with Cobbler) for reinstallation and virtualization support.
Spacewalk: Spacewalk manages software content updates for Red Hat derived distributions such as Fedora, CentOS, and Scientific Linux, within your firewall. You can stage software content through different environments, managing the deployment of updates to systems and allowing you to view the update level of any given system across your deployment. A clean central web interface allows viewing of systems and their software update status, and initiating update actions.
Crowbar: Bare metal provisioning for CloudStack developed by Dell using Opscode Chef.
CFEngine: CFEngine is a policy-based configuration management system written by Mark Burgess at Oslo University College. Its primary function is to provide automated configuration and maintenance of computers, from a policy specification. The CFEngine project was started in 1993 as a reaction to the complexity and non-portability of shell scripting for Unix configuration management, and continues today. The aim was to absorb frequently used coding paradigms into a declarative, domain-specific language that would offer self-documenting configuration.
Opscode Chef: With Chef, you write abstract definitions as source code to describe how you want each part of your infrastructure to be built, and then apply those descriptions to individual servers. The result is a fully automated infrastructure: when a new server comes on line, the only thing you have to do is tell Chef what role it should play in your architecture. Chef performs actions defined in recipes to configure systems. Recipes are written in Ruby with domain-specific language (DSL) extensions to specify configuration resources. A Recipe describes a series of resources that should be in a particular state on a particular part of a server (such as Apache, MySQL, or Hadoop). This might include packages that should be installed, services that should be running, or files that should be written. When Recipes are run, Chef makes sure that each resource is properly configured, only taking corrective action when it's necessary. The result is a safe, flexible mechanism for making sure your servers are always running exactly how you want them to be.
Puppet: Puppet, an automated administrative engine for your *nix systems, performs administrative tasks (such as adding users, installing packages, and updating server configurations) based on a centralized specification.
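The common thread in these tools is desired-state convergence: you describe the state you want, and the tool changes only what differs. A toy sketch of that idea follows. The `converge` function and the resource names are invented for illustration. This is not how CFEngine, Chef, or Puppet are implemented, and a real tool would install packages or restart services where this code only updates a dictionary.

```python
def converge(desired, actual):
    """Bring `actual` in line with `desired` and return the changes made.

    Both arguments map a resource name to its state. Resources that already
    match are left alone, so running the function twice is harmless.
    """
    changes = []
    for resource, wanted_state in desired.items():
        if actual.get(resource) != wanted_state:
            actual[resource] = wanted_state  # stand-in for the real action
            changes.append((resource, wanted_state))
    return changes


if __name__ == "__main__":
    # Hypothetical policy for a web server role.
    desired = {
        "package:apache2": "installed",
        "service:apache2": "running",
    }
    # Hypothetical current state of one server.
    actual = {"package:apache2": "installed"}

    print(converge(desired, actual))  # only the service needs attention
    print(converge(desired, actual))  # nothing left to do on the second run
```

Taking corrective action only when needed is what makes it safe to run such a policy repeatedly against every server.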
Capistrano: Capistrano is a developer tool for deploying web applications. It is typically installed on a workstation and used to deploy code from your source code management (SCM) system to one or more servers.
RunDeck: RunDeck is cross-platform open source software that helps you automate ad-hoc and routine procedures in data center or cloud environments. RunDeck allows you to run tasks on any number of nodes from a web-based or command-line interface. RunDeck also includes other features that make it easy to scale up your scripting efforts, including access control, workflow building, scheduling, logging, and integration with external sources for node and option data.
Func: Func allows for running commands on remote systems in a secure way, like SSH, but offers several improvements. Func allows you to manage an arbitrary group of machines all at once. Func automatically distributes certificates to all "slave" machines. There's almost nothing to configure. Func comes with a command line for sending remote commands and gathering data. There are lots of modules already provided for common tasks. Anyone can write their own modules using the simple Python module API. Everything that can be done with the command line can be done with the Python client API. The hack potential is unlimited. You'll never have to use "expect" or other ugly hacks to automate your workflow. It's really simple under the covers. Func works over XMLRPC and SSL. Since Func uses certmaster, any program can use Func certificates, latch on to them, and take advantage of secure master-to-slave communication. There are no databases or crazy stuff to install and configure. Again, certificate distribution is automatic too.
MCollective: The Marionette Collective, AKA MCollective, is a framework to build server orchestration or parallel job execution systems. MCollective is used as a means of programmatic execution of systems administration actions on clusters of servers. MCollective uses modern tools like publish/subscribe middleware and modern philosophies like real-time discovery of network resources using metadata rather than hostnames, delivering a very scalable and very fast parallel execution environment.
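The two ideas behind this style of orchestration, selecting machines by metadata instead of by hostname and then acting on all of them in parallel, can be sketched in a few lines. Everything here (`INVENTORY`, `discover`, `run_everywhere`, `restart_web`) is a made-up stand-in. The real tools talk to remote agents over middleware or XMLRPC, while this sketch only filters a local list and uses a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inventory: each node advertises facts about itself.
INVENTORY = [
    {"host": "web1", "facts": {"role": "web", "country": "us"}},
    {"host": "web2", "facts": {"role": "web", "country": "de"}},
    {"host": "db1", "facts": {"role": "db", "country": "us"}},
]


def discover(inventory, **wanted):
    """Select nodes whose facts match every requested key/value pair."""
    return [
        node for node in inventory
        if all(node["facts"].get(key) == value for key, value in wanted.items())
    ]


def run_everywhere(nodes, action):
    """Run `action` against every node in parallel and collect results by host."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        results = list(pool.map(action, nodes))
    return {node["host"]: result for node, result in zip(nodes, results)}


def restart_web(node):
    """Stand-in for a remote action; a real agent would restart a service."""
    return f"restarted httpd on {node['host']}"


if __name__ == "__main__":
    web_nodes = discover(INVENTORY, role="web")
    print(run_everywhere(web_nodes, restart_web))
```

Addressing machines by role or location means the command keeps working as nodes are added or replaced, which matters when cloud instances come and go.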
Logstash: With all your various types of new infrastructure in the cloud and old infrastructure
myCloud: myCloud is an innovative new solution from RightScale, leveraging Cloud.com's private cloud technology, that provides a simple and efficient method for organizations to leverage existing on-premise hardware to build a private cloud infrastructure that can then be managed alongside public cloud resources. As part of the myCloud offering, Cloud.com is delivering a powerful, fully turnkey private cloud solution that delivers pay-as-you-grow economics and can be launched in minutes by leveraging existing on-premise hardware to quickly build a private cloud.
Example Toolchain: The bootstrap image starts. That image connects to the Cobbler provisioning server via koan (kickstart over a network). Upon completion of the install, it kicks off configuration via Puppet, then services are started with RunDeck, and then Zenoss autodiscovers the new infrastructure. If the infrastructure fails, automatic remediation capabilities in Zenoss can call RunDeck to restart services or Puppet to reconfigure the infrastructure.
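The chain described above, where each tool hands its result to the next, can be pictured as a simple pipeline. The sketch below is purely illustrative. Each function is a placeholder named after a stage, and none of them calls Cobbler, Puppet, RunDeck, or Zenoss.

```python
from functools import reduce


def provision(node):
    """Stage 1 placeholder: the operating system gets installed."""
    return {**node, "os_installed": True}


def configure(node):
    """Stage 2 placeholder: configuration is applied to an installed system."""
    if not node.get("os_installed"):
        raise RuntimeError("cannot configure a node without an operating system")
    return {**node, "configured": True}


def start_services(node):
    """Stage 3 placeholder: services are started on the configured system."""
    return {**node, "services": "running"}


def register_monitoring(node):
    """Stage 4 placeholder: monitoring picks up the new node."""
    return {**node, "monitored": True}


TOOLCHAIN = [provision, configure, start_services, register_monitoring]


def run_toolchain(node, steps=TOOLCHAIN):
    """Feed the output of each step into the next one."""
    return reduce(lambda state, step: step(state), steps, node)


if __name__ == "__main__":
    print(run_toolchain({"host": "node1"}))
```

Remediation fits the same shape: a monitoring event can re-enter the chain at `configure` or `start_services` instead of at the beginning.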
Transcript of "Delivering IaaS with Open Source Software"
Delivering Infrastructure-as-a-Service (IaaS) with Open Source Software<br />Mark R. Hinkle<br />Director, Cloud Computing Community<br />Citrix Systems Inc.<br />Twitter: @mrhinkle<br />Email: email@example.com<br />
Agenda<br />Introduction<br />Quick Cloud Computing Overview<br />Open Source Building Blocks for Cloud Computing <br />Open Source Tools for Cloud Management<br />Questions<br />
Mark Hinkle, Director, Cloud Computing Community, Citrix<br /><ul><li>Responsible for Driving Adoption of CloudStack Open Source Cloud Computing Software
Joined Citrix via Cloud.com acquisition July 2011
Former manager of Zenoss Open Source Monitoring project: 100,000 users, 1.5 million downloads
Author - “Windows to Linux Business Desktop Migration” – Thomson
NetDirector Project - Open Source Configuration Management Project
Sometimes Author and Blogger at SocializedSoftware.com/NetworkWorld</li></li></ul><li>Cloud Computing Overview<br />
Cloud Still Requires Architectural Design<br />Cloud Computing isn’t a “magical solution”<br />Need to design your architecture with the end in mind <br />As you build it make your infrastructure easily replicable<br />
Five Characteristics of Clouds<br />On-Demand Self-Service<br />Broad Network Access<br />Resource Pooling<br />Rapid Elasticity<br />Measured Service<br />
Cloud Computing Service Models<br />USER CLOUD a.k.a. SOFTWARE AS A SERVICE<br />Single application, multi-tenancy, network-based, one-to-many delivery of applications, all users have same access to features.<br />Examples: Salesforce.com, Google Docs, Red Hat Network/RHEL<br />DEVELOPMENT CLOUD a.k.a. PLATFORM-AS-A-SERVICE<br />Application developer model, Application deployed to an elastic service that autoscales, low administrative overhead. No concept of virtual machines or operating system. Code it and deploy it. <br />Examples: Google AppEngine, Windows Azure, Rackspace Site, Red Hat Makara<br />SYSTEMS CLOUD a.k.a. INFRASTRUCTURE-AS-A-SERVICE<br />Servers and storage are made available in a scalable way over a network. <br />Examples: EC2, Rackspace CloudFiles, OpenStack, CloudStack, Eucalyptus, Ubuntu Enterprise Cloud, OpenNebula<br />SaaS<br />PaaS<br />IaaS<br />
Building Compute Clouds with Open Source Software<br />
Why Open Source?<br />Typically user-driven solutions to real problems<br />Larger user base, users helping users<br />Lower barrier to participation<br />Aggressive release cycles can stay current with the state-of-the-art<br />Try before you “buy”, no brochureware, no “PowerPoint software”<br />Open data, Open standards, Open APIs<br />
Open Virtual Machine Formats<br />Open Virtualization Format (OVF) is an open standard for packaging and distributing virtual appliances or more generally software to be run in virtual machines. Standardization is still in process. <br />Popular Virtual Formats:<br /><ul><li>Amazon – AMI (Amazon Machine Image)
Open Source Hypervisors<br />Open Source<br />Xen, Xen Cloud Platform (XCP)<br />KVM – Kernel-based Virtualization<br />VirtualBox* - Oracle supported Virtualization Solutions <br />OpenVZ* - Container-based, Similar to Solaris Containers or BSD Jails<br />LXC – User Space chrooted installs<br />Proprietary<br />VMware<br />Citrix XenServer<br />Microsoft Hyper-V<br />Oracle VM (Based on open source Xen)<br />
Open Source Compute Clouds<br />Other open source compute software includes Abiquo, Red Hat’s CloudForms and OpenNebula<br />Numerous companies are building cloud software on OpenStack, including Nebula and Piston Inc.<br />
CloudStack <br />Cloud Compute<br />Multi-Hypervisor Support<br />Robust Web Interface<br />Advanced Networking Capabilities<br />High Availability<br />Multiple-Roles for Admins and Users<br />Extensive API<br />GPL Licensed<br />www.cloudstack.org<br />
OpenStack<br />Three Projects (Compute, Object Storage, Image Service)<br />Rapid Development<br />Next Release Diablo Q3, 2011<br />Large community of developers and partners<br />Numerous channels for commercial support<br />Command Line Interface (CLI)<br />Apache License<br />
Open Source Cloud Computing Storage<br />GlusterFS – Scale Out NAS system aggregating storage over Ethernet or Infiniband<br />Ceph – Distributed file storage system developed by DreamHost to handle data at petabyte scale<br />OpenStack Object Storage (SWIFT) – Long-term object storage system<br />Sheepdog – Distributed storage for KVM hypervisors<br />OpenFiler - Openfiler is a browser-based network storage software distribution to create a Network Attached Storage (NAS) and block-based Storage Area Networking in a single framework<br />NFS – Old standby, tried and true, not designed for cloud scale or performance <br />
Automate, Automate, Automate</li></li></ul><li>What Makes Tools Cloudy?<br />Network Capable<br />Cloud “Aware” <br />Easy-to-Integrate<br />Adhere to Open Standards<br />Lend Themselves to Automation<br />
The Myth of the Nines<br />Average polling interval for monitoring? 5 minutes? <br />Even superhuman operations people can’t be alerted and take action in under 5 minutes. <br />One outage per year could drop service level to three nines or worse. <br />
4 Types of Management Tools<br />Provisioning<br />Installation of operating systems and other software<br />Configuration Management<br />Sets the parameters for servers, can specify installation parameters<br />Orchestration/Automation<br />Automate tasks across systems<br />Monitoring<br />Records errors and health of IT infrastructure<br />
Management Toolchains<br />Toolchain (n):<br />A set of tools where the output of one tool becomes the input of another tool <br />
Open Source Automation/Orchestration Tools<br />
Miscellanea<br />logstash is a tool for managing your logs. <br />It helps you take logs and other event data from your systems and move it into a central place. logstash is open source and completely free. <br />You can get support for logstash via a hosted version from http://loggly.com/<br />myCloud is a free service that allows you to manage up to five virtualized hosts via a hosted version of CloudStack complemented by RightScale, a cloud management company. While this is not open source it is a free <br />