Orchestration:
The Next Frontier for Cloud Applications


         John Willis, Opscode Inc.
         Alex Honor, ControlTier Project
         Mark Hinkle, Zenoss Inc.
         Duncan Johnston-Watt, CloudSoft Corporation
         Damon Edwards, DTO Solutions Inc.
You worried about…
Now you worry about…
How do you make all of that work
    together in the cloud?
Orchestration!
The Path to Orchestration

1. Bring “Dev”, “Ops”, and “Biz” points-of-view
   and practices into alignment
        See also: #DevOps

2. Fully automated infrastructure
        See also: “Infrastructure as Code”
Agenda

                 John Willis (Opscode)

                 Alex Honor (ControlTier)

                 Mark Hinkle (Zenoss)

                 Duncan Johnston-Watt (CloudSoft)


Q&A with Panel   Damon Edwards (DTO Solutions)
                             Moderator
Orchestration and
System Administration



         John Willis
VP of Services - Opscode, Inc.
Infrastructure is Hard
Fully Automated
Infrastructure is Hard!
1999
  Inventory, packaged file transfers and desktops
2005
  Unattended bare metal servers “very very” hard
  7k nodes took 5 days with 90% success
2007
  Unattended bare metal in under 10 minutes
  Fully configured in under 3 mins
2008
  Unattended server in 2 minutes
  5000 servers in a week
2010
   10k Nodes in under 5 minutes
Managing Infrastructure Is Hard (Has Always Been)
Proprietary Solutions (timeline: 1980, 1989, 1999, 2001)
• Solve very little of the problem...
• Reach just a handful of large, enterprise customers
• Require custom implementations with large professional services
• Deployed exclusively on-premise
• Acquired by companies with large consulting organizations (IBM, HP, CA)
Open Source Solutions
Cfengine
Started in 1993 by Mark Burgess. He created a scientific approach to
modeling systems and set a new paradigm for CM: DSL-based,
declarative, abstract, convergent and self-documenting configuration
management.
Puppet
Founded in 2005 by Luke Kanies. Frustrated with Cfengine's syntax and
its ability to adapt to real-world configuration management, he took a
big step forward in making a DSL easier to use for declarative, abstract,
convergent and self-documenting configuration management.
Chef
Founded in 2008 by Adam Jacob, a community leader who had worked with
Puppet on massively scalable, fully automated infrastructures. He saw the
problem as a “systems integration” problem first, with configuration
management as a subcomponent.
Industry Shifts
Infrastructure is changing
• Easier to get (good!)
     ...but harder to manage (bad!)

• Demand is dynamic

• Developers are crucial to Operations

• Web / Cloud services are proliferating
     ...and Enterprise is following along

• Manual configuration no longer a crutch

• Few tools to solve a ubiquitous problem
Core Principles

• System Integration

• Infrastructure as Code

• Infrastructure API

• Community involvement

• Zero touch
Infrastructure as Code
Nodes -- Where recipes are applied

Roles -- Allow you to group together nodes

Cookbooks -- Recipes, Definitions, Attributes, Libraries, Files
   and Templates

Resources -- The basic unit of work in Chef - a resource might
   be a package, file or service

Providers -- A provider takes actions on resources. A node
   decides what provider should be used by default.

Metadata -- Defines cookbook dependencies and additional
   parts.
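
The resource/provider split above can be sketched in a few lines. This is a minimal illustration in Python, not Chef's actual Ruby DSL or API: `FileResource`, `FileProvider` and `converge` are hypothetical names. The resource declares the desired state, and the provider acts only when the system differs from it, which is what makes repeated runs convergent.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class FileResource:
    """Declares desired state: a file at `path` containing exactly `content`."""
    path: str
    content: str


class FileProvider:
    """Hypothetical provider: acts on a FileResource only when needed."""

    def converge(self, resource: FileResource) -> bool:
        """Return True if a change was made, False if already converged."""
        current = None
        if os.path.exists(resource.path):
            with open(resource.path) as f:
                current = f.read()
        if current == resource.content:
            return False  # already in the desired state, so do nothing
        with open(resource.path, "w") as f:
            f.write(resource.content)
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        motd = FileResource(os.path.join(tmp, "motd"), "managed by code\n")
        provider = FileProvider()
        print(provider.converge(motd))  # first run writes the file
        print(provider.converge(motd))  # second run finds nothing to change
```

Running the same declaration twice changes the system once; the second run is a no-op.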
Cookbooks

Distribution

Recipes, Attributes

Assets

Definitions, LWRP,
   Libraries

Metadata
Roles
Load Balancer Example
Orchestration and
     Application Administration



                  Alex Honor
Project Leader, ControlTier Open Source Project
Classic Application Administration Problem




                              Complexity!
                              Changing procedures!
                              Environment differences!
                              Lack of repeatability!
Clouds Make it Worse




                       Complexity!
                       Changing procedures!
                       Environment differences!
                       Lack of repeatability!
                                   +
                       Transient infrastructure!
                       Dynamic scale!
                       Multiple providers!
Command Dispatcher
Command Dispatcher Provides

• Abstraction at several levels
  – Nodes
  – Services
  – Management Procedures


• Sequenced or parallel execution

• Plug-in control modules
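
A rough sketch of those three properties, assuming nothing about how any real dispatcher is built: `dispatch`, `ControlModule` and `echo_module` are hypothetical names. The control module is a plug-in callable, and the same command can be run across nodes in sequence or in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A control module is any callable that runs `command` on `node`
# and returns its output (hypothetical plug-in interface).
ControlModule = Callable[[str, str], str]


def echo_module(node: str, command: str) -> str:
    """Stand-in control module; a real one might connect over SSH."""
    return f"{node}: ran '{command}'"


def dispatch(nodes: List[str], command: str,
             module: ControlModule = echo_module,
             parallel: bool = False) -> Dict[str, str]:
    """Run `command` on every node and return the outputs keyed by node."""
    if not nodes:
        return {}
    if parallel:
        # One worker per node; pool.map preserves the order of `nodes`.
        with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
            outputs = list(pool.map(lambda n: module(n, command), nodes))
    else:
        outputs = [module(n, command) for n in nodes]
    return dict(zip(nodes, outputs))


if __name__ == "__main__":
    print(dispatch(["web1", "web2"], "restart app", parallel=True))
```

Callers name nodes and a command; how the command reaches each node is the module's concern.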
Example: Cluster Management

• Coordinate actions within a larger procedure

• Roll sets of tasks across sets of nodes

• Manage as whole or logical slices
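
Rolling a set of tasks across slices of a cluster might look like the sketch below. It is an illustration only; `slices` and `rolling_run` are hypothetical helpers, and each task is assumed to be a plain callable that takes a node name.

```python
from typing import Callable, Iterator, List, Sequence


def slices(nodes: Sequence[str], size: int) -> Iterator[List[str]]:
    """Split the node list into consecutive slices of at most `size` nodes."""
    if size < 1:
        raise ValueError("slice size must be at least 1")
    for start in range(0, len(nodes), size):
        yield list(nodes[start:start + size])


def rolling_run(nodes: Sequence[str],
                tasks: Sequence[Callable[[str], None]],
                slice_size: int) -> List[List[str]]:
    """Apply every task, in order, to one slice before moving to the next."""
    completed = []
    for batch in slices(nodes, slice_size):
        for task in tasks:
            for node in batch:
                task(node)
        completed.append(batch)
    return completed


if __name__ == "__main__":
    log = []
    stop = lambda node: log.append(f"stop {node}")
    start = lambda node: log.append(f"start {node}")
    rolling_run(["a", "b", "c"], [stop, start], slice_size=2)
    print(log)
```

With a slice size of 2, nodes outside the current slice are left untouched while it is being worked on, so part of the cluster stays in service throughout.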
Example: Scale Differences




                        Wednesday 04:00

                        Wednesday 11:00
Example: Self-Service
Command Dispatcher Projects

Example command dispatchers…

      • Capistrano (capify.org)

      • Fabric (fabfile.org)

      • Func (fedorahosted.org/func)
Command Dispatcher Projects

Example command dispatchers (cont’d)…

      • ControlTier (controltier.org)
        –   Workflow system on top of dispatcher
        –   Web-based GUI and command line tools
        –   Fine-grain access controls
        –   Logging and reporting framework
        –   Integrated with CMDB
Orchestration and
     Monitoring



        Mark Hinkle
VP of Community, Zenoss Inc.
Legacy IT




                                                                                     Different perspective, lack of coordination


Cartoon originally copyrighted by the authors; G. Renee Guzlas, artist
Legacy Monitoring Perspective

Types of Monitoring
•   Availability Monitoring – Binary, Moment in Time
•   Performance Monitoring – Two Dimensions, Time and State
•   Change Management – Comparisons of States in Time
•   Event Management – Normalizing Randomness
•   Synthetic Transactions – Simulated Experiences
•   Business Service Management (BSM) – $$$ Consequences of IT Performance

Data Collection
•   SNMP
•   SSH
•   WMI
•   Syslog
•   Proprietary Agents
The Myth of the Nines

Availability %            Downtime per Year    Downtime per Month    Downtime per Week

99.9% (three nines)       8.76 hours           43.2 minutes          10.1 minutes
99.95%                    4.38 hours           21.6 minutes          5.04 minutes
99.99% (four nines)       52.6 minutes         4.32 minutes          1.01 minutes
99.999% (five nines)      5.26 minutes         25.9 seconds          6.05 seconds
99.9999% (six nines)      31.5 seconds         2.59 seconds          0.605 seconds

    • Average polling interval for monitoring? 5 minutes?
    • Even superhuman operations people can’t be alerted and take action in under 5 minutes.
    • One outage per year could drop service level to three nines.
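
The downtime budgets follow directly from the availability percentage. The sketch below assumes a 365-day year, a 30-day month and a 7-day week; `allowed_downtime` is a hypothetical helper name.

```python
# Assumed period lengths in seconds: 365-day year, 30-day month, 7-day week.
SECONDS_PER = {
    "year": 365 * 24 * 3600,
    "month": 30 * 24 * 3600,
    "week": 7 * 24 * 3600,
}


def allowed_downtime(availability_percent: float) -> dict:
    """Return the permitted downtime in seconds for each period."""
    unavailable = 1.0 - availability_percent / 100.0
    return {period: seconds * unavailable
            for period, seconds in SECONDS_PER.items()}


if __name__ == "__main__":
    for nines in (99.9, 99.95, 99.99, 99.999, 99.9999):
        budget = allowed_downtime(nines)
        print(nines, {k: round(v, 2) for k, v in budget.items()})
```

Five nines leaves roughly 315 seconds a year, about the length of a single 5-minute polling interval, which is the point of the bullets above.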
Legacy Systems Management:
                        Fragmented Awareness

Global dashboard is a difficult mash-up of disparate systems or doesn’t exist.
No communication, no automation.

[Diagram: three separate management disciplines (Provisioning, Configuration
Management, Performance & Availability Management), built from separate
process servers, databases, a configuration database, analytics servers and agents]

 •   Multiple data models across disciplines with no common object model
 •   Each management discipline has its own separate product (UI, process,
     database, and domain specific language)
 •   Multiple agents required for each discipline and platform
Unlegacy Systems Management:
  Integrated Model, Interactive, Automated

[Diagram: two stacks, each layered Application / Op. System / Virtual Machine,
on top of a shared Physical/Virtual/Cloud Infrastructure layer]
Example – Broadcast Company

A large premium television content provider serves a national cable network with
content served from Linux servers.

 •   Servers are automatically built using configuration
     management software
 •   As servers are brought into service, configuration
     management inserts hosts into the CMDB used by
     the monitoring database
 •   One-way interaction between configuration
     management and monitoring system
 •   Reports are generated to determine which
     systems are compliant
Example - Geeknet

Hundreds of servers, serving web, databases, and other infrastructure for some
of the world’s most highly trafficked websites, with over 40 million visitors per
month.

 •   Servers are automatically built using configuration
     management software
 •   Discovery tool finds infrastructure and populates a
     CMDB, then passes the information to scripts that
     translate it into BIND configurations for DNS
 •   Monitoring tool adds hosts to polling tool to start
     monitoring servers for availability
 •   As infrastructure changes, systems are updated
     automatically
 •   Servers can be spun up and managed automatically
     in minutes, not hours, with little or no human
     interaction
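
The CMDB-to-DNS step could be as small as the sketch below. The record shape (`name`, `ip`) and the helper `bind_a_records` are assumptions made for illustration, not Geeknet's actual scripts, and the output is only a fragment of A records, not a complete zone file (no SOA or NS records).

```python
from typing import Dict, Iterable


def bind_a_records(hosts: Iterable[Dict[str, str]], ttl: int = 300) -> str:
    """Render CMDB host records as BIND-style A records, sorted by name."""
    lines = [f"$TTL {ttl}"]
    for host in sorted(hosts, key=lambda h: h["name"]):
        lines.append(f"{host['name']}\tIN\tA\t{host['ip']}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical CMDB export; addresses come from the documentation range.
    cmdb = [
        {"name": "web2", "ip": "192.0.2.11"},
        {"name": "web1", "ip": "192.0.2.10"},
    ]
    print(bind_a_records(cmdb), end="")
```

Regenerating the fragment whenever the CMDB changes keeps DNS in step with the infrastructure without hand edits.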
Unlegacy Future: Devops


    Development   Operations
Orchestral Manoeuvres in
       the Cloud



    Duncan Johnston-Watt
  CEO, Cloudsoft Corporation
The Application Mobility Manifesto
• Application mobility is the ability to …
   – Dynamically change some or all of the infrastructure that an
     application is using without any disruption of service

   – Optimize the location of application components in the cloud

   – Bridge the gap between your private cloud and trusted
     third-party cloud service providers

• Application mobility is achieved by orchestrating the
  cloud

• Application mobility is the “Missing Link” in Cloud
  Computing
Demo: Application Mobility in Action

• EzBrokerage is implemented using CloudSoft’s
  Monterey middleware platform

• EzBrokerage benefits from two complementary
  policies
  – Workload policy: ensures the service is adequately resourced
    based on server demand by managing the size of a pool and
    distribution of workload across it

  – Geolocation policy: ensures the service is hosted in the right
    region based on client demand by managing the overall
    distribution of workload across multiple resource pools or
    clouds
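
The two policies can be sketched as simple decision functions. This is not Monterey's API: `desired_pool_size`, `preferred_region`, and the capacity and limit parameters are hypothetical, chosen only to show the shape of each decision.

```python
import math
from typing import Dict


def desired_pool_size(demand: float, capacity_per_server: float,
                      minimum: int = 1, maximum: int = 10) -> int:
    """Workload policy sketch: enough servers for demand, within pool limits."""
    needed = math.ceil(demand / capacity_per_server)
    return max(minimum, min(maximum, needed))


def preferred_region(client_demand: Dict[str, float]) -> str:
    """Geolocation policy sketch: host where client demand is highest."""
    return max(client_demand, key=client_demand.get)


if __name__ == "__main__":
    print(desired_pool_size(demand=450, capacity_per_server=100))
    print(preferred_region({"us-east": 120.0, "eu-west": 300.0}))
```

A real policy would also need to smooth out short demand spikes so the pool does not resize or relocate on every fluctuation.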
John Willis (opscode.com)
    @botchagalupe


Alex Honor (controltier.org)
    @alexhonor


Mark Hinkle (zenoss.com)
    @mrhinkle


Duncan Johnston-Watt (cloudsoftcorp.com)
    @duncanjw



Damon Edwards (dtosolutions.com)
    @damonedwards

Orchestration Panel at Cloud Connect 2010
