Orchestrating Clusters
with Ironfan and Chef
    Robert J. Berger - CTO Runa, Inc.
          rberger@runa.com
Hassles of Big Data Stack
      Deployments
• Lots of Moving Parts
 •   Hadoop/HBase just one sub-system
 •   Heterogeneous Tech
 •   Monitoring & Metrics Everywhere
 •   Details obscure the big picture
• Need repeatability & variations on themes
The Forest for the Trees
Forest
[Diagram: the whole system as nine top-level clusters]
• Dashboard, App Servers, Monitoring
• Elastic Search, AMQP, Hadoop M/R
• Session Store, HBase, MySQL
Trees
[Diagram: each cluster expanded into its server instances]
• MySQL: Master, multiple Slaves
• Monitoring: GDash, Statsd, Sensu, Logstash, Ganglia, Graphite, Web
• Hadoop M/R: Master, Sec Master, multiple Slaves
• App Servers: ELB, Web (x2), Clj App (x3)
• AMQP: Rabbit (x2)
• HBase: HB Master (x2), ZooKeeper (x3), multiple Regionsrvr
• Elastic Search: ES Server (x4)
• Dashboard: Web, Rails App (x2)
• Session Store: Redis (x3)
Leaves
• Dashboard: Nginx, Reverse Proxy, Unicorn, Rails, Dashboard App, Java, Postfix, Cron jobs, Sensu client, Sensu plugins, MySQL Client, Upstart Config, Logstash Client
• App Servers: Elastic Load Balancer, Nginx, Reverse Proxy, Swarmiji, Java, Leiningen, Jark, Clojure Apps, HBase Client, Postfix, Cron jobs, Sensu client, Sensu plugins, MySQL Client, Upstart Config, Logstash Client
• Monitoring: Ganglia Server, Ganglia Web, Statsd Server, Graphite Server, Graphite Web, Python, Logstash, Nginx, Reverse Proxy, Java, Leiningen, Jark, Postfix, Cron jobs, Sensu Server, Sensu Web, Sensu client, Sensu plugins, MySQL Client, Upstart Config, Logstash Client
• Elastic Search: Elastic Search Server, Java, Cron jobs, Sensu client, Sensu plugins, Upstart Config
• AMQP: RabbitMQ, RabbitMQ Plugins, Cluster Config, Erlang, Cron jobs, Sensu client, Sensu plugins, Upstart Config
• Hadoop M/R: Namenode, Secondary Namenode, Tasktracker, Jobtrackers, Bootstrap Namenode, Java, JMX, Cron jobs, Sensu client, Sensu plugins, Ganglia Client, Upstart Config
• Session Store: Redis, Cron jobs, Sensu client, Sensu plugins, Upstart Config
• HBase: Zookeeper, HBase Master, Regionserver, Namenode, Secondary Namenode, Tasktracker, Jobtrackers, Datanodes, Bootstrap Namenode, Java, JMX, Cron jobs, Sensu client, Sensu plugins, Ganglia Client, Upstart Config
• MySQL: MySQL Master, MySQL Slaves, Cluster Setup, Cron jobs, Sensu client, Sensu plugins, Upstart Config
Molecules
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://ip-10-17-57-58.ec2.internal:8020/hadoop/hbase</value>
    <description>The directory shared by region servers.
    Should be fully-qualified to include the filesystem to use.
    E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
    </description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed Zookeeper
      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
    </description>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master0-cluster0.runa.com,regionserver0-cluster0.runa.com,regionserver1-cluster0.runa.com</value>
    <description>Comma separated list of servers in the ZooKeeper Quorum.
    For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
    By default this is set to localhost for local and pseudo-distributed modes
    of operation. For a fully-distributed setup, this should be set to a full
    list of ZooKeeper quorum servers. If HBASE_MANAGES_ZK is set in hbase-env.sh
    this is the list of servers which we will start/stop ZooKeeper on.
    </description>
  </property>
</configuration>
Config Management:
   Leaves & Molecules

• Chef, Puppet, Cfengine
• Much better than shell scripts or CLI jocks
• Infrastructure as code
• Still No Forest Perspective
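
To make "infrastructure as code" concrete at the molecule level, here is a minimal sketch of rendering hbase-site.xml properties from a data structure with Ruby's standard ERB library. Chef cookbooks typically do this with template resources; the hash below and the `render_site_xml` helper are hypothetical names for illustration, and the values are the placeholder examples from the property descriptions rather than real hosts.

```ruby
require "erb"

# Hypothetical attribute hash; in Chef these values would normally come from
# node attributes. Values are the placeholders used in the descriptions above.
hbase_site = {
  "hbase.rootdir"             => "hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR",
  "hbase.cluster.distributed" => "true",
  "hbase.zookeeper.quorum"    => %w[host1.mydomain.com host2.mydomain.com host3.mydomain.com].join(",")
}

# One <property> element per key/value pair. No XML escaping is done here,
# so this sketch assumes the values contain no special characters.
TEMPLATE = <<~XML
  <configuration>
  <% properties.each do |name, value| -%>
    <property>
      <name><%= name %></name>
      <value><%= value %></value>
    </property>
  <% end -%>
  </configuration>
XML

def render_site_xml(properties)
  ERB.new(TEMPLATE, trim_mode: "-").result_with_hash(properties: properties)
end

puts render_site_xml(hbase_site)
```

The point of the sketch is that the XML becomes a derived artifact: change the hash, re-render, and every node gets a consistent file.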
Ironfan: Forest, Trees,
  Leaves and Molecules
• Builds on top of Chef
• Cluster Description in Single File
  •   Your System Diagram Come to Life
• Components announce capabilities
  •   Service Discovery automates interconnects
• Knife (CLI) extension controls cluster and
  component life-cycles
Basic Chef
[Diagram: Basic Chef workflow]
• Chef Development Host: Community + Own Cookbooks
• Chef Knife: Upload Cookbooks to Chef Server; Launch Instances / Bootstrap Servers
• Chef Server: Search, Auth & ACLs, Nodes Attributes, Roles / Cookbooks, Data Bags
• Nodes (N): VMs and/or Servers Running Chef-Client
Chef + Ironfan
[Diagram: Chef workflow with Ironfan added]
• Chef Development Host: Ironfan Pantry + Community + Own Cookbooks
• Chef Knife + Ironfan Gem: Launch / Bootstrap / Manage Whole Clusters, All Facets or Specific Instances
• Chef Server: Search / Discovery, Auth & ACLs, Nodes Attributes / Discovery, Roles / Cookbooks, Data Bags
• Nodes (N): VMs and/or Servers Running Chef-Client
Ironfan Components
• Ironfan Gem:
  •   Knife Plugins to orchestrate clusters
  •   Logic to sync Chef Server & Cloud[s]
• Silverware: Coordinate Discovery of Services
• Ironfan-Homebase: Ironfan tuned Chef-Repo
• Ironfan-Pantry: Cookbooks tuned for
  Clusters
• Ironfan-CI: Testing of Ironfan clusters and
  Cookbooks
Cluster Config: Forest View
  # Global Cloud & Cluster Configs
  ClusterChef.cluster 'base0-cluster0' do
    setup_role_implications
    cloud :ec2 do
      region              'us-east-1'
      availability_zones  ['us-east-1b']
      backing             'ebs'
      image_name          'natty'
      security_group(cluster_root) do
        authorize_port_range(22)
      end
    end

    # Roles & Recipes are the “Leaves”
    role                 "base_role"
    role                 "chef_client"
    role                 "base0-cluster0"
    role                 "production"
    role                 "runastack"
    role                 "ebs_volumes_attach"
    role                 "ebs_volumes_mount"

    # Facets are the “Trees”
    facet 'master' do
      instances           1
      cloud.image_id      'ami-93c31afa'
      cloud.flavor        "cc1.4xlarge"

      role                "big_package"
      role                "hadoop_master"
      role                "hbase_master"
      recipe              "cluster_chef::cluster_webfront"
      recipe              "hbase::utils"
      recipe              "route53::runa"
      role                "monitored_client"
    end
Global Cloud & Recipe Configs
  ClusterChef.cluster 'base0-cluster0' do      # Cluster Name
    setup_role_implications

    # Cloud Configs
    cloud :ec2 do
      region              'us-east-1'
      availability_zones  ['us-east-1b']
      backing             'ebs'
      image_name          'natty'
      # Configure Security Group
      security_group(cluster_root) do
        authorize_port_range(22)
      end
    end

    # Shared Roles
    role                 "base_role"
    role                 "chef_client"
    role                 "base0-cluster0"
    role                 "production"
    role                 "runastack"
    role                 "ebs_volumes_attach"
    role                 "ebs_volumes_mount"
Facets add Specifics
    facet 'master' do                      # Facet Name
      instances           1                # Number of Copies
      # Cloud Overrides
      cloud.image_id      'ami-93c31afa'
      cloud.flavor        "cc1.4xlarge"

      # Facet Roles & Recipes
      role                "hadoop_master"
      role                "hbase_master"
      recipe              "cluster_chef::cluster_webfront"
      recipe              "hbase::utils"
      recipe              "route53::runa"
      role                "monitored_client"
    end

    facet 'regionserver' do                # Facet Name
      instances           7                # Number of Copies
      # Cloud Overrides
      cloud.image_id      'ami-93c31afa'
      cloud.flavor        "cc1.4xlarge"

      # Facet Roles & Recipes
      role                "hadoop_slave"
      role                "hbase_regionserver"
      recipe              "hbase::utils"
      recipe              "route53::runa"
      role                "monitored_client"
      # Make one instance special
      server 0 do
        role   "zookeeper_server"
      end
    end
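
To show how a cluster file with facets and instance counts can expand into concrete servers, here is a toy model in plain Ruby. It is not Ironfan's implementation: `ToyCluster`, `ToyFacet`, and `toy_cluster` are hypothetical names, only `role`, `facet`, and `instances` are mimicked, and the `cluster-facet-index` node naming is an assumption made for the sketch.

```ruby
# Toy model only: NOT Ironfan's implementation. It shows how facets and
# instance counts can expand into one entry per server.
class ToyFacet
  attr_reader :name, :roles

  def initialize(name)
    @name = name
    @count = 1
    @roles = []
  end

  # With an argument, set the instance count; without one, read it.
  def instances(count = nil)
    @count = count if count
    @count
  end

  def role(role_name)
    @roles << role_name
  end
end

class ToyCluster
  attr_reader :name, :facets, :roles

  def initialize(name)
    @name = name
    @facets = []
    @roles = []
  end

  def role(role_name)
    @roles << role_name
  end

  def facet(facet_name, &block)
    new_facet = ToyFacet.new(facet_name)
    new_facet.instance_eval(&block)
    @facets << new_facet
    new_facet
  end

  # One hash per instance: cluster-wide roles first, then facet roles.
  # The "cluster-facet-index" name format is an assumption of this sketch.
  def servers
    facets.flat_map do |f|
      (0...f.instances).map do |index|
        { name: "#{name}-#{f.name}-#{index}", run_list: roles + f.roles }
      end
    end
  end
end

def toy_cluster(name, &block)
  cluster = ToyCluster.new(name)
  cluster.instance_eval(&block)
  cluster
end

cluster = toy_cluster("base0-cluster0") do
  role "base_role"
  facet "master" do
    instances 1
    role "hbase_master"
  end
  facet "regionserver" do
    instances 7
    role "hbase_regionserver"
  end
end

cluster.servers.each { |s| puts "#{s[:name]}: #{s[:run_list].join(', ')}" }
```

The design choice worth noting is that shared roles live once at the cluster level and are merged into every server, so a facet only has to state what makes it different.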
Facets Composed of
     Components
• Components are Services
  • Nginx, MySQL server, Zookeeper, HBMaster,
    Namenode, etc.
• Chef Cookbooks manage components
• Ironfan Silverware for service discovery
  • Auto-Connects components together
Silverware Service Discovery
• A recipe that creates a service announces it:
  announce(:hadoop, :namenode)

• A recipe that requires a service discovers it:
  hbase_config = Mash.new({
    :namenode_fqdn   => discover(:hadoop, :namenode).private_hostname,
    :jobtracker_addr => discover(:hadoop, :jobtracker).private_ip,
    :zookeeper_addrs => discover_all(:zookeeper, :server).map(&:private_ip).sort,
    :ganglia         => discover(:ganglia, :server),
    :ganglia_addr    => discover(:ganglia, :server).private_hostname,
    :private_ip      => private_ip_of(node)
  })
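
The announce/discover pattern can be sketched with a small in-memory registry. This is a toy, not Silverware's implementation (the slides do not show how Silverware stores announcements); `ToyRegistry`, `Announcement`, and all IPs and hostnames below are hypothetical.

```ruby
# Toy in-memory registry, NOT Silverware's implementation.
Announcement = Struct.new(:system, :component, :private_ip, :private_hostname)

class ToyRegistry
  def initialize
    @announcements = []
  end

  # A provider records that it offers system/component at an address.
  def announce(system, component, private_ip:, private_hostname:)
    @announcements << Announcement.new(system, component, private_ip, private_hostname)
  end

  # Every provider of a system/component, in announcement order.
  def discover_all(system, component)
    @announcements.select { |a| a.system == system && a.component == component }
  end

  # A single provider; in this sketch the most recent announcement wins.
  def discover(system, component)
    discover_all(system, component).last ||
      raise(KeyError, "no #{system}/#{component} announced")
  end
end

registry = ToyRegistry.new
registry.announce(:hadoop, :namenode, private_ip: "10.0.0.1", private_hostname: "nn.example.internal")
%w[10.0.0.13 10.0.0.11 10.0.0.12].each_with_index do |ip, i|
  registry.announce(:zookeeper, :server, private_ip: ip, private_hostname: "zk#{i}.example.internal")
end

# The consumer asks for roles, never for hard-coded hosts.
config = {
  namenode_fqdn:   registry.discover(:hadoop, :namenode).private_hostname,
  zookeeper_addrs: registry.discover_all(:zookeeper, :server).map(&:private_ip).sort
}
puts config
```

Because consumers look services up by name, moving the namenode to another machine means a new announcement rather than edits to every dependent recipe.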
Aspects enable Zeroconf
       Amenities
• A log aspect would enable the following amenities
  • logrotated to manage its logs
  • flume to archive logs to a location
• A port aspect would enable
  • Configuration of firewall
  • Monitoring of port uptime & latency
  • Remote checks that firewalled ports do NOT respond
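
As a rough illustration of how declared aspects could drive amenities, the sketch below maps each kind of aspect to the follow-on tasks listed above. The `Aspect` struct, the `AMENITIES` table, and the example path and port are all hypothetical; Ironfan's actual aspect classes are not shown in these slides.

```ruby
# Hypothetical aspect record: what kind it is, which component owns it,
# and the detail (a path for logs, a number for ports).
Aspect = Struct.new(:kind, :name, :detail)

# Each kind of aspect implies the amenities named on the slide.
AMENITIES = {
  log:  ["rotate logs", "archive logs"],
  port: ["configure firewall",
         "monitor port uptime and latency",
         "check that firewalled ports do not respond"]
}.freeze

# Expand declared aspects into a flat list of amenity tasks.
def amenities_for(aspects)
  aspects.flat_map do |aspect|
    AMENITIES.fetch(aspect.kind, []).map do |amenity|
      "#{amenity} for #{aspect.name} (#{aspect.detail})"
    end
  end
end

# Illustrative values only.
aspects = [
  Aspect.new(:log,  "hbase master log",    "/var/log/hbase"),
  Aspect.new(:port, "hbase master web UI", 60010)
]

puts amenities_for(aspects)
```

A component only declares what it has; the list of things to set up around it is derived, which is the sense in which the amenities are "zeroconf".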
Knife Cluster:
 Lifecycle Management
• A Plugin for Opscode Chef Knife
• Deployment & Lifecycle Operations:
  • launch, bootstrap, kill, start, stop
• Access and Chef Operations
  • ssh, kick, proxy
• Utilities
  • show, sync
Knife Launch Cluster,
       Facet or Instance[s]
• Launch all the nodes in a cluster
  knife cluster launch base0-master0

• Launch just a single instance of a single facet
  knife cluster launch base0-master0 master 0

• Launch all the instances of a facet
  knife cluster launch base0-master0 regionserver
Stop/Start Cluster, Facet
        or Instance[s]
• Stop whole cluster
  knife cluster stop base0-master0

• Stop a single instance of a single facet
  knife cluster stop base0-master0 master 0

• Stop all instances of a facet
  knife cluster stop base0-master0 regionserver
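
The launch and stop commands share one targeting scheme: a cluster name, then an optional facet, then an optional instance index. The sketch below models that narrowing in plain Ruby. `Server` and `select_servers` are hypothetical helpers for illustration, not part of the Ironfan knife plugin, and the server list is made up to match the facet sizes shown earlier.

```ruby
# Hypothetical server record for this sketch.
Server = Struct.new(:cluster, :facet, :index)

# Narrow a server list by cluster, then optionally by facet and index,
# mirroring the "CLUSTER [FACET [INDEX]]" arguments of the commands above.
def select_servers(servers, cluster, facet = nil, index = nil)
  servers.select do |s|
    s.cluster == cluster &&
      (facet.nil? || s.facet == facet) &&
      (index.nil? || s.index == Integer(index))
  end
end

# One master and seven regionservers, as in the facet definitions.
servers = [Server.new("base0-master0", "master", 0)] +
          (0...7).map { |i| Server.new("base0-master0", "regionserver", i) }

whole_cluster = select_servers(servers, "base0-master0")
one_instance  = select_servers(servers, *%w[base0-master0 master 0])
one_facet     = select_servers(servers, *%w[base0-master0 regionserver])

puts "cluster: #{whole_cluster.size}, instance: #{one_instance.size}, facet: #{one_facet.size}"
```

Arguments arrive as strings on a command line, which is why the index is converted with `Integer` before comparison.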
Same Knife Command to
   Launch Vagrant[s]

• Can use the same cluster configurations
  and knife command to launch Vagrants
  knife cluster vagrant up base0-master0

• Still Experimental
Ironfan-CI
• Jenkins based Continuous Integration of
  Clusters
• Still Experimental
• Uses Discovery to automate baseline test
  creation
• Leverages Vagrant to create clean test
  environments
References
•   Basic Chef Stuff:
    http://wiki.opscode.com/display/chef/Home

•   Ironfan Screencast:
    http://vimeo.com/37279372
•   Ironfan Wiki for the most complete info:
    https://github.com/infochimps-labs/ironfan/wiki

•   The Forest for the Trees Photo - Ame Otoko
    http://www.flickr.com/photos/ameotoko/5383225925/

HBaseCon 2012 | Orchestrating Clusters with Ironfan and Chef - Runa
