LAB: Configuration                                                             July 2012



LAB: Planning
Goal
In these exercises you will design a MapR HA cluster by planning the service layout.
This is a theoretical exercise; at the end you will review other students' MapR
HA cluster designs.

Exercise: Planning: Review Service Layout in a small cluster
       Planning involves deciding which services to run on which nodes in the cluster.
           • The majority of nodes are worker nodes, which run the TaskTracker
             and MapR-FS services for data processing.
           • A few nodes run control services that manage the cluster and
             coordinate MapReduce jobs.

       Here is an example of the control service layout on a small cluster:




       Most clusters run the CLDB on three nodes and ZooKeeper on three or five
       nodes. For details and sample configurations, see Planning the Deployment.
       The following guidelines will help you refine the service layout.

            • ZooKeeper and CLDB
              If possible, avoid running ZooKeeper and CLDB on the same node
              (especially in clusters over 100 nodes).
            • TaskTracker on CLDB or ZooKeeper nodes
              If you must run TaskTracker on CLDB or ZooKeeper nodes, reduce the
              number of task slots by half (see Tuning MapReduce).



© 2012 by MapR Technologies                                                   Page 1 of 3


            • JobTracker on CLDB
              Avoid running the JobTracker on the master CLDB node. You can
              configure the JobTracker and CLDB on the same nodes as long as the
              running JobTracker and the master CLDB do not reside on the same
              node; if failover causes the JobTracker and CLDB to run on the same
              node, rectify the problem as quickly as practical.
            • JobTracker on ZooKeeper nodes
              Avoid running the JobTracker on nodes that are running ZooKeeper.
            • Set up a pool of machines for failover
              Maintain additional machines for failover; these machines can be
              normal nodes running the TaskTracker until they are needed.
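
       The placement guidelines above can be checked mechanically once a layout is
       written down. The following is a minimal sketch, not a MapR tool: the function
       name check_layout, the dictionary format, and the node names are all
       hypothetical, and the service abbreviations (CLDB, ZK, JT, TT, FS) are the ones
       used on the planning sheet. It only reports guideline violations; it does not
       change any configuration.

```python
# Sketch only: the layout format and names below are illustrative assumptions,
# not part of any MapR API.

def check_layout(layout, master_cldb, active_jobtracker):
    """Return warnings for placement guidelines that the layout does not follow.

    layout            -- dict mapping node name to a set of service abbreviations
    master_cldb       -- name of the node currently holding the master CLDB
    active_jobtracker -- name of the node where the JobTracker is running
    """
    warnings = []
    for node, services in sorted(layout.items()):
        if "ZK" in services and "CLDB" in services:
            warnings.append(f"{node}: ZooKeeper and CLDB share a node")
        if "TT" in services and services & {"CLDB", "ZK"}:
            warnings.append(
                f"{node}: TaskTracker on a CLDB or ZooKeeper node; "
                "reduce its task slots by half"
            )
        if "JT" in services and "ZK" in services:
            warnings.append(f"{node}: JobTracker on a ZooKeeper node")
    if master_cldb == active_jobtracker:
        warnings.append(
            f"{master_cldb}: running JobTracker and master CLDB on the same node"
        )
    return warnings


# Hypothetical four-node layout used to exercise the checks.
example_layout = {
    "node1": {"CLDB", "FS", "TT"},
    "node2": {"ZK", "JT", "FS"},
    "node3": {"ZK", "CLDB", "FS"},
    "node4": {"FS", "TT"},
}

for warning in check_layout(example_layout, "node1", "node2"):
    print(warning)
```

       Passing the master CLDB and active JobTracker as separate arguments reflects
       the guideline that both services may be installed on the same nodes as long as
       the running instances do not coincide.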

Exercise: Planning: Design Service Layout for HA Cluster
      You are going to design your own HA cluster and review it with other students.
      The instructor will give you a planning sheet to record your cluster design.
      Draw the racks and name them Rack1, Rack2, … RackN, and label the
      services per the following example:




      Lay out the services using the following table as a guideline for the number of
      instances of each service to run in the cluster:







    Service                    Package                    How Many

    CLDB (CLDB)                mapr-cldb                  1-3

    FileServer (FS)            mapr-fileserver            Most or all nodes

    HBase Master (HM)          mapr-hbase-master          1-3

    HBase RegionServer (HR)    mapr-hbase-regionserver    Varies

    JobTracker (JT)            mapr-jobtracker            1-3

    NFS                        mapr-nfs                   Varies

    TaskTracker (TT)           mapr-tasktracker           Most or all nodes

    WebServer (WS)             mapr-webserver             One or more

    ZooKeeper (ZK)             mapr-zookeeper             1, 3, 5, or a higher odd number
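
    The instance counts in the table can be checked against a planned layout in the
    same way. This is a hedged sketch: the names LIMITS, check_counts, and
    zk_failures_tolerated are hypothetical, the layout format (node name mapped to
    a set of service abbreviations) is an assumption, and every checked row is
    treated as mandatory, so remove the HM entry if your design does not run HBase.
    Rows listed as "Varies" or "Most or all nodes" are not checked. The last
    function relies on the fact that a ZooKeeper ensemble needs a strict majority
    of its servers to stay available, which is why odd ensemble sizes are
    recommended: an even-sized ensemble tolerates no more failures than the odd
    size below it.

```python
from collections import Counter

# Allowed (minimum, maximum) instance counts, transcribed from the table.
# None means no upper bound ("One or more").
LIMITS = {
    "CLDB": (1, 3),
    "HM": (1, 3),
    "JT": (1, 3),
    "WS": (1, None),
}


def check_counts(layout):
    """Return a list of problems with the number of instances per service."""
    counts = Counter()
    for services in layout.values():
        counts.update(services)

    problems = []
    for service, (low, high) in LIMITS.items():
        n = counts[service]
        if n < low or (high is not None and n > high):
            problems.append(f"{service}: {n} instances is outside the guideline")

    # ZooKeeper: 1, 3, 5, or a higher odd number.
    zk = counts["ZK"]
    if zk < 1 or zk % 2 == 0:
        problems.append(f"ZK: {zk} instances; use 1, 3, 5, or a higher odd number")
    return problems


def zk_failures_tolerated(ensemble_size):
    """Number of ZooKeeper servers that can fail while a majority remains."""
    return (ensemble_size - 1) // 2


# Hypothetical two-rack design with an even ZooKeeper count.
planned = {
    "rack1-node1": {"CLDB", "FS", "WS"},
    "rack1-node2": {"ZK", "FS", "HM"},
    "rack2-node1": {"CLDB", "FS", "JT"},
    "rack2-node2": {"ZK", "FS", "TT"},
}

for problem in check_counts(planned):
    print(problem)
```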



