SlideShare a Scribd company logo
1 of 24
Planning a Cluster
  7/6/2012

© 2012 MapR Technologies   Planning a Cluster 1
Planning a Cluster
   Agenda
   • Hardware Requirements
   • Hardware Recommendations
   • Operating System Requirements
   • Node Configuration
   • Service Layout
   • HA Cluster Design
   • LAB: Planning



© 2012 MapR Technologies        Planning a Cluster 2
Planning a Cluster
   Objectives
   At the end of this module you will be able to:
   • List the minimum hardware and software requirements for MapR
   • List the recommended hardware configuration
   • Describe a typical 50 and 100 TB rack configuration
   • Explain how MapR services are arranged on a cluster
   • Identify the important considerations of HA cluster design




© 2012 MapR Technologies        Planning a Cluster 3
Hardware Requirements




© 2012 MapR Technologies   Planning a Cluster 4
Hardware Requirements
   Minimum Hardware requirements:

   64-bit processor(s)
   4GB DRAM
   1 1GbE network interface
   One free un-mounted drive or partition
       100GB
   >=10GB free space on OS partition
   Swap space = 2X RAM
       or set overcommit_memory to 1.




© 2012 MapR Technologies   Planning a Cluster 5
Hardware
                           Recommendations



© 2012 MapR Technologies      Planning a Cluster 6
Hardware Recommendations
   Recommended Hardware configuration

   64-bit processor with 8-12 cores
   32G DRAM or more
   2 GigE network interfaces
   3-12 disks of 1-3 TB each
   At least 20 GB of free space on OS partition
   32 GB swap space or more




© 2012 MapR Technologies   Planning a Cluster 7
Hardware Recommendations
   Typical Compute/Storage Node

   2U Chassis
   Single motherboard, dual socket
   2 x 4-core + 32 GB RAM or 2 x 6-core + 48 GB RAM
   12 x 2 TB 7200-RPM drives
   2 - 4 1GbE network interfaces
    (on-board NIC + additional NIC)
   OS on single partition on one drive
    (remainder of drive used for storage)




© 2012 MapR Technologies   Planning a Cluster 8
Hardware Recommendations
   Typical 50TB Rack Configuration

   10 typical compute/storage nodes
    (10 x 12 x 2 TB storage; 3x replication, 25% margin)
   24-port 1 Gb/s rack-top switch with 2 x 10Gb/s uplink
   Add second switch if each node uses 4 network interfaces




© 2012 MapR Technologies   Planning a Cluster 9
Hardware Recommendations
   Typical 100TB Rack Configuration

   20 typical nodes
    (20 x 12 x 2 TB storage; 3x replication, 25% margin)
   48-port 1 Gb/s rack-top switch with 4 x 10Gb/s uplink
   Add second switch if each node uses 4 network interfaces




© 2012 MapR Technologies   Planning a Cluster 10
Notes on Rack Configuration



     Plenty of bandwidth! –match to disk I/O
     Multiple NICs per node
     Make sure switches can handle throughput




© 2012 MapR Technologies   Planning a Cluster 11
Operating System
                            Requirements



© 2012 MapR Technologies      Planning a Cluster 12
Operating System Requirements
   OS requirements

   64-bit CentOS 5.4 or greater
   64-bit Red Hat 5.4 or greater
   64-bit Ubuntu 9.04 or greater
   64-bit SUSE Enterprise 11.0 or greater




© 2012 MapR Technologies   Planning a Cluster 13
Node Configuration




© 2012 MapR Technologies       Planning a Cluster 14
Node Configuration
   Each node must have:

   Unique hostname
   Forward/reverse name resolution with all other nodes
   MapR-specific user




© 2012 MapR Technologies    Planning a Cluster 15
Node Configuration

     Hardware
      –   If RAID used for MapR disks, use –w 1 with disksetup
     Software and operating system
      –   Java
     Networking
      –   Keyless ssh
      –   Forward and reverse hostname resolution between all nodes
     Configuration
      –   Same users/groups on all nodes, or integration with LDAP (via PAM)
          •   Common unified authentication
      –   NTP – all nodes sync to one internal NTP server that syncs to outside


© 2012 MapR Technologies            Planning a Cluster 16
Service Layout




© 2012 MapR Technologies     Planning a Cluster 17
Example: Small Cluster Admin Svcs.




© 2012 MapR Technologies   Planning a Cluster 18
Service Instances
    Service                   Package                   How Many
    CLDB                      mapr-cldb                 1-3
    FileServer                mapr-fileserver           Most or all nodes
    HBase Master              mapr-hbase-master         1-3
    HBase RegionServer        mapr-hbase-regionserver   Varies
    JobTracker                mapr-jobtracker           1-3
    NFS                       mapr-nfs                  Varies
    TaskTracker               mapr-tasktracker          Most or all nodes
    WebServer                 mapr-webserver            One or more
                                                        1, 3, 5, or a higher odd
    Zookeeper                 mapr-zookeeper
                                                        number




© 2012 MapR Technologies        Planning a Cluster 19
HA Cluster Design




© 2012 MapR Technologies       Planning a Cluster 20
HA Cluster Design

     M5 license required for HA
     Number of instances of each service
     Placement of services on different racks
     No single point of failure
     Failover:
      –   JobTracker
      –   CLDB
      –   NFS
      –   Zookeeper




© 2012 MapR Technologies       Planning a Cluster 21
HA Cluster Design




    Note: it is not a requirement to put admin services on first node, but it is a useful
    convention

© 2012 MapR Technologies            Planning a Cluster 22
LAB:
                           Planning



© 2012 MapR Technologies   Planning a Cluster 23
Questions




© 2012 MapR Technologies   Planning a Cluster 24

More Related Content

What's hot

HBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBaseHBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBaseHBaseCon
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.Jack Levin
 
High Availability for HBase Tables - Past, Present, and Future
High Availability for HBase Tables - Past, Present, and FutureHigh Availability for HBase Tables - Past, Present, and Future
High Availability for HBase Tables - Past, Present, and FutureDataWorks Summit
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
MapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR HadoopMapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR Hadoopabord
 
Rigorous and Multi-tenant HBase Performance Measurement
Rigorous and Multi-tenant HBase Performance MeasurementRigorous and Multi-tenant HBase Performance Measurement
Rigorous and Multi-tenant HBase Performance MeasurementDataWorks Summit
 
Achieving HBase Multi-Tenancy with RegionServer Groups and Favored Nodes
Achieving HBase Multi-Tenancy with RegionServer Groups and Favored NodesAchieving HBase Multi-Tenancy with RegionServer Groups and Favored Nodes
Achieving HBase Multi-Tenancy with RegionServer Groups and Favored NodesDataWorks Summit
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon
 
XPDDS17: To Grant or Not to Grant? - João Martins, Oracle
XPDDS17: To Grant or Not to Grant? - João Martins, Oracle XPDDS17: To Grant or Not to Grant? - João Martins, Oracle
XPDDS17: To Grant or Not to Grant? - João Martins, Oracle The Linux Foundation
 
Gluster fs tutorial part 2 gluster and big data- gluster for devs and sys ...
Gluster fs tutorial   part 2  gluster and big data- gluster for devs and sys ...Gluster fs tutorial   part 2  gluster and big data- gluster for devs and sys ...
Gluster fs tutorial part 2 gluster and big data- gluster for devs and sys ...Tommy Lee
 
Gluster Storage
Gluster StorageGluster Storage
Gluster StorageRaz Tamir
 
OpenStack and Ceph case study at the University of Alabama
OpenStack and Ceph case study at the University of AlabamaOpenStack and Ceph case study at the University of Alabama
OpenStack and Ceph case study at the University of AlabamaKamesh Pemmaraju
 
PLNOG 13: Maciej Grabowski: HP Moonshot
PLNOG 13: Maciej Grabowski: HP MoonshotPLNOG 13: Maciej Grabowski: HP Moonshot
PLNOG 13: Maciej Grabowski: HP MoonshotPROIDEA
 
Apache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in AlibabaApache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in AlibabaDataWorks Summit
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networksinside-BigData.com
 
What's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File SystemWhat's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File SystemCloudera, Inc.
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreCloudera, Inc.
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance TuningLars Hofhansl
 

What's hot (20)

HBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBaseHBaseCon 2015: Multitenancy in HBase
HBaseCon 2015: Multitenancy in HBase
 
Ceph on arm64 upload
Ceph on arm64   uploadCeph on arm64   upload
Ceph on arm64 upload
 
Hug Hbase Presentation.
Hug Hbase Presentation.Hug Hbase Presentation.
Hug Hbase Presentation.
 
High Availability for HBase Tables - Past, Present, and Future
High Availability for HBase Tables - Past, Present, and FutureHigh Availability for HBase Tables - Past, Present, and Future
High Availability for HBase Tables - Past, Present, and Future
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
MapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR HadoopMapReduce Improvements in MapR Hadoop
MapReduce Improvements in MapR Hadoop
 
Rigorous and Multi-tenant HBase Performance Measurement
Rigorous and Multi-tenant HBase Performance MeasurementRigorous and Multi-tenant HBase Performance Measurement
Rigorous and Multi-tenant HBase Performance Measurement
 
Achieving HBase Multi-Tenancy with RegionServer Groups and Favored Nodes
Achieving HBase Multi-Tenancy with RegionServer Groups and Favored NodesAchieving HBase Multi-Tenancy with RegionServer Groups and Favored Nodes
Achieving HBase Multi-Tenancy with RegionServer Groups and Favored Nodes
 
HBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ SalesforceHBaseCon 2015: HBase Performance Tuning @ Salesforce
HBaseCon 2015: HBase Performance Tuning @ Salesforce
 
XPDDS17: To Grant or Not to Grant? - João Martins, Oracle
XPDDS17: To Grant or Not to Grant? - João Martins, Oracle XPDDS17: To Grant or Not to Grant? - João Martins, Oracle
XPDDS17: To Grant or Not to Grant? - João Martins, Oracle
 
Gluster fs tutorial part 2 gluster and big data- gluster for devs and sys ...
Gluster fs tutorial   part 2  gluster and big data- gluster for devs and sys ...Gluster fs tutorial   part 2  gluster and big data- gluster for devs and sys ...
Gluster fs tutorial part 2 gluster and big data- gluster for devs and sys ...
 
Gluster Storage
Gluster StorageGluster Storage
Gluster Storage
 
OpenStack and Ceph case study at the University of Alabama
OpenStack and Ceph case study at the University of AlabamaOpenStack and Ceph case study at the University of Alabama
OpenStack and Ceph case study at the University of Alabama
 
PLNOG 13: Maciej Grabowski: HP Moonshot
PLNOG 13: Maciej Grabowski: HP MoonshotPLNOG 13: Maciej Grabowski: HP Moonshot
PLNOG 13: Maciej Grabowski: HP Moonshot
 
Apache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in AlibabaApache Hadoop YARN 3.x in Alibaba
Apache Hadoop YARN 3.x in Alibaba
 
A Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural NetworksA Dataflow Processing Chip for Training Deep Neural Networks
A Dataflow Processing Chip for Training Deep Neural Networks
 
What's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File SystemWhat's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File System
 
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and MoreHBaseCon 2013: How to Get the MTTR Below 1 Minute and More
HBaseCon 2013: How to Get the MTTR Below 1 Minute and More
 
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
 
Ceph on rdma
Ceph on rdmaCeph on rdma
Ceph on rdma
 

Viewers also liked

3 map r installation & setup administration course description
3 map r installation & setup administration course description3 map r installation & setup administration course description
3 map r installation & setup administration course descriptionmapr-academy
 
The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere MapR Technologies
 
30a accessing your cluster
30a accessing your cluster30a accessing your cluster
30a accessing your clustermapr-academy
 
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters Hortonworks
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariDataWorks Summit
 
Hello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopHello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopDataWorks Summit
 
Deploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARIDeploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARIDataWorks Summit
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariDataWorks Summit
 

Viewers also liked (11)

3 map r installation & setup administration course description
3 map r installation & setup administration course description3 map r installation & setup administration course description
3 map r installation & setup administration course description
 
12a architecture
12a architecture12a architecture
12a architecture
 
The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere The Briefcase Cluster – Enabling Big Data Everywhere
The Briefcase Cluster – Enabling Big Data Everywhere
 
30a accessing your cluster
30a accessing your cluster30a accessing your cluster
30a accessing your cluster
 
MapR 5.2 Product Update
MapR 5.2 Product UpdateMapR 5.2 Product Update
MapR 5.2 Product Update
 
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
Discover HDP 2.1: Using Apache Ambari to Manage Hadoop Clusters
 
Managing 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with AmbariManaging 2000 Node Cluster with Ambari
Managing 2000 Node Cluster with Ambari
 
Hello OpenStack, Meet Hadoop
Hello OpenStack, Meet HadoopHello OpenStack, Meet Hadoop
Hello OpenStack, Meet Hadoop
 
Deploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARIDeploying and Managing Hadoop Clusters with AMBARI
Deploying and Managing Hadoop Clusters with AMBARI
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Managing your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache AmbariManaging your Hadoop Clusters with Apache Ambari
Managing your Hadoop Clusters with Apache Ambari
 

Similar to 13c planning

Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃Etu Solution
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batchboorad
 
What CloudStackers Need To Know About LINSTOR/DRBD
What CloudStackers Need To Know About LINSTOR/DRBDWhat CloudStackers Need To Know About LINSTOR/DRBD
What CloudStackers Need To Know About LINSTOR/DRBDShapeBlue
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDKKernel TLV
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7Ted Dunning
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Uwe Printz
 
Polyteda Power DRC/LVS July 2016
Polyteda Power DRC/LVS July 2016Polyteda Power DRC/LVS July 2016
Polyteda Power DRC/LVS July 2016Oleksandra Nazola
 
NetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and VerticaNetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and VerticaJosef Niedermeier
 
Postgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuitePostgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuiteEDB
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterPatrick Quairoli
 
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analyticsmason_s
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanNarayana B
 

Similar to 13c planning (20)

22 configuration
22 configuration22 configuration
22 configuration
 
HBase with MapR
HBase with MapRHBase with MapR
HBase with MapR
 
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
 
TriHUG - Beyond Batch
TriHUG - Beyond BatchTriHUG - Beyond Batch
TriHUG - Beyond Batch
 
48a tuning
48a tuning48a tuning
48a tuning
 
What CloudStackers Need To Know About LINSTOR/DRBD
What CloudStackers Need To Know About LINSTOR/DRBDWhat CloudStackers Need To Know About LINSTOR/DRBD
What CloudStackers Need To Know About LINSTOR/DRBD
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
14 lab-planing
14 lab-planing14 lab-planing
14 lab-planing
 
14 lab-planing
14 lab-planing14 lab-planing
14 lab-planing
 
Introduction to DPDK
Introduction to DPDKIntroduction to DPDK
Introduction to DPDK
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
 
Inside MapR's M7
Inside MapR's M7Inside MapR's M7
Inside MapR's M7
 
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
 
Yarns About Yarn
Yarns About YarnYarns About Yarn
Yarns About Yarn
 
Polyteda Power DRC/LVS July 2016
Polyteda Power DRC/LVS July 2016Polyteda Power DRC/LVS July 2016
Polyteda Power DRC/LVS July 2016
 
NetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and VerticaNetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and Vertica
 
Postgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster SuitePostgres & Red Hat Cluster Suite
Postgres & Red Hat Cluster Suite
 
Quick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage ClusterQuick-and-Easy Deployment of a Ceph Storage Cluster
Quick-and-Easy Deployment of a Ceph Storage Cluster
 
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data AnalyticsSupersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
Supersized PostgreSQL: Postgres-XL for Scale-Out OLTP and Big Data Analytics
 
Hadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_PlanHadoop Architecture_Cluster_Cap_Plan
Hadoop Architecture_Cluster_Cap_Plan
 

More from mapr-academy

More from mapr-academy (6)

53 lab-nfs
53 lab-nfs53 lab-nfs
53 lab-nfs
 
51 lab-volumes
51 lab-volumes51 lab-volumes
51 lab-volumes
 
50a volumes
50a volumes50a volumes
50a volumes
 
42 lab-managing services
42 lab-managing services42 lab-managing services
42 lab-managing services
 
41a managing services
41a managing services41a managing services
41a managing services
 
10c introduction
10c introduction10c introduction
10c introduction
 

Recently uploaded

Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesMd Hossain Ali
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Adtran
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsSeth Reyes
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfAijun Zhang
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfJamie (Taka) Wang
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsSafe Software
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?IES VE
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 

Recently uploaded (20)

20230104 - machine vision
20230104 - machine vision20230104 - machine vision
20230104 - machine vision
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just MinutesAI Fame Rush Review – Virtual Influencer Creation In Just Minutes
AI Fame Rush Review – Virtual Influencer Creation In Just Minutes
 
Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™Meet the new FSP 3000 M-Flex800™
Meet the new FSP 3000 M-Flex800™
 
Computer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and HazardsComputer 10: Lesson 10 - Online Crimes and Hazards
Computer 10: Lesson 10 - Online Crimes and Hazards
 
Machine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdfMachine Learning Model Validation (Aijun Zhang 2024).pdf
Machine Learning Model Validation (Aijun Zhang 2024).pdf
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
activity_diagram_combine_v4_20190827.pdfactivity_diagram_combine_v4_20190827.pdf
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration WorkflowsIgniting Next Level Productivity with AI-Infused Data Integration Workflows
Igniting Next Level Productivity with AI-Infused Data Integration Workflows
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?How Accurate are Carbon Emissions Projections?
How Accurate are Carbon Emissions Projections?
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 

13c planning

  • 1. Planning a Cluster 7/6/2012 © 2012 MapR Technologies Planning a Cluster 1
  • 2. Planning a Cluster Agenda • Hardware Requirements • Hardware Recommendations • Operating System Requirements • Node Configuration • Service Layout • HA Cluster Design • LAB: Planning © 2012 MapR Technologies Planning a Cluster 2
  • 3. Planning a Cluster Objectives At the end of this module you will be able to: • List the minimum hardware and software requirements for MapR • List the recommended hardware configuration • Describe a typical 50 and 100 TB rack configuration • Explain how MapR services are arranged on a cluster • Identify the important considerations of HA cluster design © 2012 MapR Technologies Planning a Cluster 3
  • 4. Hardware Requirements © 2012 MapR Technologies Planning a Cluster 4
  • 5. Hardware Requirements Minimum Hardware requirements:  64-bit processor(s)  4GB DRAM  1 1GbE network interface  One free un-mounted drive or partition  100GB  >=10GB free space on OS partition  Swap space = 2X RAM  or set overcommit_memory to 1. © 2012 MapR Technologies Planning a Cluster 5
  • 6. Hardware Recommendations © 2012 MapR Technologies Planning a Cluster 6
  • 7. Hardware Recommendations Recommended Hardware configuration  64-bit processor with 8-12 cores  32G DRAM or more  2 GigE network interfaces  3-12 disks of 1-3 TB each  At least 20 GB of free space on OS partition  32 GB swap space or more © 2012 MapR Technologies Planning a Cluster 7
  • 8. Hardware Recommendations Typical Compute/Storage Node  2U Chassis  Single motherboard, dual socket  2 x 4-core + 32 GB RAM or 2 x 6-core + 48 GB RAM  12 x 2 TB 7200-RPM drives  2 - 4 1GbE network interfaces (on-board NIC + additional NIC)  OS on single partition on one drive (remainder of drive used for storage) © 2012 MapR Technologies Planning a Cluster 8
  • 9. Hardware Recommendations Typical 50TB Rack Configuration  10 typical compute/storage nodes (10 x 12 x 2 TB storage; 3x replication, 25% margin)  24-port 1 Gb/s rack-top switch with 2 x 10Gb/s uplink  Add second switch if each node uses 4 network interfaces © 2012 MapR Technologies Planning a Cluster 9
  • 10. Hardware Recommendations Typical 100TB Rack Configuration  20 typical nodes (20 x 12 x 2 TB storage; 3x replication, 25% margin)  48-port 1 Gb/s rack-top switch with 4 x 10Gb/s uplink  Add second switch if each node uses 4 network interfaces © 2012 MapR Technologies Planning a Cluster 10
  • 11. Notes on Rack Configuration  Plenty of bandwidth! –match to disk I/O  Multiple NICs per node  Make sure switches can handle throughput © 2012 MapR Technologies Planning a Cluster 11
  • 12. Operating System Requirements © 2012 MapR Technologies Planning a Cluster 12
  • 13. Operating System Requirements OS requirements  64-bit CentOS 5.4 or greater  64-bit Red Hat 5.4 or greater  64-bit Ubuntu 9.04 or greater  64-bit SUSE Enterprise 11.0 or greater © 2012 MapR Technologies Planning a Cluster 13
  • 14. Node Configuration © 2012 MapR Technologies Planning a Cluster 14
  • 15. Node Configuration Each node must have:  Unique hostname  Forward/reverse name resolution with all other nodes  MapR-specific user © 2012 MapR Technologies Planning a Cluster 15
  • 16. Node Configuration  Hardware – If RAID used for MapR disks, use –w 1 with disksetup  Software and operating system – Java  Networking – Keyless ssh – Forward and reverse hostname resolution between all nodes  Configuration – Same users/groups on all nodes, or integration with LDAP (via PAM) • Common unified authentication – NTP – all nodes sync to one internal NTP server that syncs to outside © 2012 MapR Technologies Planning a Cluster 16
  • 17. Service Layout © 2012 MapR Technologies Planning a Cluster 17
  • 18. Example: Small Cluster Admin Svcs. © 2012 MapR Technologies Planning a Cluster 18
  • 19. Service Instances Service Package How Many CLDB mapr-cldb 1-3 FileServer mapr-fileserver Most or all nodes HBase Master mapr-hbase-master 1-3 HBase RegionServer mapr-hbase-regionserver Varies JobTracker mapr-jobtracker 1-3 NFS mapr-nfs Varies TaskTracker mapr-tasktracker Most or all nodes WebServer mapr-webserver One or more 1, 3, 5, or a higher odd Zookeeper mapr-zookeeper number © 2012 MapR Technologies Planning a Cluster 19
  • 20. HA Cluster Design © 2012 MapR Technologies Planning a Cluster 20
  • 21. HA Cluster Design  M5 license required for HA  Number of instances of each service  Placement of services on different racks  No single point of failure  Failover: – JobTracker – CLDB – NFS – Zookeeper © 2012 MapR Technologies Planning a Cluster 21
  • 22. HA Cluster Design Note: it is not a requirement to put admin services on first node, but it is a useful convention © 2012 MapR Technologies Planning a Cluster 22
  • 23. LAB: Planning © 2012 MapR Technologies Planning a Cluster 23
  • 24. Questions © 2012 MapR Technologies Planning a Cluster 24