CLOUMON®

ENJOY HADOOP
Hadoop, Hive and Hadoop Ecosystem
Monitoring and Management System

The powerful Hadoop open-source software stack requires careful integration,
calibration, and monitoring, which is why Gruter has developed its own in-house cloud
management solution, Cloumon. With a user-friendly interface and management
console, Cloumon enables system administrators to optimize the Hadoop ecosystem
and take control of the cloud across the entire data lifecycle.

CLOUMON CH (Core Hadoop)
Hadoop (HDFS, MapReduce) and Hive

CLOUMON PA (Power Analytics)
Advanced Analysis Rule Manager, Streaming Data Processing Manager, and Interactive Analysis Query Manager

CLOUMON EPs (Extension Packs)
Oozie, HBase, ZooKeeper, and Flume

CLOUMON KEY FEATURES

[Figure: Cloumon architecture diagram covering five zones (Management, Data Analysis, Data Storage, Streaming Data Processing, Data Collection). Labeled components: ZooKeeper Node Manager, HDFS File Manager, MapReduce Job Manager, Hive Table Manager, Hive Query Manager, HBase Data Manager, Esper Query Manager, Pig Query, Job Workflow Manager, Real-time Analysis Manager, Flume Data Flow Manager, Data Agent and Collector.]
* Key Cloumon management zones in orange

MONITORING
• Collect and graph metrics from target daemon servers including NameNode, DataNode, JobTracker and TaskTracker
• Create alerts by setting thresholds on target metrics and servers
• Construct highly visible log data management views
• Monitor system resource usage
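
As a rough illustration of threshold-based alerting on collected metrics, the sketch below compares sampled values against per-server limits. It is not Cloumon's implementation; the `Threshold` type, the server names and the `disk_used_pct` metric name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    server: str   # hypothetical server name
    metric: str   # hypothetical metric name
    limit: float  # alert when the sampled value exceeds this limit


def find_alerts(samples, thresholds):
    """Return (server, metric, value) for every sample above its threshold."""
    alerts = []
    for t in thresholds:
        value = samples.get((t.server, t.metric))
        if value is not None and value > t.limit:
            alerts.append((t.server, t.metric, value))
    return alerts


# Hypothetical samples keyed by (server, metric).
samples = {
    ("datanode-01", "disk_used_pct"): 91.5,
    ("datanode-02", "disk_used_pct"): 40.0,
}
thresholds = [
    Threshold("datanode-01", "disk_used_pct", 85.0),
    Threshold("datanode-02", "disk_used_pct", 85.0),
]
print(find_alerts(samples, thresholds))
```

Only `datanode-01` is reported here, because it is the only server whose sample exceeds its limit.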

CLUSTER MANAGEMENT
• Manage integrated configurations for server groups
• Conveniently access optimized Hive and Oozie functionality
• Remotely control servers to perform stop-start maintenance routines
• Run various Hadoop distributions including Apache Hadoop 1.0.x, Apache Hadoop 2.0.x and CDH Hadoop 4.2.x
• Control multiple Hadoop clusters

DATA MANAGEMENT
• Manage entire data lifecycle from data collection to storage, batch analysis and real-time analysis
• Browse files with Hadoop File Browser; create and execute queries with Hive Query Workbench
• Design and schedule workflows with Oozie Workflow Designer; manage zNodes with ZooKeeper Manager

Enjoy Connecting GRUTER
CLOUMON CH (Core Hadoop)

Cloumon CH provides a streamlined environment for the operation of Hadoop and Hive, the core components of
advanced Big Data platforms. Through enhanced component visibility and task management features, Cloumon CH
gives unprecedented access to and control over Big Data systems.

HDFS Manager

HDFS Cluster Manager
KEY FEATURES

HDFS daemon status monitoring
· Monitor status and failures on NameNode, JournalNode, SecondaryNameNode, DataNode and DFSZKFailoverController

Remote server control
· Use simple pre-configured wizards to add new servers to running clusters (planned for the Q2 2013 release)
· Start and stop servers remotely

Server group configuration
· Manage configurations in server groups
· Detect servers with asymmetric configurations automatically
· Apply configurations to all clusters or specific target servers

Comprehensive metric monitoring
· Collect HDFS metrics at one-minute intervals
· Track performance history and graph server metrics for thorough system analysis

User-configured server threshold alerts
· Set disk usage thresholds by server and partition
· Set alert thresholds on all HDFS metrics
· Set SMS alerts for critical metrics via Alert Plugin

Integrated log view creation and management
· Create one-stop views of logs from across the distributed system

Multiple cluster commissioning and management
· Commission and manage multiple clusters as the system scales out

Major HDFS distribution compatibility
· Compatible with major distributions including Apache Hadoop 0.20.x, Apache Hadoop 1.0.x, Apache Hadoop 2.0.x, CDH 4.1.x and CDH 4.2.x
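
To give a sense of what detecting "asymmetric configurations" can mean in practice, here is a minimal sketch that flags servers whose setting differs from the majority value in the group. This is an assumption about the idea, not a description of how Cloumon does it; the server names are hypothetical, while `dfs.replication` and `dfs.datanode.handler.count` are standard Hadoop property names.

```python
from collections import Counter


def find_asymmetric(configs):
    """Map each config key to the servers whose value differs from the majority."""
    keys = set()
    for cfg in configs.values():
        keys.update(cfg)

    report = {}
    for key in sorted(keys):
        # A server that lacks the key is treated as having the value None.
        values = {server: cfg.get(key) for server, cfg in configs.items()}
        majority, _ = Counter(values.values()).most_common(1)[0]
        outliers = sorted(s for s, v in values.items() if v != majority)
        if outliers:
            report[key] = outliers
    return report


# Hypothetical server group: dn03 has drifted from the other two servers.
configs = {
    "dn01": {"dfs.replication": "3", "dfs.datanode.handler.count": "10"},
    "dn02": {"dfs.replication": "3", "dfs.datanode.handler.count": "10"},
    "dn03": {"dfs.replication": "2", "dfs.datanode.handler.count": "10"},
}
print(find_asymmetric(configs))
```

A majority vote is only one possible definition of the expected value; comparing every server against a designated group template would work equally well.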

HDFS File Browser
KEY FEATURES

HDFS commands

· Execute commands including list, mkdir, delete, chown and chmod

List sorting

· Sort lists by name, size, date and owner to improve search speed

And more: Directory tree views; file block information; file data views; file download/upload capabilities
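
For readers more familiar with the command line, the browser actions listed above correspond roughly to the standard `hadoop fs` shell commands. The sketch below only builds the argument lists and runs nothing; the `fs_command` helper and the example paths are hypothetical, and the brochure does not say how the File Browser performs these operations internally.

```python
def fs_command(action, path, *, owner=None, mode=None):
    """Build the `hadoop fs` argument list for a file browser action."""
    base = ["hadoop", "fs"]
    if action == "list":
        return base + ["-ls", path]
    if action == "mkdir":
        return base + ["-mkdir", path]
    if action == "delete":
        return base + ["-rm", path]
    if action == "chown":
        if owner is None:
            raise ValueError("chown requires an owner")
        return base + ["-chown", owner, path]
    if action == "chmod":
        if mode is None:
            raise ValueError("chmod requires a mode")
        return base + ["-chmod", mode, path]
    raise ValueError("unsupported action: " + action)


# Hypothetical paths and owner, for illustration only.
print(fs_command("list", "/user/hive/warehouse"))
print(fs_command("chown", "/data/logs", owner="etl:hadoop"))
print(fs_command("chmod", "/data/logs", mode="755"))
```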

MapReduce Manager

MapReduce Cluster Manager
KEY FEATURES

MapReduce daemon status monitoring

· Monitor status and failures on JobTracker and TaskTracker

Remote server control
· Use simple pre-configured wizards to add new servers to running clusters (planned for the Q2 2013 release)
· Start and stop servers remotely

Server group configuration
· Manage configurations in server groups
· Detect servers with asymmetric configurations automatically
· Apply configurations to all clusters or specific target servers

Comprehensive metric monitoring and configurable server threshold alerts
· Set disk usage thresholds by server and partition
· Set alert thresholds on all HDFS metrics
· Set SMS alerts for critical metrics via Alert Plugin

Integrated log view creation and management
· Create one-stop views of logs from across the distributed system

Multiple cluster commissioning and management
· Commission and manage multiple clusters as the system scales out

MapReduce Job Manager
KEY FEATURES

Job management
· Manage current job information and track job history
· Filter job lists by status and period

Job status monitoring
· Monitor task status and job counter
· Track full execution history

Task profiling
· Profile task execution progress and elapsed execution time

Task control
· Abort processes through stop task functionality

Scheduler monitoring
· Monitor fair scheduler mode queue status
· Manage queues

Hive and Oozie integration
· Monitor Hive query executions

Hive Manager

Hive Query and Hive Configuration
KEY FEATURES

Hive connection management

· Support multiple connections with built-in Hive delegator (Hive installation not
required)

Hive session management

· Manage driver sessions and track query execution status

Table meta viewer and table viewer

· Generate detailed table description views and data tables

Multiple query executor

· Execute multiple simultaneous queries

User-defined jar and script management
· Upload/delete/apply UDF and Custom M/R

Progress viewer and query status inquiry
· Check query execution progress and track execution history

Query management
· Generate saved query and Hive function description views

Table and query wizard
· Use simple pre-configured wizards to create tables and queries

Configuration management
· Edit and dynamically deploy Hive and Hadoop client configurations
· Access comprehensive storage usage, partitioning and bucket information

Versions Supported

HDFS: Apache Hadoop 0.20.x, Apache Hadoop 1.0.x, Apache Hadoop 2.0.x, CDH 4.1.x, CDH 4.2.x
MapReduce: Apache Hadoop 0.20.x, Apache Hadoop 1.0.x, CDH 4.x-mr1
Hive: Apache Hive 0.8.x, Apache Hive 0.9.x, Apache Hive 0.10.x, CDH 4.1.x Hive, CDH 4.2.x Hive

System Requirements

OS: Linux, Windows
WebServer: Tomcat 6.x
DataBase: MySQL 5.x
Java Virtual Machine: JDK6

Service SLA

Web-based support: 8x5
Phone support: 24x7
Initial response time: 2-24 hours

CLOUMON PA (Power Analytics)

CLOUMON PACKAGE

Cloumon PA is a high-performance Big Data system which brings together a powerful set of cutting-edge
technologies and tools to help you perform advanced analytics on Hadoop and Hive.
Smart query building processes and intuitive execution flows generate sophisticated outputs in just a few clicks
without the need for complex query syntax.

Stream Processing Rule Manager
KEY FEATURES

Console for streaming data processing
· Manage the entire lifecycle of streaming data processing: register data types, configure parsers, manage EPL queries for analysis, and store and query results, among other functions

Data type management and configuration
· Define type name, column, record parser and result table

Analysis result storage management
· Use built-in storage interfaces such as HBase and MySQL
· Extend interface to add and select user-defined storage

Analysis query manager
· Manage EPL queries
· Add and delete queries dynamically in a running environment
· Have results stored automatically in selected storage

Analysis output visualization
· Visualize results using various charts and graphs according to data type

Interactive Analytics (Impala, Tajo)
KEY FEATURES

Impala
· Manage metadata such as table schemas for integration with Hive
· Use Impala query workbench
· Monitor status of Impala clusters

Tajo
· Manage metadata such as table schemas
· Use Tajo query workbench
· Monitor status of Tajo clusters
· Manage Hive/Tajo/Impala queries in an integrated fashion and choose optimal execution platform
· Manage queries in concert with Advanced Analysis Rule Manager

Advanced Analysis Rule Manager
KEY FEATURES

Hive Query Based Analysis Rule Management

Analysis target object management
· Select analysis targets such as Hive table and existing queries, among others
· Manage aliases down to fields and rules via user-friendly UI

Hive query builder
· Build complex queries with multiple “join”, “group by”, and “order by” functions/clauses simply and quickly
· Employ variables for high re-usability and productivity

Analysis target querying
· Define materialized views as analysis targets without burden of generating actual views
· Create fresh results at each execution or conveniently reuse previous results at point of execution

Rule charting
· Visualize usage of individual rules and their interrelationships through charting and graphing tools

Execution and Result Management

Scheduling
· Manage execution start time
· Track execution history

Query optimization
· Create fresh results at each execution or conveniently reuse previous results at point of execution

Dynamic variable binding at execution
· Bind actual values to variables dynamically at point of execution

Multiple storage options
· Use Hive Table, HDFS Directory and HBase Table

Powerful viewer and APIs to access analysis outputs
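
The idea of binding values to query variables at the point of execution can be sketched as follows. This uses Python's standard `string.Template` purely for illustration; it is not Cloumon's variable syntax or mechanism, and the `web_logs` table, its columns and the `bind` helper are hypothetical.

```python
from string import Template

# A reusable rule: ${table} and ${dt} stay unbound until execution time.
rule = Template(
    "SELECT page, COUNT(*) AS hits "
    "FROM ${table} WHERE dt = '${dt}' "
    "GROUP BY page ORDER BY hits DESC"
)


def bind(template, **values):
    """Substitute concrete values; raises KeyError if a variable is left unbound."""
    return template.substitute(**values)


query = bind(rule, table="web_logs", dt="2013-04-01")
print(query)
```

The same rule can then be re-run for another date or table by calling `bind` with different values. Plain text substitution does no escaping, so a real system would need to validate the bound values.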

CLOUMON EPs (Extension Packs)

CLOUMON PACKAGE

Cloumon EPs provide additional monitoring and management capabilities for other key components of the Hadoop
ecosystem including Oozie, HBase, Flume and ZooKeeper, granting comprehensive control of the entire data lifecycle
from data collection, storage and workflow design to task scheduling and distributed system role management.

Oozie Workflow Manager
KEY FEATURES

WYSIWYG job designer

Library file management
· Upload jar files
· Manage mapper, reducer, and writable classes
· Manage job libraries (distributed cache)

Job execution management
· Schedule job execution
· Track job execution history
· Monitor jobs via integrated Cloumon MapReduce Job Manager

HBase Manager
KEY FEATURES

HBase Cluster Management

HMaster and RegionServer alerts

Single server metric monitoring
· Collect information at one-minute intervals
· Track history of metric changes over time and chart in time-series

Table and Region status monitoring
· Fetch lists of tables and look up table schemas
· Manage table region lists and region lists on RegionServers
· Monitor detailed region metrics
· Create and drop table (Q3 2013)

Manage Region (Q3 2013)
· Perform Region compaction, split, and merge
· Execute and schedule jobs according to user-configured rules

HBase Data Management

Table data scanning

Column data fetching by row query

Long type transformation
· Automatically convert byte array long to numeric long for readability

Web-based HBase shell (Q3 2013)
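
To illustrate the long type transformation listed above: HBase stores a Java `long` as 8 big-endian bytes, which is unreadable when shown as a raw byte array. A minimal sketch of the conversion is shown below; the `bytes_to_long` helper is a hypothetical name, not part of Cloumon or HBase.

```python
import struct


def bytes_to_long(raw: bytes) -> int:
    """Decode an 8-byte big-endian two's-complement value into an integer."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes")
    # ">q" means big-endian signed 64-bit, matching Java's long layout.
    return struct.unpack(">q", raw)[0]


print(bytes_to_long(b"\x00\x00\x00\x00\x00\x00\x04\xd2"))  # 1234
print(bytes_to_long(b"\xff" * 8))                          # -1
```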

ZooKeeper Manager
KEY FEATURES

ZooKeeper Cluster Management

Monitor server status and set alerts

View detailed ZooKeeper server metrics
· Collect metrics at one-minute intervals
· Track metric change history

Monitor ZooKeeper connections
· Monitor and inspect all connections to ZooKeeper servers

Manage multiple clusters simultaneously

ZooKeeper Node Management

Easily manage zNodes by accessing detailed information and manipulating data through a convenient file browser interface

Manage ACLs for each zNode

Manage zNode watcher registration

Flume Manager for Flume-OG (v0.9.4)
KEY FEATURES

Data flow management
· Inspect data flow between agent and collector
· Monitor workloads of each node using workload indicators

Powerful configuration tool
· Design data processing flows via a powerful tool which allocates source, decorator and sink
· Easily set parameters with pre-configured forms and help tips
· Reuse and edit existing configurations

Physical/logical node status monitoring
· Check overview of node status and drill down to analyze specific details
· List logical nodes on specific physical nodes
· Create integrated views by combining data from Flume masters and ZooKeeper

Map/unmap/decommission/purgeAll
· Control the entire lifecycle of logical nodes with minimal clicks
· Use smart proxies to complete complex jobs in a single click

Multiple cluster management
· Manage multiple clusters

Gruter: Your Partner in the Big Data Revolution
Phone: +82-2-508-5911
Fax: +82-2-508-5912
E-mail: inquiries@gruter.com
Web: www.gruter.com
For demo videos, please visit: www.gruter.com/products/cloumon#video

GRUTER, INC.
5F Sehwa Office Building 889-70 Daechi-dong, Gangnam-gu, Seoul, South Korea 135-839

Cloumon Product Introduction
