Managing 2000 Node Cluster with Ambari

© Hortonworks Inc. 2014: DO NOT SHARE. CONTAINS HORTONWORKS CONFIDENTIAL & PROPRIETARY INFORMATION
Apache Ambari
Managing 2000 node Hadoop cluster
Siddharth Wagle, PMC
swagle (@apache, @hortonworks)
Srimanth Gunturi, PMC
srimanth (@apache, @hortonworks)

Agenda
• Operating at scale
• Lessons learned
• Beyond 2K
• Ambari 1.6.0 highlights
• New Management features
• Blueprints
• Ambari Views
• Extensibility
• Q & A

Ambari: Enterprise Hadoop Operations
Apache Ambari is the only 100% open source framework for
provisioning, managing and monitoring Apache Hadoop clusters
[Diagram: Ambari Web, Viewpoint and others → AMBARI REST APIs → AMBARI SERVER (PROVISION | MANAGE | MONITOR) → compute & storage nodes]

100% Apache Open Source
• Active Community
- 70+ Contributors / 40+ Committers
- 240+ Ambari User Group Members
• Timeline
– 2013 Dec: Apache Ambari graduates to Top Level Project
– 2014 Apr: Apache Ambari 1.5.1 released (adds operations for Hadoop 2.1 Stack)
– 2014 May: Apache Ambari 1.6.0 released (new Ambari features)

Overview and Architecture

Platform Architecture
[Architecture diagram. Labels recoverable from the slide:]
• Components: Ambari Web, Ambari Server, Ambari Agent/s, Ganglia/Nagios/JMX, User, Repo
• Server internals: REST API (/clusters, /stacks, /views …), Request Dispatcher, DB, Orchestrator, Monitoring, AuthProvider
• External stores: RDBMS, LDAP, AD
• Resources: cluster configurations and topology
• Definitions: stacks, actions, views
• Integration points: REST API web client, configurable Auth Provider, bootstrap or manual install, monitoring providers
• Languages: Java, Python, Puppet, JS

Demo
2000 Nodes on commodity hardware
Process         CPU       RAM (process)
Ambari Server   16 core   2 GB
Ganglia         16 core   8 GB
Nagios          8 core    8 GB
Masters         8 core    8 GB
Slaves          1 core    4 GB

Demo Video
• Increase compute capacity with Next Gen Slaves
• Group the new hosts with Manage Config groups feature
• Override a default config property for the new group
• Apply the config by performing rolling restarts on the next
gen slaves, with little to no expected downtime

Optimizations with Ambari 1.6.1
• Better utilization of rrdcached
• Tuning Nagios with recommended performance configurations
• Ambari API optimizations
Load average comparison
Process          Starting point   1.6.1
Ambari Server    > 10 (0.63)      ~ 6.0 (0.37)
Ganglia Server   > 12 (0.75)      ~ 0.94 (0.06)
Nagios           > 14 (1.75)      ~ 6.8 (0.85)

iostats
Process          Starting point     1.6.1
Ganglia Server   > 10.3 GB writes   ~ 0.3 GB writes
                 > 34 MB reads      cached reads
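
The parenthesized values in the load average comparison appear to be the load average divided by the core count listed for each process in the demo hardware table (for example 10 / 16 ≈ 0.63). The slides do not say this; it is an inference from the numbers, which agree to within rounding. A small Python check under that assumption:

```python
# Core counts come from the demo hardware table, load averages from the
# comparison table. Pairing "Ganglia" with "Ganglia Server" is an assumption.
CORES = {"Ambari Server": 16, "Ganglia Server": 16, "Nagios": 8}
LOAD = {
    "Ambari Server": (10.0, 6.0),    # (starting point, 1.6.1)
    "Ganglia Server": (12.0, 0.94),
    "Nagios": (14.0, 6.8),
}


def per_core(load, cores):
    """Normalize a load average by the number of cores."""
    return load / cores


def per_core_table():
    """Return {process: (per-core load before, per-core load with 1.6.1)}."""
    return {
        name: tuple(per_core(value, CORES[name]) for value in LOAD[name])
        for name in LOAD
    }


if __name__ == "__main__":
    for name, (before, after) in per_core_table().items():
        print(f"{name}: {before:.3f} -> {after:.3f}")
```

A per-core value above 1.0, as in the Nagios starting point, means more runnable work than cores.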

Beyond 2K?
• Better metric collection with fan out
• Ability to export metrics to existing analytics and long term metric
persistence solutions like OpenTSDB
• Improve the alerting subsystem to minimize I/O overhead for alerts
processing
• Server scale-out solution for handling heartbeats and server-agent
communication for 10K+ nodes

Future of Ambari Metrics System? (AMBARI-5707)
[Diagram labels: Hadoop Daemon with AmbariMetricsSink; Ambari Agent with HostMetricsCollector; rack-aware Ambari Metrics Collector (1…N); AmbariMetricsService; MySQL; long term storage; Ambari; Ambari Views; Hive; Pig; Tez]

Ambari 1.6.0 Features

Request Scheduling
• Open source Quartz scheduler integration
• Create a batch of requests executed in the order of creation
• Exposes an API that lets users create their own schedules

Rolling Restarts
• Goal: minimize cluster downtime
• Optionally include only hosts with configuration changes
• Set host batch size + time to wait between batches
• Set failure tolerance to halt restarts automatically
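
The batching behaviour described above can be sketched in a few lines of Python. This is an illustration of the idea only, not Ambari's implementation; the function name, its parameters and the `restart` callback are all hypothetical.

```python
import time


def rolling_restart(hosts, restart, batch_size=2, wait_seconds=0.0,
                    failure_tolerance=1):
    """Restart hosts batch by batch; halt once failures exceed the tolerance.

    `restart` is a caller-supplied callable that returns True on success.
    Returns (restarted_hosts, failed_hosts, halted).
    """
    restarted, failed = [], []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        for host in batch:
            (restarted if restart(host) else failed).append(host)
        if len(failed) > failure_tolerance:
            return restarted, failed, True   # stop before touching more hosts
        if start + batch_size < len(hosts):
            time.sleep(wait_seconds)         # wait between batches
    return restarted, failed, False


if __name__ == "__main__":
    slaves = [f"slave{i}" for i in range(1, 7)]
    print(rolling_restart(slaves, restart=lambda host: True, batch_size=2))
```

Smaller batches and longer waits keep more of the cluster serving at any moment, at the cost of a longer overall restart.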

Host Configuration Groups
• Set custom configuration properties for one or more host groups (e.g.
“host overrides”)
• Important for handling “heterogeneous” HW clusters
–Different memory, mount points, directories
[Figure: two host groups with different values of the same property, HEAPSIZE=1024 and HEAPSIZE=512]
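
A minimal Python sketch of how a host override could resolve, assuming 1024 is the cluster default and 512 is the group override (the slide shows both values but does not say which is which). The group name, host names and data layout are hypothetical, not Ambari's data model.

```python
# Cluster-wide defaults (assumed) and one hypothetical config group.
DEFAULTS = {"HEAPSIZE": 1024}
CONFIG_GROUPS = {
    "small-memory-slaves": {
        "hosts": {"slave5", "slave6"},
        "overrides": {"HEAPSIZE": 512},
    },
}


def effective_config(host, defaults=DEFAULTS, groups=CONFIG_GROUPS):
    """Start from the defaults, then apply overrides of any group the host is in."""
    config = dict(defaults)
    for group in groups.values():
        if host in group["hosts"]:
            config.update(group["overrides"])
    return config


if __name__ == "__main__":
    for host in ("slave1", "slave5"):
        print(host, effective_config(host))
```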

Staged Configuration Changes
• Restart indicators
• Push changes without affecting liveness of the service

Ambari Blueprints
• Blueprint defines a cluster layout and
component configuration
• Simplifies “Headless Installs”
• Export blueprint from cluster
• Boot and Save wizard with blueprint
[Diagram: a BLUEPRINT (<stack>, <host>, <service>, <component>, <config>) is submitted to Ambari via REST, and Ambari provisions the CLUSTER. Also shown: HOST MANIFEST (<host>, <meta>) and SERVICE CONFIGS (<props>).]

Cluster create with Blueprint
• POST /api/v1/blueprints/:blueprintName → 201 Created
• POST /api/v1/clusters/:clusterName → 202 Accepted
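
The two calls can be sketched with Python's standard library. The sketch only builds the requests and leaves the JSON payloads to the caller, since the slides do not show them. The host name, blueprint and cluster names, and the `X-Requested-By` header are assumptions not taken from the slides, and authentication is omitted.

```python
import json
import urllib.request


def blueprint_requests(base_url, blueprint_name, cluster_name,
                       blueprint_body, cluster_body):
    """Build (but do not send) the two POST requests, in order."""
    headers = {
        "Content-Type": "application/json",
        "X-Requested-By": "ambari",  # assumption: header expected on writes
    }

    def post(path, body):
        return urllib.request.Request(
            base_url.rstrip("/") + path,
            data=json.dumps(body).encode("utf-8"),
            headers=headers,
            method="POST",
        )

    return [
        # 1. Register the blueprint; expect 201 Created.
        post(f"/api/v1/blueprints/{blueprint_name}", blueprint_body),
        # 2. Create the cluster from it; expect 202 Accepted.
        post(f"/api/v1/clusters/{cluster_name}", cluster_body),
    ]


if __name__ == "__main__":
    # Empty payloads are placeholders; real requests need real documents.
    for req in blueprint_requests("http://server:8080", "my-blueprint",
                                  "my-cluster", {}, {}):
        print(req.get_method(), req.full_url)
```

A 202 on the second call means provisioning was accepted and continues asynchronously, so a client would poll for progress rather than treat the response as completion.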

Bulk Host Operations
• Perform operations such as Stop, Start, Restart, Decommission,
Maintenance Mode in “bulk” form
• Perform operations on all hosts, filtered hosts or a selected group of
hosts
• Perform host level operations, or component type operations.

Bulk Host Operations
• 10+ ways to filter hosts - component type and state, alerts, stale
configurations, maintenance mode, etc.
• Component type start, stop, restart operations are performed in
batches

Maintenance Mode
• Goal: silence alerts for services, hosts and components when
performing maintenance
• Ability to put Service or Host “Out of Service”
• Alerts will be suspended for that item
• Item will not respond to bulk operations (such as restarts)

Maintenance Mode
• Components inherit maintenance mode from either service or host
• Service/Host in maintenance mode
–Bulk operations skipped
–Host/Service operations skipped (start all, stop all and restart all)
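
The inheritance rule can be modelled in a few lines of Python. This is a sketch of the rule as stated on the slides, with a hypothetical data layout; it is not how Ambari stores maintenance state.

```python
def in_maintenance(component, services_in_mm, hosts_in_mm):
    """A component inherits maintenance mode from its service or its host."""
    return (component["service"] in services_in_mm
            or component["host"] in hosts_in_mm)


def bulk_targets(components, services_in_mm, hosts_in_mm):
    """Components a bulk operation acts on: everything not in maintenance mode."""
    return [c for c in components
            if not in_maintenance(c, services_in_mm, hosts_in_mm)]


if __name__ == "__main__":
    components = [
        {"name": "DATANODE", "service": "HDFS", "host": "slave1"},
        {"name": "DATANODE", "service": "HDFS", "host": "slave2"},
        {"name": "NODEMANAGER", "service": "YARN", "host": "slave2"},
    ]
    # slave2 is out of service, so both of its components are skipped.
    print(bulk_targets(components, services_in_mm=set(),
                       hosts_in_mm={"slave2"}))
```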

Moving Masters
• Move master components to
different hosts
– NameNode (including HA)
– SecondaryNameNode
– JobTracker (Hadoop 1)
– ResourceManager (Hadoop 2)

Views

Ambari Views
• Goal: Customize the Ambari Web experience
• Allows creation of custom views (API and UI) of cluster
• Gives users and admins a single entry point to cluster
• Views complement Stack Extensibility
–Stack Extensibility makes custom Stack Services available to
Ambari
–Views expose custom UI features for Services
• Ambari Admins can entitle “views” to Ambari Web users
–Entitlements framework for finer-grained permissions control for
Ambari users

Ambari Views – Demo

Ambari Views – Packaging
files-0.1.0-SNAPSHOT.jar
├── WEB-INF
│   ├── web.xml
│   └── lib
├── index.html
├── org
│   └── apache
│       └── ambari
│           └── view
│               └── filebrowser
│                   ├── HdfsApi.class
│                   └── ...
└── view.xml

# ls -l /var/lib/ambari-server/resources/views/
-rw-r--r--. 1 root root 26023710 Jun 1 00:55 files-0.1.0-SNAPSHOT.jar
-rw-r--r--. 1 root root 22578573 Jun 1 00:55 pig-0.1.0-SNAPSHOT.jar
-rw-r--r--. 1 root root 54649972 Jun 1 00:55 slider-0.1.0-SNAPSHOT.jar

Ambari Views: view.xml
<view>
  <name>WEATHER</name>
  <label>Weather</label>
  <version>1.0.0</version>
  <parameter>
    <name>cities</name>
    <description>The list of cities.</description>
    <required>true</required>
  </parameter>
  <resource>
    <name>city</name>
    <plural-name>cities</plural-name>
    <id-property>id</id-property>
    <resource-class>org.apache.ambari.view.weather.CityResource</resource-class>
    <provider-class>org.apache.ambari.view.weather.CityResourceProvider</provider-class>
    <service-class>org.apache.ambari.view.weather.CityService</service-class>
  </resource>
  <instance>
    <name>EUROPE</name>
    <property>
      <key>cities</key>
      <value>London, UK;Paris;Munich</value>
    </property>
  </instance>
</view>
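
To make the descriptor's structure concrete, here is a small Python sketch that reads a trimmed copy of the view.xml above with the standard library and lists its parameters, resources and instances. It only inspects the XML; it says nothing about how Ambari itself loads the file, and the `summarize` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the WEATHER descriptor (resource classes omitted).
VIEW_XML = """
<view>
  <name>WEATHER</name>
  <label>Weather</label>
  <version>1.0.0</version>
  <parameter>
    <name>cities</name>
    <description>The list of cities.</description>
    <required>true</required>
  </parameter>
  <resource>
    <name>city</name>
    <plural-name>cities</plural-name>
    <id-property>id</id-property>
  </resource>
  <instance>
    <name>EUROPE</name>
    <property>
      <key>cities</key>
      <value>London, UK;Paris;Munich</value>
    </property>
  </instance>
</view>
"""


def summarize(xml_text):
    """Return the view's name, version, parameters, resources and instances."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.findtext("name"),
        "version": root.findtext("version"),
        "parameters": [p.findtext("name") for p in root.findall("parameter")],
        "resources": [r.findtext("name") for r in root.findall("resource")],
        "instances": {
            inst.findtext("name"): {
                prop.findtext("key"): prop.findtext("value")
                for prop in inst.findall("property")
            }
            for inst in root.findall("instance")
        },
    }


if __name__ == "__main__":
    print(summarize(VIEW_XML))
```

The instance's `cities` property supplies a value for the required `cities` parameter declared at the top of the descriptor.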

Ambari Views – Framework API
• GET
– http://server:8080/api/v1/views
– http://server:8080/api/v1/views/{view-id}/versions
– http://server:8080/api/v1/views/{view-id}/versions/{view-version}/instances
– http://server:8080/api/v1/views/{view-id}/versions/{view-version}/instances/{view-instance}
• POST
– Create new instance of view with appropriate parameters
– Parameter example for HDFS view – dataworker.defaultFS, dataworker.username
• PUT
– Update {view-instance} with modified parameters
• DELETE
– Delete {view-instance}
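
The GET endpoints nest as view, then version, then instance. A short Python helper makes the hierarchy explicit; the function name and signature are hypothetical, and only the URL shapes come from the slide.

```python
from urllib.parse import quote


def view_api_url(base, view_id=None, version=None, instance=None):
    """Build a framework API URL; each level requires the one before it."""
    parts = ["api", "v1", "views"]
    if view_id is not None:
        parts += [view_id, "versions"]
        if version is not None:
            parts += [version, "instances"]
            if instance is not None:
                parts.append(instance)
    # Percent-encode each path segment so unusual names stay a single segment.
    return base.rstrip("/") + "/" + "/".join(quote(p, safe="") for p in parts)


if __name__ == "__main__":
    base = "http://server:8080"
    print(view_api_url(base))
    print(view_api_url(base, "FILES"))
    print(view_api_url(base, "FILES", "0.1.0"))
    print(view_api_url(base, "FILES", "0.1.0", "HDFS"))
```

The same instance URL is the target for PUT and DELETE, while POST creates a new instance under the version's `instances` collection.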

Ambari Views – View Instance API
• GET UI
– http://server:8080/views/{view-id}/{view-version}/{view-instance}
• GET API
version}/instances/{view-instance}/resources/{resource-name}
version}/instances/{view-instance}/{servlet-path}
• Example: HDFS
– GET: http://views-1:8080/views/FILES/0.1.0/HDFS
– GET: http://views-1:8080/api/v1/views/FILES/versions/0.1.0/instances/HDFS/resources/files/fileops/listdir?path=%2F
– GET: http://views-1:8080/api/v1/views/FILES/versions/0.1.0/instances/HDFS/resources/files/download/browse?path=%2Fuser%2Fhdfs%2FplayerYears.pig&download=true
– POST: http://views-1:8080/api/v1/views/FILES/versions/0.1.0/instances/HDFS/resources/files/fileops/rename

Ambari Views – Single cluster interface
• Administrators can control cluster
• Data Workers can use cluster

Jobs

Ambari Jobs
• Hadoop 1.0: MapReduce
– Visualize MapReduce jobs in swimlanes
– Task scatter plots across jobs

Ambari Jobs
• Hadoop 2.0: YARN + Tez

Ambari Jobs
• Visualize Hive queries using Tez engine

Ambari Jobs

Ambari Jobs - Counters
• FILE_BYTES_READ + HDFS_BYTES_READ
• FILE_BYTES_WRITTEN + HDFS_BYTES_WRITTEN
• HDFS_WRITE_OPS / HDFS_BYTES_WRITTEN
• HDFS_READ_OPS / HDFS_BYTES_READ
• FILE_WRITE_OPS / FILE_BYTES_WRITTEN
• FILE_READ_OPS / FILE_BYTES_READ
• SPILLED_RECORDS
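
The counter expressions above are sums and ratios of raw job counters. A Python sketch of computing them from a counter dictionary follows; the output key names and the sample values are made up for illustration, and only the counter names and formulas come from the slide.

```python
def derived_counters(c):
    """Compute the slide's sums and ratios from a dict of raw counters."""
    def ratio(numerator, denominator):
        # Avoid division by zero when nothing was read or written.
        return c[numerator] / c[denominator] if c[denominator] else None

    return {
        "bytes_read": c["FILE_BYTES_READ"] + c["HDFS_BYTES_READ"],
        "bytes_written": c["FILE_BYTES_WRITTEN"] + c["HDFS_BYTES_WRITTEN"],
        "hdfs_write_ops_per_byte": ratio("HDFS_WRITE_OPS", "HDFS_BYTES_WRITTEN"),
        "hdfs_read_ops_per_byte": ratio("HDFS_READ_OPS", "HDFS_BYTES_READ"),
        "file_write_ops_per_byte": ratio("FILE_WRITE_OPS", "FILE_BYTES_WRITTEN"),
        "file_read_ops_per_byte": ratio("FILE_READ_OPS", "FILE_BYTES_READ"),
        "spilled_records": c["SPILLED_RECORDS"],
    }


if __name__ == "__main__":
    sample = {  # made-up values
        "FILE_BYTES_READ": 100, "HDFS_BYTES_READ": 300,
        "FILE_BYTES_WRITTEN": 50, "HDFS_BYTES_WRITTEN": 200,
        "HDFS_WRITE_OPS": 4, "HDFS_READ_OPS": 6,
        "FILE_WRITE_OPS": 5, "FILE_READ_OPS": 10,
        "SPILLED_RECORDS": 7,
    }
    print(derived_counters(sample))
```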

Ambari Jobs – DAG Graph
Summary Metrics
• Input
• Output
• Tez Tasks
• Spilled Records
Vertex Types
• Map Vertex
• Reduce Vertex
• Union Vertex
Hive Operators
Edge Types
• Scatter Gather
• Broadcast
• Contains

Ambari Jobs
• Event notification flow
– Events are PUSHed to the ATS (Application Timeline Server, YARN)
– Ambari PULLs them from the ATS

Ambari Jobs - Configurations

Ambari Jobs – Scaling

Extensibility

Ambari Stacks
• Goal: Reduce time + effort to add new Services to Ambari for
provisioning, management and monitoring
• Ambari defines a consistent Service lifecycle management interface
that can be extended
• Dynamically add Stacks + Services definitions
[Diagram: AMBARI ({rest}, <ambari-web>) serving two Stacks. HDP-2.0: HDFS, YARN, MR2, Hive, Pig, Oozie, plus NEW services. 2.0-GlusterFS: GlusterFS, YARN, MR2, Hive, plus a NEW service.]

Stack Details
• Stacks define Services + Repos
– What is in the Stack, and where to get the bits
• Each Service has a definition
– What Components are part of the Service
• Each Service has defined lifecycle commands
– start, stop, status, install, configure
• Lifecycle is controlled via command scripts
• Ability to define “custom” commands
• Ability to “extend” Stacks
[Diagram: the AMBARI SERVER holds the Stack, made up of Command Scripts (python), Service Definitions (xml) and Repos, and talks to multiple AMBARI AGENT/S]
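
The lifecycle idea, a fixed set of commands per Service plus optional custom ones, can be sketched as a plain Python class with a dispatcher. This is a self-contained illustration and deliberately does not use or imitate Ambari's command-script library; `ExampleService`, `rebalance` and `run_command` are hypothetical names.

```python
class ExampleService:
    """Hypothetical service exposing lifecycle and custom commands."""

    LIFECYCLE = ("install", "configure", "start", "stop", "status")
    CUSTOM = ("rebalance",)  # an example "custom" command

    def __init__(self):
        self.installed = False
        self.running = False
        self.log = []

    def install(self):
        self.installed = True
        self.log.append("install")

    def configure(self):
        self.log.append("configure")

    def start(self):
        self.configure()  # write configuration before starting
        self.running = True
        self.log.append("start")

    def stop(self):
        self.running = False
        self.log.append("stop")

    def status(self):
        if self.running:
            return "STARTED"
        return "INSTALLED" if self.installed else "UNKNOWN"

    def rebalance(self):
        self.log.append("rebalance")


def run_command(service, name):
    """Dispatch a command by name, rejecting anything not declared."""
    if name not in service.LIFECYCLE + service.CUSTOM:
        raise ValueError(f"unsupported command: {name}")
    return getattr(service, name)()


if __name__ == "__main__":
    svc = ExampleService()
    for command in ("install", "start", "rebalance"):
        run_command(svc, command)
    print(run_command(svc, "status"), svc.log)
```

Declaring the allowed commands up front keeps the dispatcher from invoking arbitrary attributes, which matters when command names arrive over the network.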

Stack Mechanics
• Ambari Server reads Stack definitions on start
• Ambari Server sends a command to Agents
• Agents download Stack definition + command scripts
• Agent executes command
• If the Stack definition changes, Agent will request latest Stack
definition + command scripts

Declarative Definition

In closing …

Everyone is welcome to contribute
• Thank you for all the contributions
• Bring your favorite Hadoop services to Ambari
• Useful Links
– Website
– http://ambari.apache.org
– Mailing Lists
– http://ambari.apache.org/mail-lists.html
– Development Wiki
– https://cwiki.apache.org/confluence/display/AMBARI
• Current and Upcoming Releases
– Ambari 1.6.1 (pending release)
– Ambari 1.6.0 (May)

Thank you.
