Hadoop Operations - Past, Present, and Future

© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Room III
Wednesday, April 18
2:50 PM - 3:30 PM

Introduction
⬢ Who are we?
⬢ Logging - Ambari Log Search
⬢ Metrics - Anomaly Detection
⬢ Upgrades - Patch Upgrade
⬢ Core - Management Packs, Multi-Version/Instance
⬢ Recommendations - SmartSense

Today’s Speakers
Oliver Szabo, Paul Codding

Logging
Oliver Szabo
Istvan Tobias
Ambari Log Search

Logging - Feature Intro
Problem Statement: Hadoop services produce a lot of logs. How can we make it easier
for operators to find the key logs they are looking for in a hundred- or thousand-node
cluster, especially if the issue occurred a week ago?
Challenge: How do we build a scalable logging infrastructure that is aware of all of our
components, their dependencies, and can parse all of the logs from 30+ projects?
Key Requirements:
- Provide a turnkey solution to logging for all of our HDP products that a
- Make it easy and fun to use

Logging - Technical Approach
⬢ Log Aggregation, analysis and visualization for Ambari managed services
⬢ Store Logs + Search Logs + Centralized
⬢ Basic components: SOLR + ZooKeeper + Log Feeder + Log Search
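As a rough illustration of what a log shipper does before a line reaches the search index, the sketch below parses a log4j-style line into a structured document. The regular expression, the field names, and the sample line are assumptions made for this example; they are not Log Feeder's actual configuration or the Log Search schema.

```python
import re
from datetime import datetime

# Assumed layout: "<date> <time>,<millis> <LEVEL> <logger>: <message>"
LINE_PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<ms>\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<logger>\S+):\s+(?P<message>.*)$"
)


def parse_line(line, host, component):
    """Turn one raw log line into a dict ready for indexing, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
    return {
        "host": host,            # hypothetical field names
        "component": component,
        "logtime": ts.isoformat() + "." + match.group("ms"),
        "level": match.group("level"),
        "logger": match.group("logger"),
        "message": match.group("message"),
    }


sample = (
    "2018-04-18 14:50:01,123 ERROR "
    "org.apache.hadoop.hdfs.server.datanode.DataNode: Disk failure detected"
)
doc = parse_line(sample, host="worker-01", component="hdfs_datanode")
```

Tagging every document with its host and component is what would let an operator filter a thousand-node cluster down to one service on one machine.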

Logging - Architecture

Logging - Architecture (from Ambari 2.7+)

Logging - Pluggable components
⬢ Config API: ZooKeeper by default
⬢ Log Shipper: LogFeeder
⬢ Log Collection: Solr by default
– Log Search server acts as a proxy (other auth. mechanisms can be added)

Logging - Notifications (Ambari 3.0)
⬢ Do something with the logs!
⬢ Send notifications (like email with reports)
⬢ Notifications based on policies

Ambari Log Search Demo

Logging - Timelines
⬢ Ambari 2.6.0.0 (Oct 2017): HDP 2.6.3 GA
⬢ Ambari 2.6.1.0 (Jan 2018): HDP 2.6.4 GA
⬢ Ambari 2.7.0.0 (1H 2018): HDP 3.0 GA
⬢ Ambari 3.0.0.0 (2H 2018): HDP 3.1 GA
⬢ Log Search milestones shown on the timeline: Tech Preview, GA

Metrics
Sid Wagle
Aravindan Vijayan
Ambari Metrics System - Anomaly Detection

Anomaly Detection - Feature Intro
Problem Statement: Hadoop services produce a lot of metrics, and an operator can
only stare at a wall of graphs for so long. How do we ensure that we only bring
attention to those metrics that are behaving abnormally?
Challenge: How do we build a system that can detect both point and trend anomalies
in multiple different time series data streams?
Key Requirements:
- Provide a way to “watch” key component metrics and detect when there is a non-
normal change in those metrics
- Do all of this without creating too many false positives

Anomaly Detection - Technical Approach
Types of Anomalies:
⬢ Point in Time Anomaly
⬢ Trend Anomaly
⬢ Correlation Anomaly

Set of 3 independent subsystems
Every configured metric will be processed by every subsystem

Point In Time Subsystem
⬢ EMA & Tukey’s
Trend Anomaly Subsystem
⬢ KS Test & Historical Standard Deviation
Correlation Anomaly Subsystem
⬢ Application of Isolation Forest
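One plausible reading of the point-in-time subsystem is sketched below: a Tukey's-fences check on a window of values, and an EMA-based check that flags points far from a running exponentially weighted mean. The parameter values (`k`, `alpha`, `threshold`, `warmup`) and the sample series are assumptions for illustration; the slides do not give the actual Ambari Metrics implementation.

```python
import statistics


def tukey_outliers(values, k=1.5):
    """Return indices of points outside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


def ema_anomalies(values, alpha=0.3, threshold=3.0, warmup=5):
    """Flag points deviating from the running EMA by more than `threshold` running std devs."""
    ema = values[0]
    var = 0.0
    flagged = []
    for i, v in enumerate(values[1:], start=1):
        diff = v - ema
        std = var ** 0.5
        # Skip the first `warmup` points while the variance estimate settles.
        if i >= warmup and std > 0 and abs(diff) > threshold * std:
            flagged.append(i)
        # Update the exponentially weighted mean and variance after scoring the point.
        ema += alpha * diff
        var = (1 - alpha) * (var + alpha * diff * diff)
    return flagged


series = [10, 11, 10, 12, 11, 10, 11, 50, 11, 10, 12, 11]
tukey_hits = tukey_outliers(series)
ema_hits = ema_anomalies(series)
```

Requiring more than one method to agree before raising an alert is one way to approach the "too many false positives" requirement, at the cost of missing subtler anomalies.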

What can we do when a system event of interest occurs?
Event: HBase Region Server crashes
⬢ We can provide a dynamic Grafana dashboard which has
– Snapshot of the metrics & anomalies in the RS profile
– Snapshot of the metrics in related and correlated profiles
– Highlight RS profile anomalies that occurred in the last N minutes
⬢ Can be used as a launching pad for exploring the crash.

Anomaly Detection - Timelines

Upgrades
Jonathan Hurley
Nate Cole
Ambari Patch Upgrade

Patch Upgrade - Feature Intro
Problem Statement: Bugs happen, and Hortonworks support is there to help
customers get a fix for those bugs...but how do we make it easy for customers to apply
those fixes and not have to upgrade everything?
Challenge: How do we build a system that can allow customers to easily apply, revert,
and track applied patches?
Key Requirements:
- Make applying a patch feel just like any other upgrade, but upgrade only the
affected services
- Allow users to revert patches

Patch Upgrade - Technical Approach
A Problem With Ambari's Architecture…
⬢ Ambari only dealt with stacks and services as a whole.
⬢ A cluster was always bound to a single repository at a time.
⬢ Upgrades could be accomplished, but would require every service to participate
– For larger stack changes, this was fine. However, this also meant that applying simple
patches for bug fixes touched the entire cluster.
Diagram (whole-cluster upgrades):
HDP 2.6 (2.6.0.0-1234): HDFS, YARN, ZooKeeper
Upgrade
HDP 2.6 (2.6.1.0-5555): HDFS, YARN, ZooKeeper
Upgrade
HDP 2.6 (2.6.2.0-7890): HDFS, YARN, ZooKeeper

⬢ Allowing the cluster to be associated with multiple stacks would not only be a
monumental change, it would also require a matrix of testing which could not
possibly be supported.
⬢ However, allowing individual services and components to be associated with
different repositories was possible.
– The only restriction is that each repository is still a part of the same major stack version.
Diagram (HDP 2.6, per-service repository versions):
HDFS (2.6.0.0-1234), YARN (2.6.0.0-1234), ZooKeeper (2.6.0.0-1234)
Upgrade
HDFS (2.6.0.0-1234), YARN (2.6.0.0-1234), ZooKeeper (2.6.1.0-1111)
Upgrade
HDFS (2.6.2.5-9999), YARN (2.6.0.0-1234), ZooKeeper (2.6.1.0-1111)
Upgrade
HDFS (2.6.3.0-0001), YARN (2.6.3.0-0001), ZooKeeper (2.6.1.0-1111)
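The per-service progression in the diagram can be modeled with a small sketch. It assumes that "same major stack version" means the first two version fields (for example `2.6`) must match; the function names and the dictionary layout are hypothetical and are not Ambari's internal model.

```python
def stack_of(version):
    """Return the stack prefix (first two fields) of a build version like '2.6.1.0-1111'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}"


def apply_patch(service_versions, patch_services, patch_version):
    """Return a new service->version map with only the patched services moved."""
    current_stacks = {stack_of(v) for v in service_versions.values()}
    # Assumed rule: a patch repository must stay within the cluster's stack version.
    if current_stacks != {stack_of(patch_version)}:
        raise ValueError("patch repository must belong to the cluster's stack version")
    updated = dict(service_versions)
    for service in patch_services:
        updated[service] = patch_version
    return updated


cluster = {
    "HDFS": "2.6.0.0-1234",
    "YARN": "2.6.0.0-1234",
    "ZooKeeper": "2.6.0.0-1234",
}
after_zk_patch = apply_patch(cluster, ["ZooKeeper"], "2.6.1.0-1111")
after_hdfs_patch = apply_patch(after_zk_patch, ["HDFS"], "2.6.2.5-9999")
```

Returning a new map rather than mutating the old one keeps the previous state around, which is convenient when a patch has to be reverted.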

Version Definition File (VDF) - A single XML file which describes the contents of a
repository, such as services, versions, repository URLs, etc.
⬢ Ambari needed a way to determine if a repository only contained a subset of
services. This would allow the software to understand what kind of upgrade was
being performed.
<release>
  <type>PATCH</type>
  <stack-id>HDP-2.6</stack-id>
  <version>2.6.3.0</version>
  <build>235</build>
  <compatible-with>2.6.3.d+</compatible-with>
  <release-notes>http://example.com</release-notes>
  <display>HDP-2.6.3.0-235</display>
</release>
<manifest>
  <service id="STORM-110" name="STORM" version="1.1.0"/>
</manifest>
<available-services>
  <service idref="STORM-110"/>
</available-services>
<repository-info>
  <os family="redhat6">
    <package-version>2_6_3_0_*</package-version>
    <repo>
      <baseurl>http://repo.ambari.apache.org/hdp/centos6/HDP-2.6.3.0-235</baseurl>
⬢ Defines the repository as a PATCH
repository
⬢ Contains only STORM
⬢ Specifies version 2.6.3.0-235
⬢ Contains the URL for packages
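To show how software could read those four facts out of a VDF, here is a minimal sketch using Python's standard XML parser. The fragment on the slide is truncated, so the sample below wraps a subset of the elements shown above in a `repository-version` root element; that root name and the helper `summarize_vdf` are assumptions for this example, not Ambari's parsing code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample built from the elements shown on the slide.
VDF_XML = """
<repository-version>
  <release>
    <type>PATCH</type>
    <stack-id>HDP-2.6</stack-id>
    <version>2.6.3.0</version>
    <build>235</build>
  </release>
  <manifest>
    <service id="STORM-110" name="STORM" version="1.1.0"/>
  </manifest>
  <available-services>
    <service idref="STORM-110"/>
  </available-services>
</repository-version>
"""


def summarize_vdf(xml_text):
    """Extract the release type, stack, full version, and available service names."""
    root = ET.fromstring(xml_text.strip())
    release = root.find("release")
    # Map manifest ids to service names so idref entries can be resolved.
    manifest = {s.get("id"): s.get("name") for s in root.findall("manifest/service")}
    services = [manifest[s.get("idref")] for s in root.findall("available-services/service")]
    return {
        "type": release.findtext("type"),
        "stack": release.findtext("stack-id"),
        "version": f"{release.findtext('version')}-{release.findtext('build')}",
        "services": services,
    }


summary = summarize_vdf(VDF_XML)
```

A `PATCH` type combined with a short `services` list is the signal that only a subset of the cluster needs to take part in the upgrade.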

⬢ Ambari is now able to track versions of services and components independently
⬢ Depending on the command being generated, Ambari can determine the correct
version information to send to the agents
– Writing out configuration values which contain versioned paths
– Starting the correct version of a service
⬢ During distribution of a PATCH repository, Ambari can also target specific hosts
which only contain the services from the VDF
– Provides faster installations which prevent RPM bloat on hosts which are not involved
– Allows for a much faster upgrade since only specific services are restarted
⬢ A larger problem remained still … HADOOP!
– Jobs launched from one host assumed that the versions of components remained constant
across the cluster
– Some components assume that dependent components will match their specific versions

Register Version

Install Version

Upgrade & Revert

Patch Upgrade - Timelines
⬢ Ambari 2.6.0.0 (Oct 2017): HDP 2.6.3 GA
⬢ Ambari 2.6.1.0 (Jan 2018): HDP 2.6.4 GA
⬢ Ambari 2.7.0.0 (Summer 2018): HDP 3.0 GA

Core
Jayush Luniya
Madhuvanthi Radhakrishnan
Swapan Shridhar
Scott Duan
Ambari Management Packs

Core - Feature Intro
Problem Statement: When we started HDP we had one product with 8 services; now
we have over 30 spread across multiple products. For Ambari to handle this
new complexity, changes are necessary.
Challenge: How do we build a system that allows multiple Hortonworks products
and third-party products to work together seamlessly, with upgrade, patch, and
lifecycle operations fully automated?
Key Requirements:
- Allow Ambari to mix and match multiple products and services on the same
clusters
- Make it easy for partners and users to create their own software that’s managed
by Ambari

Core - Technical Approach
Ambari 2.x: Stacks
● Stack Definitions shipped with Ambari
● Ambari 2.x MPacks used to “shim” services into an existing stack
● Only stack services can participate in upgrades (not stack extensions, or services added as MPacks)
● Hard 1:1 Relationships
Ambari 3.x: MPacks
● Stack Definitions externalized into MPacks
● MPacks are stand-alone stacks
● MPack services get full Ambari upgrade automation
● Flexible Relationships

MPack Repository
Diagram labels: HDP 3.1.0, BigSQL 5.0.2, HDF 3.1.0, HDP 3.1.0
A Module Definition is:
● Built as a tarball
● Equivalent to Service Definition
● Also contains module.json
● hdfs-3.0.0.0-b123-definition.tar.gz
An Ambari 3.x Management Pack is:
● Built as a tarball
● Containing module definitions
● Equivalent to Stack Definition
● Also contains mpack.json
● hdpcore-1.0.0-b22-definition.tar.gz
Module RPMs:
● Actual install bits for the module
● hdfs_3_0_0_0_b123.rpm
MPack Meta RPM:
● Meta RPM to install all module RPMs
● hdp_3_1_0_b22.rpm
An MPack Repository:
● Holds references and metadata for
MPacks
● Ambari supports multiple MPack
repositories
● Allows operators to search and
discover management packs
● Stores compatibility between
management packs
● Provides recommendations for
MPack bundles
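The relationship between a management pack, its modules, and the artifact names listed above can be sketched with two small data classes. The class and method names are hypothetical, and the naming rules are inferred only from the example file names on the slide; the real contents of `mpack.json` and `module.json` are not shown, so they are not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    version: str
    build: str

    def definition_tarball(self):
        # Pattern inferred from "hdfs-3.0.0.0-b123-definition.tar.gz".
        return f"{self.name}-{self.version}-{self.build}-definition.tar.gz"

    def rpm(self):
        # Pattern inferred from "hdfs_3_0_0_0_b123.rpm".
        return f"{self.name}_{self.version.replace('.', '_')}_{self.build}.rpm"


@dataclass
class MPack:
    name: str
    version: str
    build: str
    modules: list = field(default_factory=list)

    def definition_tarball(self):
        # Pattern inferred from "hdpcore-1.0.0-b22-definition.tar.gz".
        return f"{self.name}-{self.version}-{self.build}-definition.tar.gz"

    def module_rpms(self):
        """List the install RPMs for every module the pack contains."""
        return [m.rpm() for m in self.modules]


hdfs = Module("hdfs", "3.0.0.0", "b123")
hdpcore = MPack("hdpcore", "1.0.0", "b22", modules=[hdfs])
```

Keeping the definition (tarball) separate from the install bits (RPMs) mirrors the split described above between what Ambari needs to manage a module and what a host needs to run it.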

Core - Timelines
⬢ Ambari 2.6.0.0 (Oct 2017): HDP 2.6.3 GA
⬢ Ambari 2.6.1.0 (Jan 2018): HDP 2.6.4 GA
⬢ Ambari 2.7.0.0 (1H 2018): HDP 3.0 GA
⬢ Ambari 3.0.0.0 (2H 2018): HDP 3.1 GA

Recommendations
Sheetal Dolas
Beau Plath
Aditya Pathak
Cabir Zounaidou
Better Configuration & Performance

Recommendations - Feature Intro
Problem Statement: Hadoop services have many configurations, and our users' use
cases change and morph over time. How can we make sure that customers'
configurations stay optimal as their use of the cluster changes?
Challenge: Ambari’s stack advisor logic is shipped with each version of Ambari, so how
can we provide fresh advice to customers using software shipped two years ago?
Key Requirements:
- Provide a way to constantly analyze a customer's cluster and make
recommendations using up-to-date best practices
- Provide an easy way for customers to review and apply those recommendations in
Ambari

Recommendations - Technical Approach
Challenge 1: Collecting configuration, metrics, logs for all components and all services
Challenge 2: Anonymizing, encrypting, and sending those logs to Hortonworks on a
scheduled basis
Challenge 3: Making recommendations based on the input diagnostics
Challenge 4: Making it easy for customers to apply these recommendations in Ambari
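For Challenge 2, the anonymization step can be illustrated with a short sketch that replaces IPv4 addresses and known hostnames with stable, salted pseudonyms. This is a generic technique chosen for the example, not SmartSense's actual rules; the function name, the salt handling, and the sample line are assumptions.

```python
import hashlib
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def anonymize(text, hostnames, salt):
    """Replace IPv4 addresses and known hostnames with stable, salted pseudonyms."""

    def pseudonym(prefix, value):
        digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:8]
        return f"{prefix}-{digest}"

    text = IPV4.sub(lambda m: pseudonym("ip", m.group(0)), text)
    # Replace longer hostnames first so a short name never clobbers part of a longer one.
    for host in sorted(hostnames, key=len, reverse=True):
        text = text.replace(host, pseudonym("host", host))
    return text


line = "Connection from 10.0.0.12 to nn1.example.com refused"
masked = anonymize(line, ["nn1.example.com"], salt="cluster-secret")
```

Because the same input always maps to the same pseudonym, an analyst can still correlate events across files in a bundle without seeing the real addresses or names.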

Recommendations - Collection
[Diagram labels: Server, Ambari, Agents, Bundle, Worker Nodes]

Recommendations - Sending
[Diagram labels: Landing Zone, Server, Gateway, Ambari, Agents, Bundle, Worker Nodes]

Recommendations - Recommend
[Diagram labels: Server, Gateway, Ambari, Agents, Bundle, Worker Nodes, SmartSense Analytics]

Recommendations - Apply
[Diagram labels: Server, Gateway, Ambari, Agents, Bundle, Worker Nodes, SmartSense Analytics]

Recommendations - Timelines
⬢ Ambari 2.6.0.0 (Oct 2017): HDP 2.6.3 GA
⬢ Ambari 2.6.1.0 (Jan 2018): HDP 2.6.4 GA
⬢ Ambari 2.7.0.0 (Summer 2018): HDP 3.0 GA

Summary
⬢ Logging - Ambari Log Search
⬢ Metrics - Anomaly Detection
⬢ Upgrades - Patch Upgrade
⬢ Core - Management Packs, Multi-Version/Instance
⬢ Recommendations - SmartSense
⬢ Ambari 2.6.0.0 (Oct 2017)
⬢ Ambari 2.6.1.0 (Jan 2018)
⬢ Ambari 2.7.0.0 (1H 2018)
⬢ Ambari 3.0.0.0 (2H 2018)
⬢ Coming Soon: Log Search, Anomaly Detection, Management Packs
⬢ Available Now: Patch Upgrade, Recommendations

Questions?
Thank you for attending!
