SlideShare a Scribd company logo
Streamline
Hadoop DevOps
with Apache
Ambari
Jayush Luniya
Hadoop Summit, Tokyo
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Speaker
Jayush Luniya
Staff Software Engineer @ Hortonworks
Apache Ambari PMC
jluniya@apache.org
3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Apache Ambari is the open-source platform to
provision, manage and monitor Hadoop clusters
4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
4 years old
6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Exciting Enterprise Features in Ambari 2.4
 New Services: Log Search, Zeppelin, Hive LLAP
 Role Based Access Control
 Management Packs
 Grafana UI for Ambari Metrics System
 New Views: Zeppelin, Storm
7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
More in Ambari 2.4
• Alerts: Customizable props and thresholds
(AMBARI-14898)
• Alerts: Retry tolerance (AMBARI-15686)
• Alerts: New HDFS Alerts (AMBARI-14800)
• New Host Page Filtering (AMBARI-15210)
• Remove Service from UI (AMBARI-14759)
• Support for SLES 12 (AMBARI-16007)
• Stability: Database Consistency Checking
(AMBARI-16258)
• Customizable Ambari Log + PID Dirs (AMBARI-
15300)
• New Version Registration Experience (AMBARI-
15724)
• Log Search Technical Preview (AMBARI-14927)
• Operational Audit Logging (AMBARI-15241)
• Role-Based Access Control (AMBARI-13977)
• Automated Setup of Ambari Kerberos through
Blueprints (AMBARI-15561)
• Automated Setup of Ambari Proxy User
(AMBARI-15561)
• Customizable Host Reg. SSH Port (AMBARI-
13450)
Core Features Security Features
• View URLs for bookmarks (AMBARI-15821),
View Refresh (AMBARI-15682)
• Inherit Cluster Permissions (AMBARI-16177)
• Remote Cluster Registration (AMBARI-16274)
Views Framework Features
8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Deploy
Secure/L
DAP
Smart
Configs
Upgrade
Monitor
Scale,
Extend,
Analyze
Simply Operations - Lifecycle
Ease-of-Use Deploy
9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Deploy On Premise
Ambari UI wizard handles all of these
combinations and makes recommendations
based on host specs.
10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Deploy On The Cloud
Certified environments
Sysprepped VMs
Hundreds of similar clusters
11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Deploy with Blueprints
 Systematic way of defining a cluster
 Export existing cluster into blueprint
/api/v1/clusters/:clusterName?format=blueprint
Config
s
Topology Hosts Cluster
12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Create a cluster with Blueprints
{
"configurations" : [
{
"hdfs-site" : {
"dfs.datanode.data.dir" : "/hadoop/1,
/hadoop/2,/hadoop/3"
}
}
],
"host_groups" : [
{
"name" : "master-host",
"components" : [
{ "name" : "NAMENODE” },
{ "name" : "RESOURCEMANAGER” },
…
],
"cardinality" : "1"
},
{
"name" : "worker-host",
"components" : [
{ "name" : "DATANODE" },
{ "name" : "NODEMANAGER” },
…
],
"cardinality" : "1+"
},
],
"Blueprints" : {
"stack_name" : "HDP",
"stack_version" : "2.5"
}
}
{
"blueprint" : "my-blueprint",
"host_groups" :[
{
"name" : "master-host",
"hosts" : [
{
"fqdn" : "master001.ambari.apache.org"
}
]
},
{
"name" : "worker-host",
"hosts" : [
{
"fqdn" : "worker001.ambari.apache.org"
},
{
"fqdn" : "worker002.ambari.apache.org"
},
…
{
"fqdn" : "worker099.ambari.apache.org"
}
]
}
]
}
1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Create a cluster with Blueprints
{
"configurations" : [
{
"hdfs-site" : {
"dfs.datanode.data.dir" : "/hadoop/1,
/hadoop/2,/hadoop/3"
}
}
],
"host_groups" : [
{
"name" : "master-host",
"components" : [
{ "name" : "NAMENODE” },
{ "name" : "RESOURCEMANAGER” },
…
],
"cardinality" : "1"
},
{
"name" : "worker-host",
"components" : [
{ "name" : "DATANODE" },
{ "name" : "NODEMANAGER” },
…
],
"cardinality" : "1+"
},
],
"Blueprints" : {
"stack_name" : "HDP",
"stack_version" : "2.5"
}
}
{
"blueprint" : "my-blueprint",
"host_groups" :[
{
"name" : "master-host",
"hosts" : [
{
"fqdn" : "master001.ambari.apache.org"
}
]
},
{
"name" : "worker-host",
"hosts" : [
{
"fqdn" : "worker001.ambari.apache.org"
},
{
"fqdn" : "worker002.ambari.apache.org"
},
…
{
"fqdn" : "worker099.ambari.apache.org"
}
]
}
]
}
1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Create a cluster with Blueprints
{
"configurations" : [
{
"hdfs-site" : {
"dfs.datanode.data.dir" : "/hadoop/1,
/hadoop/2,/hadoop/3"
}
}
],
"host_groups" : [
{
"name" : "master-host",
"components" : [
{ "name" : "NAMENODE” },
{ "name" : "RESOURCEMANAGER” },
…
],
"cardinality" : "1"
},
{
"name" : "worker-host",
"components" : [
{ "name" : "DATANODE" },
{ "name" : "NODEMANAGER” },
…
],
"cardinality" : "1+"
},
],
"Blueprints" : {
"stack_name" : "HDP",
"stack_version" : "2.5"
}
}
{
"blueprint" : "my-blueprint",
"host_groups" :[
{
"name" : "master-host",
"hosts" : [
{
"fqdn" : "master001.ambari.apache.org"
}
]
},
{
"name" : "worker-host",
"hosts" : [
{
"fqdn" : "worker001.ambari.apache.org"
},
{
"fqdn" : "worker002.ambari.apache.org"
},
…
{
"fqdn" : "worker099.ambari.apache.org"
}
]
}
]
}
1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Create a cluster with Blueprints
{
"configurations" : [
{
"hdfs-site" : {
"dfs.datanode.data.dir" : "/hadoop/1,
/hadoop/2,/hadoop/3"
}
}
],
"host_groups" : [
{
"name" : "master-host",
"components" : [
{ "name" : "NAMENODE” },
{ "name" : "RESOURCEMANAGER” },
…
],
"cardinality" : "1"
},
{
"name" : "worker-host",
"components" : [
{ "name" : "DATANODE" },
{ "name" : "NODEMANAGER” },
…
],
"cardinality" : "1+"
},
],
"Blueprints" : {
"stack_name" : "HDP",
"stack_version" : "2.5"
}
}
{
"blueprint" : "my-blueprint",
"host_groups" :[
{
"name" : "master-host",
"hosts" : [
{
"fqdn" : "master001.ambari.apache.org"
}
]
},
{
"name" : "worker-host",
"hosts" : [
{
"fqdn" : "worker001.ambari.apache.org"
},
{
"fqdn" : "worker002.ambari.apache.org"
},
…
{
"fqdn" : "worker099.ambari.apache.org"
}
]
}
]
}
1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Blueprints for Large Scale
• Kerberos, secure out-of-the-box
• High Availability is setup initially for
NameNode, YARN, Hive, Oozie, etc
• Host Discovery allows Ambari to
automatically install services for a Host
when it comes online
• Stack Advisor recommendations
17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
POST /api/v1/clusters/MyCluster/hosts
[
{
"blueprint" : "single-node-hdfs-test2",
"host_groups" :[
{
"host_group" : "slave",
"host_count" : 3,
"host_predicate" : "Hosts/cpu_count>1”
}, {
"host_group" : "super-slave",
"host_count" : 5,
"host_predicate" : "Hosts/cpu_count>2&
Hosts/total_mem>3000000"
}
]
}
]
Blueprint Host Discovery
18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Comprehensive Security
LDAP/AD
• User auth
• Sync
Kerberos
• MIT KDC
• Keytab
management
Atlas
• Governance
• Compliance
• Linage & history
• Data
classification
Ranger
• Security policies
• Audit
• Authorization
Knox
• Perimeter
security
• Supports
LDAP/AD
• Sec. for
REST/HTTP
• SSL
19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Kerberos
Ambari manages Kerberos principals and keytabs
Works with existing MIT KDC or Active Directory
Once Kerberized, handles
1. Adding hosts
2. Adding components
to existing hosts
3. Adding services
4. Moving components
to different hosts
20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Management Packs
 Improved Release Management:
Decouple Ambari core from stacks releases
 Support Add-ons:
–Release vehicle for 3rd party services, views
–Self-contained release artifacts
–Stack is an overlay of multiple management
packs
21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Overlay of Management Packs
inherits from 2.3
inherits from 2.4
inherits from 2.5
22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Management Pack++
Short Term Goals (Ambari 2.4)
 Retrofit in Stack Processing Framework
 Enable 3rd party to ship add-on services
Future Goals
 Management Pack Framework
 Deliver Views
23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Role Based Access Control (RBAC)
As Ambari & organizations grow,
so do security needs
Ambari integrates with external
authentication systems & LDAP
24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
RBAC Terms
Users belong to groups
A group has a role
Users can also have additional roles
Roles are applied to Resources. E.g.,
Ambari, particular Cluster, particular View
Roles have permissions
e.g., add services to cluster
25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
New RBAC Roles
only view
↑, except change configs
↑, except alter cluster topology
or install components
Ambari Admin
Cluster Admin
Cluster Op
Service Admin
Service Op
Read-Only
↑, except add services, Kerberos,
manage alerts & upgrades
↑, except manage permissions
all
26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Service Layout
Common Services Stack Override
27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Stack Advisor
Kerberos
HTTPS
Zookeeper Servers
Memory Settings
…
High Availability
atlas.rest.address =
http(s)://host:port
# Atlas Servers
atlas.enabletTLS = true|false
atlas.server.http.port = 21000
atlas.server.https.port = 21443
Example
Configurations
28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Background: Upgrade Terminology
Manual
Upgrade
The user follows instructions to upgrade
the stack
Incurs downtime
29 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Background: Upgrade Terminology
Manual
Upgrade
The user follows instructions to upgrade
the stack
Incurs downtime
Rolling
Upgrade
Automated
Upgrades one component
per host at a time
Preserves cluster operation
and minimizes service impact
30 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Background: Upgrade Terminology
Express
Upgrade
Automated
Runs in parallel across hosts
Incurs downtime
Manual
Upgrade
The user follows instructions to upgrade
the stack
Incurs downtime
Rolling
Upgrade
Automated
Upgrades one component
per host at a time
Preserves cluster operation
and minimizes service impact
31 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Automated Upgrade: Rolling or Express
Check
Prerequisites
Review the
prereqs to
confirm
your cluster
configs are
ready
Prepare
Take
backups of
critical
cluster
metadata
Perform
Upgrade
Perform the
HDP
upgrade.
The steps
depend on
upgrade
method:
Rolling or
Express
Register +
Install
Register the
HDP
repository
and install
the target
HDP version
on the
cluster
Finalize
Finalize the
upgrade,
making the
target
version the
current
version
32 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Process: Rolling Upgrade
ZooKeeper
Ranger
Hive
Oozie
Falcon
Kafka
Knox
Storm
Slider
Flume
Finalize or
Downgrade
Clients HDFS, YARN, MR, Tez,
HBase, Pig. Hive, etc.
Core
Masters
Core Slaves
HDFS
YARN
HBase
33 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Alerting Framework
Alert Type Description Thresholds (units)
WEB Connects to a Web URL. Alert status is based on
the HTTP response code
Response Code (n/a)
Connection Timeout (seconds)
PORT Connects to a port. Alert status is based on
response time
Response (seconds)
METRIC Checks the value of a service metric. Units vary,
based on the metric being checked
Metric Value (units vary)
Connection Timeout (seconds)
AGGREGATE Aggregates the status for another alert % Affected (percentage)
SCRIPT Executes a script to handle the alert check Varies
SERVER Executes a server-side runnable class to handle
the alert check
Varies
34 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Alert Check Counts
• Customize the number of times an alert is
checked before dispatching a notification
• Avoid dispatching an alert notification (email, snmp)
in case of transient issues
35 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Alerts - Configuring the Check Count
 Set globally for all alerts, or override for a specific alert
Global
Setting Alert
Override
36 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Storm Monitoring View
37 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Grafana for Ambari Metrics
 Grafana as a “Native UI” for
Ambari Metrics
 Pre-built Dashboards
Host-level, Service-level
 Supports HTTPS
 System Home, Servers
 HDFS Home, NameNodes,
DataNodes
 YARN Home, Applications,
Job History Server
 HBase Home, Performance
FEATURES DASHBOARDS
38 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Grafana includes pre-built
dashboards for visualizing the most
important cluster metrics.
39 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
The HDFS NameNode
dashboard highlights
file system activity.
40 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Log Search
Search and index HDP logs!
Capabilities
• Rapid Search of all HDP component logs
• Search across time ranges, log levels, and for
keywords
Solr
Logsearch
Ambari
41 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Log Search
WO R K E R
N O D E
L O G
F E E D E R
Solr
LO G
S EA RC H
U I
Solr
Solr
A M BA R I
Java Process
Multi-output Support
Grok filters
Solr Cloud
Local Disk Storage
42 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Future of Ambari
 Cloud features
 Service multi-instance (two ZK quorums)
 Service multi-versions (Spark 1.6 & Spark 2.0)
 YARN assemblies
 Patch Upgrades: upgrade individual components in
the same stack version, e.g., just DN and RM in HDP
2.5.*.* with zero downtime
 Ambari High Availability

More Related Content

What's hot

Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingApache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
DataWorks Summit/Hadoop Summit
 
IoT:what about data storage?
IoT:what about data storage?IoT:what about data storage?
IoT:what about data storage?
DataWorks Summit/Hadoop Summit
 
Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data Free
DataWorks Summit
 
To The Cloud and Back: A Look At Hybrid Analytics
To The Cloud and Back: A Look At Hybrid AnalyticsTo The Cloud and Back: A Look At Hybrid Analytics
To The Cloud and Back: A Look At Hybrid Analytics
DataWorks Summit/Hadoop Summit
 
Running Services on YARN
Running Services on YARNRunning Services on YARN
Running Services on YARN
DataWorks Summit/Hadoop Summit
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
DataWorks Summit
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
DataWorks Summit/Hadoop Summit
 
A Multi Colored YARN
A Multi Colored YARNA Multi Colored YARN
A Multi Colored YARN
DataWorks Summit/Hadoop Summit
 
Row/Column- Level Security in SQL for Apache Spark
Row/Column- Level Security in SQL for Apache SparkRow/Column- Level Security in SQL for Apache Spark
Row/Column- Level Security in SQL for Apache Spark
DataWorks Summit/Hadoop Summit
 
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
DataWorks Summit/Hadoop Summit
 
Hive - 1455: Cloud Storage
Hive - 1455: Cloud StorageHive - 1455: Cloud Storage
Hive - 1455: Cloud Storage
Hortonworks
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
DataWorks Summit
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017
alanfgates
 
Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?
DataWorks Summit/Hadoop Summit
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
DataWorks Summit
 
The Future of Apache Ambari
The Future of Apache AmbariThe Future of Apache Ambari
The Future of Apache Ambari
DataWorks Summit
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
DataWorks Summit/Hadoop Summit
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL compliance
DataWorks Summit/Hadoop Summit
 
Empower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and HadoopEmpower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and Hadoop
DataWorks Summit/Hadoop Summit
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFi
Hortonworks
 

What's hot (20)

Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingApache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
 
IoT:what about data storage?
IoT:what about data storage?IoT:what about data storage?
IoT:what about data storage?
 
Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data Free
 
To The Cloud and Back: A Look At Hybrid Analytics
To The Cloud and Back: A Look At Hybrid AnalyticsTo The Cloud and Back: A Look At Hybrid Analytics
To The Cloud and Back: A Look At Hybrid Analytics
 
Running Services on YARN
Running Services on YARNRunning Services on YARN
Running Services on YARN
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
Evolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage SubsystemEvolving HDFS to Generalized Storage Subsystem
Evolving HDFS to Generalized Storage Subsystem
 
A Multi Colored YARN
A Multi Colored YARNA Multi Colored YARN
A Multi Colored YARN
 
Row/Column- Level Security in SQL for Apache Spark
Row/Column- Level Security in SQL for Apache SparkRow/Column- Level Security in SQL for Apache Spark
Row/Column- Level Security in SQL for Apache Spark
 
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
 
Hive - 1455: Cloud Storage
Hive - 1455: Cloud StorageHive - 1455: Cloud Storage
Hive - 1455: Cloud Storage
 
Apache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and FutureApache Hadoop YARN: Present and Future
Apache Hadoop YARN: Present and Future
 
Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017Hive edw-dataworks summit-eu-april-2017
Hive edw-dataworks summit-eu-april-2017
 
Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?Ingest and Stream Processing - What will you choose?
Ingest and Stream Processing - What will you choose?
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
The Future of Apache Ambari
The Future of Apache AmbariThe Future of Apache Ambari
The Future of Apache Ambari
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Major advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL complianceMajor advancements in Apache Hive towards full support of SQL compliance
Major advancements in Apache Hive towards full support of SQL compliance
 
Empower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and HadoopEmpower Data-Driven Organizations with HPE and Hadoop
Empower Data-Driven Organizations with HPE and Hadoop
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFi
 

Viewers also liked

Comparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBaseComparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBase
DataWorks Summit/Hadoop Summit
 
Rebuilding Web Tracking Infrastructure for Scale
Rebuilding Web Tracking Infrastructure for ScaleRebuilding Web Tracking Infrastructure for Scale
Rebuilding Web Tracking Infrastructure for Scale
DataWorks Summit/Hadoop Summit
 
Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.
DataWorks Summit/Hadoop Summit
 
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
DataWorks Summit/Hadoop Summit
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
DataWorks Summit/Hadoop Summit
 
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
DataWorks Summit/Hadoop Summit
 
The truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on HadoopThe truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on Hadoop
DataWorks Summit/Hadoop Summit
 
Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?
DataWorks Summit/Hadoop Summit
 
The real world use of Big Data to change business
The real world use of Big Data to change businessThe real world use of Big Data to change business
The real world use of Big Data to change business
DataWorks Summit/Hadoop Summit
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
DataWorks Summit/Hadoop Summit
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
DataWorks Summit/Hadoop Summit
 
#HSTokyo16 Apache Spark Crash Course
#HSTokyo16 Apache Spark Crash Course #HSTokyo16 Apache Spark Crash Course
#HSTokyo16 Apache Spark Crash Course
DataWorks Summit/Hadoop Summit
 
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
DataWorks Summit/Hadoop Summit
 
SEGA : Growth hacking by Spark ML for Mobile games
SEGA : Growth hacking by Spark ML for Mobile gamesSEGA : Growth hacking by Spark ML for Mobile games
SEGA : Growth hacking by Spark ML for Mobile games
DataWorks Summit/Hadoop Summit
 
Hadoop Summit Tokyo HDP Sandbox Workshop
Hadoop Summit Tokyo HDP Sandbox Workshop Hadoop Summit Tokyo HDP Sandbox Workshop
Hadoop Summit Tokyo HDP Sandbox Workshop
DataWorks Summit/Hadoop Summit
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
DataWorks Summit/Hadoop Summit
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
DataWorks Summit/Hadoop Summit
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
DataWorks Summit/Hadoop Summit
 
Data science lifecycle with Apache Zeppelin
Data science lifecycle with Apache ZeppelinData science lifecycle with Apache Zeppelin
Data science lifecycle with Apache Zeppelin
DataWorks Summit/Hadoop Summit
 
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
DataWorks Summit/Hadoop Summit
 

Viewers also liked (20)

Comparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBaseComparison of Transactional Libraries for HBase
Comparison of Transactional Libraries for HBase
 
Rebuilding Web Tracking Infrastructure for Scale
Rebuilding Web Tracking Infrastructure for ScaleRebuilding Web Tracking Infrastructure for Scale
Rebuilding Web Tracking Infrastructure for Scale
 
Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.Case study of DevOps for Hadoop in Recruit.
Case study of DevOps for Hadoop in Recruit.
 
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
Use case and Live demo : Agile data integration from Legacy system to Hadoop ...
 
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPANNetwork for the Large-scale Hadoop cluster at Yahoo! JAPAN
Network for the Large-scale Hadoop cluster at Yahoo! JAPAN
 
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
From a single droplet to a full bottle, our journey to Hadoop at Coca-Cola Ea...
 
The truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on HadoopThe truth about SQL and Data Warehousing on Hadoop
The truth about SQL and Data Warehousing on Hadoop
 
Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?Why is my Hadoop cluster slow?
Why is my Hadoop cluster slow?
 
The real world use of Big Data to change business
The real world use of Big Data to change businessThe real world use of Big Data to change business
The real world use of Big Data to change business
 
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
Leveraging smart meter data for electric utilities: Comparison of Spark SQL w...
 
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNEGenerating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
Generating Recommendations at Amazon Scale with Apache Spark and Amazon DSSTNE
 
#HSTokyo16 Apache Spark Crash Course
#HSTokyo16 Apache Spark Crash Course #HSTokyo16 Apache Spark Crash Course
#HSTokyo16 Apache Spark Crash Course
 
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
How to overcome mysterious problems caused by large and multi-tenancy Hadoop ...
 
SEGA : Growth hacking by Spark ML for Mobile games
SEGA : Growth hacking by Spark ML for Mobile gamesSEGA : Growth hacking by Spark ML for Mobile games
SEGA : Growth hacking by Spark ML for Mobile games
 
Hadoop Summit Tokyo HDP Sandbox Workshop
Hadoop Summit Tokyo HDP Sandbox Workshop Hadoop Summit Tokyo HDP Sandbox Workshop
Hadoop Summit Tokyo HDP Sandbox Workshop
 
Hadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash CourseHadoop Summit Tokyo Apache NiFi Crash Course
Hadoop Summit Tokyo Apache NiFi Crash Course
 
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
Introduction to Hadoop and Spark (before joining the other talk) and An Overv...
 
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark ClustersA Container-based Sizing Framework for Apache Hadoop/Spark Clusters
A Container-based Sizing Framework for Apache Hadoop/Spark Clusters
 
Data science lifecycle with Apache Zeppelin
Data science lifecycle with Apache ZeppelinData science lifecycle with Apache Zeppelin
Data science lifecycle with Apache Zeppelin
 
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
 

Similar to Streamline Hadoop DevOps with Apache Ambari

Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
Jayush Luniya
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
DataWorks Summit/Hadoop Summit
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
Alejandro Fernandez
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariManaging Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache Ambari
Jayush Luniya
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariManaging Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache Ambari
Hortonworks
 
Simplified Cluster Operation and Troubleshooting
Simplified Cluster Operation and TroubleshootingSimplified Cluster Operation and Troubleshooting
Simplified Cluster Operation and Troubleshooting
DataWorks Summit/Hadoop Summit
 
Simplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & TroubleshootingSimplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & Troubleshooting
DataWorks Summit/Hadoop Summit
 
Apache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
Apache Ambari: Simplified Hadoop Cluster Operation & TroubleshootingApache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
Apache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
Jayush Luniya
 
Apache Ambari: Past, Present, Future
Apache Ambari: Past, Present, FutureApache Ambari: Past, Present, Future
Apache Ambari: Past, Present, Future
Hortonworks
 
Manage Add-on Services in Apache Ambari
Manage Add-on Services in Apache AmbariManage Add-on Services in Apache Ambari
Manage Add-on Services in Apache Ambari
Jayush Luniya
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache Ambari
Hortonworks
 
Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache Ambari
DataWorks Summit
 
Apache Ambari - What's New in 1.6.0
Apache Ambari - What's New in 1.6.0Apache Ambari - What's New in 1.6.0
Apache Ambari - What's New in 1.6.0Hortonworks
 
Best Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentBest Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop Environment
DataWorks Summit/Hadoop Summit
 
Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0
Hortonworks
 
Managing enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystemManaging enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystem
DataWorks Summit
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoop
Gergely Devenyi
 
Apache Spark and Object Stores
Apache Spark and Object StoresApache Spark and Object Stores
Apache Spark and Object Stores
Steve Loughran
 
Running Enterprise Workloads in the Cloud
Running Enterprise Workloads in the CloudRunning Enterprise Workloads in the Cloud
Running Enterprise Workloads in the Cloud
DataWorks Summit
 
Oracle Cloud - Infrastruktura jako kód
Oracle Cloud - Infrastruktura jako kódOracle Cloud - Infrastruktura jako kód
Oracle Cloud - Infrastruktura jako kód
MarketingArrowECS_CZ
 

Similar to Streamline Hadoop DevOps with Apache Ambari (20)

Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache Ambari
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariManaging Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache Ambari
 
Managing Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache AmbariManaging Enterprise Hadoop Clusters with Apache Ambari
Managing Enterprise Hadoop Clusters with Apache Ambari
 
Simplified Cluster Operation and Troubleshooting
Simplified Cluster Operation and TroubleshootingSimplified Cluster Operation and Troubleshooting
Simplified Cluster Operation and Troubleshooting
 
Simplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & TroubleshootingSimplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & Troubleshooting
 
Apache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
Apache Ambari: Simplified Hadoop Cluster Operation & TroubleshootingApache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
Apache Ambari: Simplified Hadoop Cluster Operation & Troubleshooting
 
Apache Ambari: Past, Present, Future
Apache Ambari: Past, Present, FutureApache Ambari: Past, Present, Future
Apache Ambari: Past, Present, Future
 
Manage Add-on Services in Apache Ambari
Manage Add-on Services in Apache AmbariManage Add-on Services in Apache Ambari
Manage Add-on Services in Apache Ambari
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache Ambari
 
Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache Ambari
 
Apache Ambari - What's New in 1.6.0
Apache Ambari - What's New in 1.6.0Apache Ambari - What's New in 1.6.0
Apache Ambari - What's New in 1.6.0
 
Best Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop EnvironmentBest Practices for Enterprise User Management in Hadoop Environment
Best Practices for Enterprise User Management in Hadoop Environment
 
Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0Apache Ambari - What's New in 1.7.0
Apache Ambari - What's New in 1.7.0
 
Managing enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystemManaging enterprise users in Hadoop ecosystem
Managing enterprise users in Hadoop ecosystem
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoop
 
Apache Spark and Object Stores
Apache Spark and Object StoresApache Spark and Object Stores
Apache Spark and Object Stores
 
Running Enterprise Workloads in the Cloud
Running Enterprise Workloads in the CloudRunning Enterprise Workloads in the Cloud
Running Enterprise Workloads in the Cloud
 
Oracle Cloud - Infrastruktura jako kód
Oracle Cloud - Infrastruktura jako kódOracle Cloud - Infrastruktura jako kód
Oracle Cloud - Infrastruktura jako kód
 

More from DataWorks Summit/Hadoop Summit

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
DataWorks Summit/Hadoop Summit
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
DataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
DataWorks Summit/Hadoop Summit
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
DataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
DataWorks Summit/Hadoop Summit
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
DataWorks Summit/Hadoop Summit
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
DataWorks Summit/Hadoop Summit
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
DataWorks Summit/Hadoop Summit
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
DataWorks Summit/Hadoop Summit
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
DataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
DataWorks Summit/Hadoop Summit
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
DataWorks Summit/Hadoop Summit
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
DataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
DataWorks Summit/Hadoop Summit
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
DataWorks Summit/Hadoop Summit
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
DataWorks Summit/Hadoop Summit
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
DataWorks Summit/Hadoop Summit
 

More from DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage SchemesScaling HDFS to Manage Billions of Files with Distributed Storage Schemes
Scaling HDFS to Manage Billions of Files with Distributed Storage Schemes
 

Recently uploaded

GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
Uni Systems S.M.S.A.
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
Ralf Eggert
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 

Recently uploaded (20)

GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 

Streamline Hadoop DevOps with Apache Ambari

  • 2. 2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Speaker Jayush Luniya Staff Software Engineer @ Hortonworks Apache Ambari PMC jluniya@apache.org
  • 3. 3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Apache Ambari is the open-source platform to provision, manage and monitor Hadoop clusters
  • 4. 4 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
  • 5. 5 © Hortonworks Inc. 2011 – 2016. All Rights Reserved 4 years old
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Exciting Enterprise Features in Ambari 2.4  New Services: Log Search, Zeppelin, Hive LLAP  Role Based Access Control  Management Packs  Grafana UI for Ambari Metrics System  New Views: Zeppelin, Storm
  • 7. 7 © Hortonworks Inc. 2011 – 2016. All Rights Reserved More in Ambari 2.4 • Alerts: Customizable props and thresholds (AMBARI-14898) • Alerts: Retry tolerance (AMBARI-15686) • Alerts: New HDFS Alerts (AMBARI-14800) • New Host Page Filtering (AMBARI-15210) • Remove Service from UI (AMBARI-14759) • Support for SLES 12 (AMBARI-16007) • Stability: Database Consistency Checking (AMBARI-16258) • Customizable Ambari Log + PID Dirs (AMBARI- 15300) • New Version Registration Experience (AMBARI- 15724) • Log Search Technical Preview (AMBARI-14927) • Operational Audit Logging (AMBARI-15241) • Role-Based Access Control (AMBARI-13977) • Automated Setup of Ambari Kerberos through Blueprints (AMBARI-15561) • Automated Setup of Ambari Proxy User (AMBARI-15561) • Customizable Host Reg. SSH Port (AMBARI- 13450) Core Features Security Features • View URLs for bookmarks (AMBARI-15821), View Refresh (AMBARI-15682) • Inherit Cluster Permissions (AMBARI-16177) • Remote Cluster Registration (AMBARI-16274) Views Framework Features
  • 8. 8 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy Secure/L DAP Smart Configs Upgrade Monitor Scale, Extend, Analyze Simply Operations - Lifecycle Ease-of-Use Deploy
  • 9. 9 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy On Premise Ambari UI wizard handles all of these combinations and makes recommendations based on host specs.
  • 10. 10 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy On The Cloud Certified environments Sysprepped VMs Hundreds of similar clusters
  • 11. 11 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Deploy with Blueprints  Systematic way of defining a cluster  Export existing cluster into blueprint /api/v1/clusters/:clusterName?format=blueprint Config s Topology Hosts Cluster
  • 12. 12 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create a cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 13. 13 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create a cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 14. 14 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create a cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 15. 15 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Create a cluster with Blueprints { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1, /hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE” }, { "name" : "RESOURCEMANAGER” }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER” }, … ], "cardinality" : "1+" }, ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } { "blueprint" : "my-blueprint", "host_groups" :[ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] } 1. POST /api/v1/blueprints/my-blueprint 2. POST /api/v1/clusters/my-cluster
  • 16. 16 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Blueprints for Large Scale • Kerberos, secure out-of-the-box • High Availability is setup initially for NameNode, YARN, Hive, Oozie, etc • Host Discovery allows Ambari to automatically install services for a Host when it comes online • Stack Advisor recommendations
  • 17. 17 © Hortonworks Inc. 2011 – 2016. All Rights Reserved POST /api/v1/clusters/MyCluster/hosts [ { "blueprint" : "single-node-hdfs-test2", "host_groups" :[ { "host_group" : "slave", "host_count" : 3, "host_predicate" : "Hosts/cpu_count>1” }, { "host_group" : "super-slave", "host_count" : 5, "host_predicate" : "Hosts/cpu_count>2& Hosts/total_mem>3000000" } ] } ] Blueprint Host Discovery
  • 18. 18 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Comprehensive Security LDAP/AD • User auth • Sync Kerberos • MIT KDC • Keytab management Atlas • Governance • Compliance • Linage & history • Data classification Ranger • Security policies • Audit • Authorization Knox • Perimeter security • Supports LDAP/AD • Sec. for REST/HTTP • SSL
  • 19. 19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Kerberos Ambari manages Kerberos principals and keytabs Works with existing MIT KDC or Active Directory Once Kerberized, handles 1. Adding hosts 2. Adding components to existing hosts 3. Adding services 4. Moving components to different hosts
  • 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Management Packs  Improved Release Management: Decouple Ambari core from stacks releases  Support Add-ons: –Release vehicle for 3rd party services, views –Self-contained release artifacts –Stack is an overlay of multiple management packs
  • 21. 21 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Overlay of Management Packs inherits from 2.3 inherits from 2.4 inherits from 2.5
  • 22. 22 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Management Pack++ Short Term Goals (Ambari 2.4)  Retrofit in Stack Processing Framework  Enable 3rd party to ship add-on services Future Goals  Management Pack Framework  Deliver Views
  • 23. 23 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Role Based Access Control (RBAC) As Ambari & organizations grow, so do security needs Ambari integrates with external authentication systems & LDAP
  • 24. 24 © Hortonworks Inc. 2011 – 2016. All Rights Reserved RBAC Terms Users belong to groups A group has a role Users can also have additional roles Roles are applied to Resources. E.g., Ambari, particular Cluster, particular View Roles have permissions e.g., add services to cluster
  • 25. 25 © Hortonworks Inc. 2011 – 2016. All Rights Reserved New RBAC Roles only view ↑, except change configs ↑, except alter cluster topology or install components Ambari Admin Cluster Admin Cluster Op Service Admin Service Op Read-Only ↑, except add services, Kerberos, manage alerts & upgrades ↑, except manage permissions all
  • 26. 26 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Service Layout Common Services Stack Override
  • 27. 27 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Stack Advisor Kerberos HTTPS Zookeeper Servers Memory Settings … High Availability atlas.rest.address = http(s)://host:port # Atlas Servers atlas.enabletTLS = true|false atlas.server.http.port = 21000 atlas.server.https.port = 21443 Example Configurations
  • 28. 28 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Background: Upgrade Terminology Manual Upgrade The user follows instructions to upgrade the stack Incurs downtime
  • 29. 29 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Background: Upgrade Terminology Manual Upgrade The user follows instructions to upgrade the stack Incurs downtime Rolling Upgrade Automated Upgrades one component per host at a time Preserves cluster operation and minimizes service impact
  • 30. 30 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Background: Upgrade Terminology Express Upgrade Automated Runs in parallel across hosts Incurs downtime Manual Upgrade The user follows instructions to upgrade the stack Incurs downtime Rolling Upgrade Automated Upgrades one component per host at a time Preserves cluster operation and minimizes service impact
  • 31. 31 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Automated Upgrade: Rolling or Express Check Prerequisites Review the prereqs to confirm your cluster configs are ready Prepare Take backups of critical cluster metadata Perform Upgrade Perform the HDP upgrade. The steps depend on upgrade method: Rolling or Express Register + Install Register the HDP repository and install the target HDP version on the cluster Finalize Finalize the upgrade, making the target version the current version
  • 32. 32 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Process: Rolling Upgrade ZooKeeper Ranger Hive Oozie Falcon Kafka Knox Storm Slider Flume Finalize or Downgrade Clients HDFS, YARN, MR, Tez, HBase, Pig. Hive, etc. Core Masters Core Slaves HDFS YARN HBase
  • 33. 33 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Alerting Framework Alert Type Description Thresholds (units) WEB Connects to a Web URL. Alert status is based on the HTTP response code Response Code (n/a) Connection Timeout (seconds) PORT Connects to a port. Alert status is based on response time Response (seconds) METRIC Checks the value of a service metric. Units vary, based on the metric being checked Metric Value (units vary) Connection Timeout (seconds) AGGREGATE Aggregates the status for another alert % Affected (percentage) SCRIPT Executes a script to handle the alert check Varies SERVER Executes a server-side runnable class to handle the alert check Varies
  • 34. 34 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Alert Check Counts • Customize the number of times an alert is checked before dispatching a notification • Avoid dispatching an alert notification (email, snmp) in case of transient issues
  • 35. 35 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Alerts - Configuring the Check Count  Set globally for all alerts, or override for a specific alert Global Setting Alert Override
  • 36. 36 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Storm Monitoring View
  • 37. 37 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Grafana for Ambari Metrics  Grafana as a “Native UI” for Ambari Metrics  Pre-built Dashboards Host-level, Service-level  Supports HTTPS  System Home, Servers  HDFS Home, NameNodes, DataNodes  YARN Home, Applications, Job History Server  HBase Home, Performance FEATURES DASHBOARDS
  • 38. 38 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Grafana includes pre-built dashboards for visualizing the most important cluster metrics.
  • 39. 39 © Hortonworks Inc. 2011 – 2016. All Rights Reserved The HDFS NameNode dashboard highlights file system activity.
  • 40. 40 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Log Search Search and index HDP logs! Capabilities • Rapid Search of all HDP component logs • Search across time ranges, log levels, and for keywords Solr Logsearch Ambari
  • 41. 41 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Log Search WO R K E R N O D E L O G F E E D E R Solr LO G S EA RC H U I Solr Solr A M BA R I Java Process Multi-output Support Grok filters Solr Cloud Local Disk Storage
  • 42. 42 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Future of Ambari  Cloud features  Service multi-instance (two ZK quorums)  Service multi-versions (Spark 1.6 & Spark 2.0)  YARN assemblies  Patch Upgrades: upgrade individual components in the same stack version, e.g., just DN and RM in HDP 2.5.*.* with zero downtime  Ambari High Availability

Editor's Notes

  1. Single pane of glass. Provision on the cloud Metrics Services Config Security Alerts Host management Views framework
  2. 0.9 in Sep 2012 1.5 in April 2014 1.6 in July 2014 2.3.0 was not used 2.4.0 is slated with a ton of new features. 2179 Jiras. Cadence is 2-3 major releases per year, with follow up maintenance releases in the months after. http://jsfiddle.net/mp8rqq5x/2/
  3. Log Search : Solr, Logfeeder (similar to Logstash), and Grafana UI Zeppelin for data exploration and visualization that can plugin to multiple data backends Role Based Access Control
  4. Alerts, Stability EU/RU experience LogSearch Security automation Views Framework ease of use
  5. Deploy: Blueprints with Host Discovery Secure: Kerberos, LDAP sync Smart Configs: stack advisor, painful to configure a thousand related knobs. E.g, change zoozkeeper quorum then that has an effect on several services. Log folder, then affects log search. Upgrade: Rolling and Express Upgrade, get patches Monitor: Ambari Alerts, Ambari Metrics Analyze, Scale, Extend: Views, Management Packs
  6. Cloudbreak can install on Amazon EC2, MSFT Azure, Cluster install takes 5-10 mins, mostly downloading packages, installing bits, and starting services.
  7. Used by HDInsight (Microsoft Azure) and Hortonworks QA Allow cluster creation or scaling to be started via the REST API prior to all/any hosts being available. As hosts register with Ambari server they will be matched to request host groups and provisioned according to the requested topology Allow host predicates to be specified along with host count to provide more flexibility in matching hosts to host groups. This will allow for host flavors where different host groups are matched to different host flavors Break up the current monolithic provisioning request into a request for each host operation. For example, install on host A, start on host A, install on hostB, etc. This will allow hosts to make progress even when another host encounters a failure. Allow a host count to be specified in the cluster creation template instead of host names. This is documented in https://issues.apache.org/jira/browse/AMBARI-6275 Install a cluster with two API calls
  8. The blueprint contains the configs, assignment of topology to host group, stack version The creation actually assigns hosts to each host group.
  9. The blueprint contains the configs, assignment of topology to host group, stack version The creation actually assigns hosts to each host group.
  10. The blueprint contains the configs, assignment of topology to host group, stack version The creation actually assigns hosts to each host group.
  11. The blueprint contains the configs, assignment of topology to host group, stack version The creation actually assigns hosts to each host group.
  12. Dynamic availability Allow host_count to be specified instead of host_names As hosts register, they will be matched to the request host groups and provisioned according to to the requested topology When specifying a host_count, a predicate can also be specified for finer-grained control
  13. Dynamic availability Allow host_count to be specified instead of host_names As hosts register, they will be matched to the request host groups and provisioned according to to the requested topology When specifying a host_count, a predicate can also be specified for finer-grained control 3 Terabytes since units is in MB
  14. Kerberos: LDAP/AD Services: Ranger, Atlas, Knox. Ranger: setup security policies on who can access what. Authorization of audit files, plugins for other services like HDFS, Hive, Storm, etc. Atlas: Lineage of data, compliance, especially in health care and financial institutions Knox: perimeter security for HTTP and REST calls in the Hadoop Services. Works with SSL, Kerberos. Kerberos Key Distribution Center so we can define service principals and keytabs.
  15. Can use existing KDC (key distribution center) or install one for Hadoop Hadoop uses a rule-based system to create mappings between service principals and their related UNIX username
  16. As Ambari grows and organizations grow, so do security needs Users have fine-grained roles over the cluster and individual views. Granular authorization checks to distribute the responsibilities and privileges of authenticated users
  17. Configuration files Package with python scripts and templates Alert definitions Kerberos configurations, principals, identities, keytabs Meta data Metric details UI controls and widgets
  18. Stack Advisor, can now ship the recommendations for a service with the service itself, instead of a monolithic stack advisor for the entire stack. Makes it easier to integrate customer services
  19. Express Upgrade: fasted method to upgrade the stack since upgrades an entire component in batches of 100 hosts at a time Rolling Upgrade, one component at a time per host, which can take up to 1 min. For a 100 node cluster with
  20. Express Upgrade: fasted method to upgrade the stack since upgrades an entire component in batches of 100 hosts at a time Rolling Upgrade, one component at a time per host, which can take up to 1 min. For a 100 node cluster with
  21. Express Upgrade: fasted method to upgrade the stack since upgrades an entire component in batches of 100 hosts at a time Rolling Upgrade, one component at a time per host, which can take up to 1 min. For a 100 node cluster with
  22. This Grafana instance is specifically for AMS, not meant to be general-purpose If customer is already using Grafana, this is not a replacement. Grafana will support read-only access for anonymous users, and HTTPS Aggregates across entire cluster, filter by host, top/bottom x, functions like avg/sum/min/max, filter by date range
  23. This is not HDP Search, it is not something that the customer has to separately license, it is an embedded Solr instance
  24. Agent/Collection process running on each host Written in Java Tails all service log files Parses logs using Grok/regex. Can merge multiple line logs, e.g. stack trace On restart, can resume from last read line. Uses checkpoint files to maintain state Extendable design to send logs to multiple destination type. Currently can send logs to Solr and Kafka
  25. Major themes. Goal is to make it as good as Australian pies