Managing your Hadoop Clusters with Apache Ambari


Deploying, configuring, and managing large Apache Hadoop and HBase clusters can be quite complex. Once you have your clusters, keeping them up and running and making sure that SLAs are met presents even more challenges and headaches to Hadoop operators. To make matters worse, managing upgrades can be a nightmare. Hadoop users face their own share of difficulties, such as slow-running jobs with no indication of why they are slow. For third-party software vendors interested in incorporating Hadoop management and monitoring capabilities, there does not seem to be an obvious, easy solution. Apache Ambari aims to make the lives of Hadoop operators, users, and integrators simpler by providing a management interface to do all of that and more. This session presents uses of Ambari's Web UI for Hadoop operators (deploying, managing, and monitoring) as well as Hadoop users (job analytics). The talk also touches on Ambari's REST API and how it is used in the real world. The session concludes with Ambari's future roadmap, including queue management, upgrade, disaster recovery, high availability, and more.

Published in: Technology



  • 1. © Hortonworks Inc. 2013 Managing Your Hadoop Clusters with Apache Ambari Hadoop Summit June 2013
  • 2. Hello! • Yusaku Sako – Committer / PPMC member, Apache Ambari – Member of Technical Staff @ Hortonworks – • Jeff Sposetti – Contributor, Apache Ambari – Director of Product Management @ Hortonworks –
  • 3. Today, We’ll Go Over… • Intro • Open Source Activity • Demo • Futures • Architecture • Recent Developments • Q & A
  • 4. Ambari: Enterprise Hadoop Operations. Ambari is the only 100% open source framework for provisioning, managing, and monitoring Apache Hadoop clusters. [Diagram labels: Hadoop (Storage & Process at Scale); Ambari (Provision, Manage, Monitor); Ambari Web]
  • 5. Features Today • Provisioning: Simplified deployment across platforms – Wizard-driven cluster install experience – Deploy 10s, 100s, or 1000s of Hadoop servers – Cloud, virtual, and physical environments • Managing: Consistent controls across the Stack – Single point for cluster operations – Customize w/o dealing with Hadoop complexities – Advanced configurations and host controls • Monitoring: Visibility into key cluster metrics – Single pane of glass for Hadoop & System status – Pre-configured metrics & alerts
  • 6. Apache Ambari – 100% Open Source! • Active community • 50+ Contributors / 20+ Committers • 140+ Ambari User Group Members • Steady progress/release cycle
        Release Version   Release Date   JIRAs Resolved
        0.9.0             Sep 2012       402
        1.2.0             Feb 2013       441
        1.2.1             Mar 2013       134
        1.2.2             Apr 2013       106
        1.2.3             Jun 2013       515
        1.2.4             Jul 2013       109+
        1.2.5             Jul 2013       131+
        Slide callouts: Current Release, Today’s Demo
  • 7. Ambari System Architecture [Diagram labels: Ambari Web; REST /clusters; Ambari Server; DB; Hosts, each with Agent and gmond; Ganglia Server with Agent, gmetad, and gmond; Nagios Server with Agent]
  • 8. Demo
  • 9. Futures
  • 10. Host Group Configuration Controls • Set custom configuration properties at the host level for one or more hosts • Important for handling “heterogeneous” clusters • AMBARI-1509 and AMBARI-1370 [Diagram labels: HEAPSIZE=1024; HEAPSIZE=2048]
  • 11. Cluster Blueprints • Perform “Headless Install” • Perform “Cluster Takeover” • Export blueprint from cluster • Boot & save wizard w/blueprint • AMBARI-1783 [Diagram labels: Blueprint (<stack>, <host>, <service>, <component>, <config>); Host Manifest (<host>, <meta>); Service Configs (<props>); Ambari Server]
  • 12. Hadoop 2.0 Support • Provision, manage, and monitor the Hadoop 2.0 Stack • HDFS2, YARN, Tez • Rolling Cluster Upgrades – Enable cluster upgrade, one host at a time, in such a way that the services and resources offered by the cluster are always available throughout the upgrade process
  • 13. Ambari Architecture [Diagram labels: Ambari Web (javascript; Web Client, 100% REST); REST API for integration; REST API (/clusters, /services, /hosts, /workflows/jobs, /users, …); Ambari Server (java); Request Dispatcher; Orchestrator; SPI; DB (RDBMS); Cluster Configurations; Auth Provider; User Store (RDBMS, AD/LDAP); Pluggable Service Providers; Metrics; Alerts; Data Mgmt; ganglia; nagios; jmx; falcon; Ambari Agents; python; puppet]
  • 14. REST API – Centralized & Consistent [Diagram labels: Ambari REST API on :8080; HTTP GET, POST, PUT, DELETE; HTTP Status Code / JSON; Cluster; Hosts / Services; Configurations; Metrics; Alerts; Job History; Config DB; Config files (core-site.xml, *-site.xml, …); JMX; Realtime; Historical; Ganglia Server; Nagios Server; Job History DB; …]
  • 15. REST API Resource Tree • Resources • Clusters • Services (HDFS, MR, HIVE…) • Components (NAMENODE, DATANODE…) • Hosts • Host Components (DATANODE on host1…) • Configurations (core-site, mapred-site, …) • Workflows (Hive queries, Pig scripts, MR programs) • Jobs (spawned MR jobs…) • Task Attempts (Map, Shuffle, Reduce…) • Stacks (HDP, other distros)
  • 16. Ambari + Teradata Viewpoint Integration • Ambari = Key enabler for integrating Hadoop monitoring capabilities into Viewpoint • Viewpoint uses the Ambari REST API and Custom Service Providers to get Hadoop metrics from a non-Ambari deployed cluster
  • 17. Stack Definitions • Design Goals – Ambari should be able to support choice of Hadoop stacks – Ambari should enable adding new components to an existing stack • Define which Services are available (services) • Define where to get the packages (repos) [Diagram labels: Stack A (repos, services); Stack B (repos, services); Stack C extends Stack B (repos, services)]
  • 18. Ambari + Red Hat GlusterFS Integration • Using Ambari to deploy / manage a cluster with a distributed file system other than HDFS – HCFS: GlusterFS as first implementation – Pluggability with other HCFS implementations – See AMBARI-1817 [Diagram labels: MapReduce; Hive; HBase; Pig; Distributed File System: HDFS, GlusterFS, Other HCFS, …]
  • 19. Ambari + Accumulo Integration • Using Ambari to deploy / manage a cluster with Accumulo – Google Summer of Code project – See AMBARI-1930 [Diagram labels: MapReduce; Hive; HBase; Pig; Distributed File System]
  • 20. Ambari + Splunk Integration • Head over to Splunk’s Expo booth to learn about Ambari integrated into Splunk’s Management UI
  • 21. Get Involved! • Project Website – • Check out Ambari – Try installing your own cluster! (See project website for instructions) • Mailing Lists – – • IRC Channel – @apacheambari
  • 22. Thanks! • Questions?
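
The REST API described on slides 14 and 15 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the talk. Only port 8080, the HTTP verbs, and the resource names come from the slides. The `/api/v1` prefix, the `host_components` path segment, HTTP basic authentication, and the `X-Requested-By` header on modifying requests are assumptions based on how Ambari's REST API is commonly documented. The host name, cluster name, credentials, and the helper names `resource_url` and `build_request` are hypothetical.

```python
import base64
import urllib.request
from urllib.parse import quote

# Assumption: Ambari serves its REST resources under this prefix.
API_ROOT = "/api/v1"


def resource_url(base, *segments):
    """Join resource-tree segments (e.g. clusters/c1/services/HDFS) into a URL."""
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return f"{base.rstrip('/')}{API_ROOT}/{path}"


def build_request(url, user, password, method="GET", body=None):
    """Prepare (but do not send) an HTTP request with basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}"}
    if method != "GET":
        # Assumption: the server expects this header on POST, PUT, and DELETE.
        headers["X-Requested-By"] = "ambari"
    return urllib.request.Request(url, data=body, headers=headers, method=method)


if __name__ == "__main__":
    # Hypothetical server and cluster names, used only for illustration.
    base = "http://ambari.example.com:8080"
    service = resource_url(base, "clusters", "c1", "services", "HDFS")
    host_component = resource_url(
        base, "clusters", "c1", "hosts", "host1", "host_components", "DATANODE"
    )
    print(service)
    print(host_component)
    # Sending the request would be urllib.request.urlopen(build_request(...)),
    # which needs a reachable Ambari server and is therefore left out here.
```

The URL helper mirrors the resource tree from slide 15: each level (cluster, service, component, host) becomes one path segment, so a client can walk from a cluster down to a single host component with the same call. Keeping request construction separate from sending makes the sketch usable without a live server.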