Managing your Hadoop Clusters with Ambari


Published on

Apache Ambari provides a 100% open source and intuitive set of tools to monitor, manage and efficiently provision your Apache Hadoop cluster. Ambari simplifies the operation and hides the complexity of Hadoop, making Hadoop appear like a single, cohesive data platform. Hadoop cluster provisioning and ongoing management can be a complicated task, especially when there are hundreds or thousands of nodes involved. Ambari allows you to control Hadoop cluster services from a single point. In this session, we will provide an overview of the Apache Ambari key features, architecture and web service-based APIs.

Published in: Technology

  1. Apache Ambari – Management, Monitoring and Operations
     Mahadev Konar, Co-Founder & Software Engineer, @mahadevkonar (@hortonworks)
     Pramod Thangali, Director of Engineering, @pramodthangali (@hortonworks)
  2. Hello!
     • Hadoop – Contributor since 2006 – Committer/PMC
     • ZooKeeper – Contributor since 2008 – Committer/PMC
     • Ambari – Contributor since 2012 – Committer/PPMC
  3. Ambari: Easiest way to manage Hadoop clusters
     • Cluster Operations: Core capabilities to include the critical tasks associated with provisioning and operating Hadoop clusters.
     • Job Diagnostics: Insight into job performance and reduce the burden on specialized Hadoop skills and knowledge.
     • Extensible Platform: Integration and customization points so Hadoop can interoperate with existing operational tooling.
  4. Architecture (diagram; only its labels survive in this transcript)
     • Web client: Ambari Web, calling the REST API (/clusters)
     • Ambari Server: REST API, Request Dispatcher, Orchestrator, Cluster Configurations, configurable Auth Provider (LDAP, DB User), RDBMS
     • Monitoring: Ganglia Collector
     • Hosts: Ambari Agent/s, Repo, Bootstrap or Manual install
     • Technology labels on the diagram: JS, java, postgres, python, php, puppet
  5. APIs, APIs and APIs
     Base URL: http://{your.ambari.server}:8080
     • Clusters: /api/v1/clusters
     • Services: /api/v1/clusters/mycluster/services
     • Components: /api/v1/clusters/mycluster/services/hdfs/components/namenode
     • Hosts: /api/v1/clusters/mycluster/hosts
     • Host-Components: /api/v1/clusters/mycluster/hosts/host1/host_components
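
The endpoint paths listed on the API slide follow one pattern: a fixed `/api/v1` root followed by alternating collection names and resource names. The sketch below builds those URLs and, optionally, fetches one as JSON. The helper names (`resource_url`, `get_json`), the example host name, and the use of HTTP basic authentication with caller-supplied credentials are assumptions made for illustration; the slides only give the paths and the default port 8080.

```python
import base64
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

API_ROOT = "/api/v1"  # root path taken from the slide


def resource_url(server, *segments, port=8080):
    """Build an Ambari REST URL from the path segments below /api/v1.

    `server` and `segments` are supplied by the caller; each segment is
    percent-encoded so that unusual cluster or host names stay valid.
    """
    path = "/".join(quote(str(segment), safe="") for segment in segments)
    return f"http://{server}:{port}{API_ROOT}/{path}"


def get_json(url, user, password):
    """Fetch a URL and decode the JSON body.

    Assumption: the server accepts HTTP basic authentication.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = Request(url, headers={"Authorization": f"Basic {token}"})
    with urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    host = "ambari.example.com"  # hypothetical server name
    print(resource_url(host, "clusters"))
    print(resource_url(host, "clusters", "mycluster", "services"))
    print(resource_url(host, "clusters", "mycluster", "hosts", "host1",
                       "host_components"))
```

Only `resource_url` runs in the demo; `get_json` needs a reachable Ambari server and valid credentials.
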
  6. How to plug into the APIs
     • Consistent front-end REST API for
        – Accessing Hadoop and System metrics
        – Managing cluster resources
     • Service provider plugin architecture to enable
        – Adding new components and resources
        – Providing alternative service providers to manage existing components
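
The plugin idea on this slide, one front-end API backed by interchangeable providers, can be illustrated with a small registry. This is a conceptual sketch only: `ResourceProvider`, `ProviderRegistry`, and the two example providers are hypothetical names invented here, and they are not Ambari's actual provider interface.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ResourceProvider(ABC):
    """Hypothetical provider interface: returns resources of one type."""

    @abstractmethod
    def get_resources(self) -> List[dict]:
        ...


class ProviderRegistry:
    """Maps a resource type to the provider that currently serves it."""

    def __init__(self) -> None:
        self._providers: Dict[str, ResourceProvider] = {}

    def register(self, resource_type: str, provider: ResourceProvider) -> None:
        # Registering again for the same type swaps in an alternative provider.
        self._providers[resource_type] = provider

    def get(self, resource_type: str) -> List[dict]:
        return self._providers[resource_type].get_resources()


class DefaultHostProvider(ResourceProvider):
    def get_resources(self) -> List[dict]:
        return [{"host_name": "host1", "source": "default"}]


class CustomHostProvider(ResourceProvider):
    def get_resources(self) -> List[dict]:
        return [{"host_name": "host1", "source": "custom"}]


registry = ProviderRegistry()
registry.register("hosts", DefaultHostProvider())
registry.register("hosts", CustomHostProvider())  # alternative provider
print(registry.get("hosts"))
```

The front end always calls `registry.get("hosts")`; which backend answers is a registration detail, which is the property the slide describes.
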
  7. Enabling Integration Scenarios
     • Expand REST API
        – "Zero touch" installs: Blueprint, Ambari, Cluster
        – "Lights out" workflows: Alarm, Decom, Inform
        – Avoid configuration "drift": Change, Apply, Callback
        – "Bring my own" scenarios: Custom Provider
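
Configuration "drift", one of the scenarios named on this slide, means that the configuration actually applied on a cluster no longer matches the intended one. A minimal way to detect it is to diff two key-value maps. The function name `find_drift` and the sample property names and values are illustrative assumptions; the slides do not describe how Ambari performs this check.

```python
from typing import Dict, Optional, Tuple

# Maps a property name to its (desired, actual) pair; None means "not set".
Drift = Dict[str, Tuple[Optional[str], Optional[str]]]


def find_drift(desired: Dict[str, str], actual: Dict[str, str]) -> Drift:
    """Return every property whose desired and actual values differ."""
    drift: Drift = {}
    for key in sorted(set(desired) | set(actual)):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


desired = {"dfs.replication": "3", "io.file.buffer.size": "131072"}
actual = {"dfs.replication": "2", "io.file.buffer.size": "131072",
          "local.override": "true"}
print(find_drift(desired, actual))
```

An empty result means the two maps agree; any entry is a candidate for the change, apply, and callback cycle listed on the slide.
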
  8. Ambari + Teradata Viewpoint Integration
     • Integrate with Ambari to provide Monitoring REST API
     • Use Ganglia for metrics collection
     • No Ambari mgmt controls
     • No Ambari UI
     • No Ambari Agents
     • (Diagram labels: HDP Cluster, Ambari, Custom)
  9. Extending Ambari – Adding new services
     • Stack Definitions
        – Define which Services are available (services)
        – Define where to get the packages (repos)
     • (Diagram: Stack A and Stack B each have their own repos and services; Stack C extends Stack A and adds services.)
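
The stack slide describes a stack as a set of services plus package repositories, with one stack able to extend another. The sketch below models that relationship and resolves the effective service list for a stack by walking its `extends` chain. The `Stack` class, the merge rule (parent services first, then the child's additions), and all service names and repository URLs are assumptions for illustration; the actual stack definition format is not shown in the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Stack:
    name: str
    repos: List[str] = field(default_factory=list)     # where packages come from
    services: List[str] = field(default_factory=list)  # services defined here
    extends: Optional[str] = None                       # name of the parent stack


def resolve_services(stacks: Dict[str, Stack], name: str) -> List[str]:
    """Collect services from the root ancestor down to the named stack."""
    chain: List[Stack] = []
    seen = set()
    current: Optional[str] = name
    while current is not None:
        if current in seen:
            raise ValueError(f"cyclic 'extends' chain at stack {current!r}")
        seen.add(current)
        chain.append(stacks[current])
        current = stacks[current].extends

    resolved: List[str] = []
    for stack in reversed(chain):  # parents first, then children
        for service in stack.services:
            if service not in resolved:
                resolved.append(service)
    return resolved


stacks = {
    "A": Stack("A", repos=["http://repo.example.com/a"],
               services=["HDFS", "MAPREDUCE", "ZOOKEEPER"]),
    "B": Stack("B", repos=["http://repo.example.com/b"],
               services=["HDFS", "HBASE"]),
    "C": Stack("C", repos=["http://repo.example.com/c"],
               services=["MYSERVICE"], extends="A"),
}
print(resolve_services(stacks, "C"))
```

Whether a child stack also inherits its parent's repos is left open here, since the slide does not say.
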
  10. Demo
  11. Ambari Roadmap
     • Releases
        – Ambari 1.2.0
        – Ambari 1.2.1
        – Ambari 1.2.2 (under vote)
        – Ambari 1.2.3 (under development for stability fixes)
     • Ambari 1.3.0
     • Ambari 1.4.0
     • Futures
  12. Ambari 1.2.2 – What's new
     • Bug fixes to the 1.2 line
     • Host-level Alerts
     • Paging Controls on Zoomed Graphs
     • Change Ambari Web Port #
     • Support for AD Authentication
     • Quicker Install/Start/Test progress display
     • Upgrade Paths – 1.2.0 to 1.2.2 and 1.2.1 to 1.2.2
     • Performance – Optimized deployment + monitoring on clusters
  13. Host Level Alerts
     • Host-specific alerts
     • For example:
        – TaskTracker process, DataNode process, DataNode Storage
  14. Paging Controls on Zoomed Graphs
     • Last 1 Hour
     • Last 2 Hours
     • Last 4 Hours
     • Last 12 Hours
     • Last 24 Hours
     • Last 1 Week
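
Paging over a zoomed graph amounts to sliding a fixed-width time window backwards. The sketch below maps the range labels from this slide to durations and computes the window for a given page. The function name `window` and the paging rule (each page steps back by one full range) are assumptions; the slides do not specify how Ambari Web pages through graphs.

```python
from datetime import datetime, timedelta
from typing import Tuple

# Range labels as they appear on the slide.
RANGES = {
    "Last 1 Hour": timedelta(hours=1),
    "Last 2 Hours": timedelta(hours=2),
    "Last 4 Hours": timedelta(hours=4),
    "Last 12 Hours": timedelta(hours=12),
    "Last 24 Hours": timedelta(hours=24),
    "Last 1 Week": timedelta(weeks=1),
}


def window(label: str, now: datetime, pages_back: int = 0) -> Tuple[datetime, datetime]:
    """Return (start, end) of the time window for the given range and page.

    pages_back=0 is the most recent window; each further page moves the
    window back by exactly one range width (an assumed paging rule).
    """
    if pages_back < 0:
        raise ValueError("pages_back must be zero or positive")
    span = RANGES[label]
    end = now - span * pages_back
    return end - span, end


now = datetime(2013, 4, 1, 12, 0)
print(window("Last 2 Hours", now))                 # most recent two hours
print(window("Last 2 Hours", now, pages_back=1))   # the two hours before that
```
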
  15. Ambari 1.3
     • Manage Kerberos Secure Cluster
     • Managed Stack Upgrade
     • Disaster protection service via HDFS mirroring
     • Multi-tenancy support via Capacity Scheduler
     • Improved Configuration Mgmt with host-level controls
     • Master service mobility to ease hardware maintenance
  16. Ambari 1.3 continued
     • External group mappings (LDAP/AD)
     • Support API and external driven cluster Installation Management
     • Enhanced Job Diagnostic visualizations
     • HUE Support for Hadoop data workers
  17. Multi Tenancy Support
     • Compute resource quotas
        – Managed via Capacity Scheduler and used in production at Yahoo!
        – Basic configuration support in Ambari 1.3
        – Visualization for Queue Diagnostics in Ambari 1.4 (target)
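
Compute quotas in the Capacity Scheduler are expressed as per-queue capacity percentages, and the scheduler expects the capacities of sibling queues to total 100 percent. A configuration tool can check this before applying a change. The sketch below is a generic validation helper; the function name `validate_queue_capacities` and the queue names are hypothetical, and it does not reflect how Ambari itself validates scheduler settings.

```python
from typing import Dict


def validate_queue_capacities(capacities: Dict[str, float]) -> None:
    """Raise ValueError unless sibling queue capacities are valid percentages
    that add up to 100."""
    if not capacities:
        raise ValueError("at least one queue is required")
    for queue, percent in capacities.items():
        if not 0 <= percent <= 100:
            raise ValueError(f"queue {queue!r}: capacity {percent} is outside 0-100")
    total = sum(capacities.values())
    if abs(total - 100.0) > 1e-6:  # tolerate floating-point rounding
        raise ValueError(f"queue capacities total {total}, expected 100")


# Hypothetical tenants sharing one cluster.
validate_queue_capacities({"production": 70, "adhoc": 20, "dev": 10})
print("capacities are consistent")
```
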
  18. Multi Tenancy Support – User and Data Management
     • Disk and namespace quotas
     • Support for quota mgmt in Ambari 2.x (target)
  19. Release Schedule and Info
     • Ambari 1.2.2 (pending release)
     • Ambari 1.2.3 (2-3 weeks)
     • Ambari 1.3.0 (May)
     • Ambari 1.4.0 (TBD)
     • Links
        – Website
        – Mailing Lists
        – Development Wiki
        – Looking for Apache contributors
  20. Thank you.
     @mahadevkonar, @pramodthangali
     Architecting the Future of Big Data
     © Hortonworks Inc. 2012