Autoscaling of Hadoop on OpenStack

Hadoop on OpenStack

Published in: Engineering

  1. AutoScaling of Hadoop on OpenStack
  2. Team: Vincent Kuri, Samyukta Rao, Sharan Srivatsa
  3. Pain points
     • Idle resources are wasted
     • Scaling up and scaling down is manual
     • No flexible framework for scaling
  4. Objective
     • Rapidly provision Hadoop clusters on OpenStack by automating the provisioning and configuration of the machines
     • Provide a framework for scaling the Hadoop cluster up and down
  5. Hadoop: Hadoop is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
  6. Hadoop: what is going on inside?
  7. OpenStack
     • Open source cloud computing platform
     • Infrastructure as a Service (IaaS)
     • Started by Rackspace and NASA
  8. Auto clustering: rapidly provision Hadoop clusters on OpenStack by automating the provisioning and configuration of the machines
  9. Architecture
  10. Distributed systems analytics
     • Ganglia Monitoring System: Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and grids.
     • Ceilometer: Ceilometer is a native OpenStack component that measures the system resources used by clients' instances, which makes billing simpler.
     • Ambari: an Apache project aimed at making Hadoop management simpler by developing software for provisioning, managing and monitoring Hadoop clusters.
  11. Where are we now?
     • Auto clustering scripts
     • Scaling up scripts
     • Ganglia collecting metrics
     • Multinode DevStack and OpenStack production environments (Grizzly) set up
  12. What's ahead?
     • GUI frontend for auto scaling
     • Support for custom Hadoop clusters
     • Abstraction of the cluster provisioning framework to deploy any sort of cluster, not just Hadoop
     • Easy integration with multiple frameworks for enhanced monitoring
  13. Thank you. We would like to thank you for this opportunity.
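
The slides describe a framework that scales a Hadoop cluster up and down using collected metrics, but they do not show the decision logic. The following is a minimal sketch of one way such a rule could look, not the team's actual implementation. It assumes CPU utilisation samples in the range 0.0 to 1.0 are already available (for example from a monitoring system such as Ganglia). The names `ScalingPolicy` and `desired_node_count` and all threshold values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical thresholds; none of these values come from the slides."""
    scale_up_above: float = 0.80    # mean CPU utilisation that triggers a scale-up
    scale_down_below: float = 0.30  # mean CPU utilisation that triggers a scale-down
    min_nodes: int = 2              # never shrink the cluster below this size
    max_nodes: int = 10             # never grow the cluster beyond this size
    step: int = 1                   # nodes added or removed per decision


def desired_node_count(cpu_samples, current_nodes, policy=ScalingPolicy()):
    """Return the worker count the cluster should move to.

    cpu_samples: recent per-node CPU utilisation values in [0.0, 1.0].
    current_nodes: number of worker nodes currently in the cluster.
    """
    if not cpu_samples:
        return current_nodes  # no data: leave the cluster as it is

    load = mean(cpu_samples)
    if load > policy.scale_up_above:
        target = current_nodes + policy.step
    elif load < policy.scale_down_below:
        target = current_nodes - policy.step
    else:
        target = current_nodes

    # Clamp the result to the configured cluster size limits.
    return max(policy.min_nodes, min(policy.max_nodes, target))


if __name__ == "__main__":
    print(desired_node_count([0.91, 0.88, 0.95], current_nodes=3))
    print(desired_node_count([0.10, 0.15], current_nodes=3))
```

In a real deployment the returned count would be handed to whatever provisions or removes the OpenStack instances. That step is deliberately left out here because the slides do not describe it.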
