Hong Kong OpenStack Summit: Savanna - Hadoop on OpenStack

Published in Health & Medicine, Technology

Transcript

  • 1. Savanna: Hadoop on OpenStack. Ilya Elterman (Mirantis), Matthew Farrellee (Red Hat), Sergey Lukjanov (Mirantis)
  • 2. Agenda ● Savanna Overview ● Current state ○ EDP overview ○ other features ● Roadmap ● Live Demo
  • 4. OpenStack Data Processing - Savanna. Mission: To provide the OpenStack community with an open, cutting-edge, performant and scalable data processing stack and associated management interfaces. ● Provision and operate Hadoop clusters ● Schedule and operate Hadoop jobs
  • 5. Hadoop - Big Data Platform
  • 6. Popularity: Hadoop and OpenStack (Google Trends chart) http://www.google.com/trends/explore?q=hadoop+openstack#q=openstack%2C%20hadoop&cmpt=q
  • 7. Use Cases ● Self-service provisioning of Hadoop clusters ● Utilization of unused compute capacity for bursty workloads ● Run Hadoop workloads in a few clicks without expertise in Hadoop ops
  • 8. Architecture Overview (diagram; labels only) ● OpenStack services: Keystone, Horizon, Nova, Glance, Swift, Cinder, Neutron, Heat, Trove ● Savanna components: Auth, Savanna Pages, REST API, Savanna Python Client, Cluster Configuration Manager, Job Manager, Job Sources, Data Sources, Vendor Plugins, Resources Orchestration Manager, Data Access Layer, DB ● Provisioned resources: Hadoop VMs
  • 9. Savanna Status ● Official incubated OpenStack project ● v0.3 released 17 Oct 2013 ● Supported Hadoop distros: ○ Vanilla Apache Hadoop (reference implementation) ○ Hortonworks Data Platform 1.3.x ○ Intel Distribution on review ○ Cloudera Distribution in blueprint ● Included in OpenStack distros: ○ RDO - http://openstack.redhat.com ○ Mirantis OpenStack - http://software.mirantis.com
  • 10. Cluster Provisioning Performance
  • 12. EDP Overview ● End users have data and questions ○ The data lives in a data repository ○ The questions are embodied in code ● Savanna Elastic Data Processing (EDP) brings the Hadoop ecosystem to the end user ○ Hides all cluster management behind the scenes
  • 13. EDP “Customers launch millions of Amazon EMR clusters every year.” http://aws.amazon.com/elasticmapreduce/
  • 14. EDP ● The variety and depth of value-add offerings on top of clouds are growing ● Offerings are rarely open and rarely allow for choice ● Examples - Google Cloud, Azure, AWS
  • 15. EDP Savanna and EDP can both match and exceed use cases provided by most public clouds
  • 16. EDP in Savanna v0.3 ● UI, integrated into Horizon, for ad-hoc analytics queries based on Hive or Pig ● API to execute MapReduce jobs without exposing details of underlying infrastructure ● Pluggable data sources: Swift ● Supported job types: Jar, Pig, Hive ● Integration with Oozie for workflow management
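The job types and data source listed on slide 16 can be pictured as a small request object that is checked before a job is submitted. This is only a minimal sketch: `JobRequest`, `validate`, its field names, and the `swift://` URL scheme are hypothetical and are not taken from Savanna's actual API or schema.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Job types listed on the slide for Savanna v0.3.
SUPPORTED_JOB_TYPES = {"Jar", "Pig", "Hive"}
# The slide lists Swift as the pluggable data source; the "swift" URL
# scheme used here is an assumption made for illustration.
SUPPORTED_SCHEMES = {"swift"}


@dataclass(frozen=True)
class JobRequest:
    """Hypothetical description of an EDP job; not Savanna's real schema."""
    job_type: str
    main_artifact: str  # location of the jar, Pig script, or Hive query
    input_url: str
    output_url: str


def validate(job: JobRequest) -> list[str]:
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    if job.job_type not in SUPPORTED_JOB_TYPES:
        problems.append(f"unsupported job type: {job.job_type}")
    for label, url in (("input", job.input_url), ("output", job.output_url)):
        scheme = urlparse(url).scheme
        if scheme not in SUPPORTED_SCHEMES:
            problems.append(f"unsupported {label} data source: {scheme or 'none'}")
    if job.input_url == job.output_url:
        problems.append("input and output must differ")
    return problems
```

A caller would build a `JobRequest`, run `validate`, and only hand the job to the workflow engine when the returned list is empty. Keeping validation separate from submission matches the slide's point that the end user should not see infrastructure details.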
  • 18. Cluster Ops in Savanna 0.3 ● REST API ● Configuration templates ● Manual cluster scaling ● Data node anti-affinity and location control ● Full support of data locality - rack and 4-level awareness for HDFS and Swift ● Swift integration
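Data node anti-affinity, as named on slide 18, means that two data nodes should not land on the same physical host, so losing one host does not take out several HDFS replicas. The sketch below illustrates the idea with a simple greedy placement. The function name, the role strings, and the greedy strategy are all assumptions for illustration; the slides do not describe how Savanna actually implements placement.

```python
from collections import defaultdict


def place_with_anti_affinity(instances, hosts, anti_affinity_roles):
    """Assign each (name, role) instance to a host.

    Instances whose role is in anti_affinity_roles never share a host with
    another instance of the same role. Raises ValueError when that is
    impossible with the hosts given.
    """
    roles_on_host = defaultdict(set)    # host -> anti-affinity roles placed there
    load = {host: 0 for host in hosts}  # host -> number of instances placed
    placement = {}
    for name, role in instances:
        candidates = [
            h for h in hosts
            if role not in anti_affinity_roles or role not in roles_on_host[h]
        ]
        if not candidates:
            raise ValueError(f"no host left for {name} ({role})")
        # Prefer the least loaded host; ties resolve to the earliest host.
        host = min(candidates, key=lambda h: load[h])
        placement[name] = host
        load[host] += 1
        if role in anti_affinity_roles:
            roles_on_host[host].add(role)
    return placement
```

For example, with two hosts and `{"datanode"}` as the anti-affinity set, two data nodes are spread across both hosts, while a third data node raises `ValueError` because no eligible host remains.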
  • 19. OpenStack Integration in Savanna 0.3 ● OpenStack Dashboard plugin ● Both Neutron and Nova Network support ● Keystone trusts used for async operations ● Python client
  • 21. Live Demo
  • 22. Icehouse Roadmap ● Integration with OpenStack ecosystem ○ Heat ○ Tempest ○ Devstack ○ Ceilometer ○ Ironic ● EDP enhancements ● Code hardening ● Polished API v2 ● Performance testing
  • 23. Design Summit Sessions Friday, November 8 ● 1:30pm Network and installation topologies ● 2:20pm Heat integration and scalability ● 3:10pm Further OpenStack integration ● 4:10pm Savanna in Icehouse http://goo.gl/2iEv8u
  • 24. Q&A