Big Data in the Cloud: An example using Ansible, R, RHadoop, and AppScale to deploy a big data environment on AWS/Eucalyptus

Published in Technology

Transcript

  • 1. Big Data in the Clouds: An example using Ansible, R, RHadoop, and AppScale to deploy a big data environment on AWS/Eucalyptus
  • 2. Big Data Environment
      ● Why? Why R? Why AppScale? Why AWS/Eucalyptus?
      ● Environments needing to process "big data" are in high demand
      ● Flexibility in deploying big data environments - AWS has Elastic MapReduce; Eucalyptus has ?
  • 3. Goals
      ● Deploy open source big data environment on IaaS
      ● Same deployment method can be used on both public and private IaaS (hybrid?)
  • 4. The Architecture
  • 5. Ansible
      ● http://www.ansibleworks.com/
      ● Open source configuration management using SSH
      ● Flexible, powerful, efficient, secure
      ● http://ansible.cc/docs/
  • 6. R and RHadoop
      ● http://www.r-project.org/
          ● Open source statistics software; very flexible and powerful
      ● http://www.revolutionanalytics.com/
          ● Provides enterprise analytics software using R
      ● https://github.com/RevolutionAnalytics/RHadoop/wiki
  • 7. AppScale
      ● http://www.appscale.com
      ● PaaS that implements Google App Engine APIs on different public/private IaaS and virtual environments
      ● http://www.slideshare.net/shatteredNirvana/intro-to-app-engine-and-appscale
      ● Ships with Cloudera for back-end support of the Google App Engine MapReduce API implementation
  • 8. AWS EC2/Eucalyptus
      ● http://aws.amazon.com
      ● Cloud API that has pretty much become a standard
      ● http://www.eucalyptus.com
      ● Closely follows AWS APIs for EC2, S3, IAM (soon ELB, CloudWatch, and AutoScaling)
  • 9. Deployment
  • 10. AWS/Eucalyptus
      ● Account/User Credentials
          ● EC2_ACCESS_KEY
          ● EC2_SECRET_KEY
          ● EC2_URL
      ● IAM policy allowing the EC2 actions needed to launch instances, create security groups, authorize ports, and manage images (bundle, upload, and register)
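A deployment like this fails late and confusingly if one of the three credential variables is missing, so it can help to check them up front. The following is a minimal sketch, not part of the original slides: `check_ec2_env` is a hypothetical helper name, and it only checks that the variables are non-empty, not that the credentials are valid.

```shell
# Sketch only: check_ec2_env is a hypothetical helper, not part of any tool
# named on the slides. It verifies that the three variables listed on
# slide 10 are set to non-empty values.
check_ec2_env() {
  missing=0
  for name in EC2_ACCESS_KEY EC2_SECRET_KEY EC2_URL; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Exit status 0 when all three are set, 1 otherwise.
  return "$missing"
}
```

One way to use it is `check_ec2_env && echo "credentials present"` after exporting the three variables with your own values.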
  • 11. AppScale
      ● Pre-built AppScale images
          ● AWS - ami-4e472227
          ● Eucalyptus - AppScale image found @ http://emis-catalog.s3.amazonaws.com/index.html
      ● appscale-tools - https://github.com/AppScale/appscale-tools
          ● appscale init cloud
          ● edit AppScalefile
          ● appscale up
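The three appscale-tools steps on slide 11 can be strung together as a small script. This is a rough sketch under stated assumptions: `run_step` and `deploy_appscale` are hypothetical helper names, the script defaults to a dry run that only prints each command, and it assumes appscale-tools is installed and that the configuration file sits in the current directory.

```shell
# Sketch only: run_step and deploy_appscale are hypothetical helpers.
# run_step prints a command and executes it only when DRY_RUN is not "1".
# DRY_RUN defaults to "1", so nothing is launched by accident.
run_step() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" != "1" ]; then
    "$@"
  fi
}

# The three steps from slide 11, in order.
deploy_appscale() {
  run_step appscale init cloud            # create the configuration file
  run_step "${EDITOR:-vi}" AppScalefile   # fill in image, keys, and node counts by hand
  run_step appscale up                    # start the deployment
}
```

Calling `deploy_appscale` as-is prints the plan; `DRY_RUN=0 deploy_appscale` would actually run the commands against whichever cloud the credentials point to.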
  • 12. Ansible, R, RHadoop
      ● Use git to grab Ansible playbook - https://github.com/hspencer77/ansible-r-appscale-playbook
      ● Playbook installs R, and grabs rhdfs and rmr2 from RHadoop
          ● https://github.com/downloads/RevolutionAnalytics/RHadoop/rhdfs_1.0.5.tar.gz
          ● https://github.com/downloads/RevolutionAnalytics/RHadoop/rmr2_2.0.2.tar.gz
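To make the slide 12 steps concrete, here is a hedged sketch of fetching the playbook and the two RHadoop tarballs by hand. `fetch_playbook`, `tarball_url`, and `fetch_rhadoop` are hypothetical helper names, the URLs are copied from the slide and may no longer resolve, and the slides do not name the playbook file, so the `ansible-playbook` invocation itself is left out.

```shell
# Sketch only: all function names here are hypothetical helpers.
# Base URL and versions are taken from slide 12.
RHADOOP_BASE="https://github.com/downloads/RevolutionAnalytics/RHadoop"

# fetch_playbook clones the playbook repository (needs git and network access).
fetch_playbook() {
  git clone https://github.com/hspencer77/ansible-r-appscale-playbook
}

# tarball_url NAME VERSION prints the download URL for one RHadoop package.
tarball_url() {
  echo "${RHADOOP_BASE}/${1}_${2}.tar.gz"
}

# fetch_rhadoop downloads both tarballs into the current directory
# (needs curl and network access).
fetch_rhadoop() {
  for spec in "rhdfs 1.0.5" "rmr2 2.0.2"; do
    # Split "name version" into positional parameters $1 and $2.
    set -- $spec
    curl -fsSLO "$(tarball_url "$1" "$2")" || return 1
  done
}
```

The playbook performs the equivalent of `fetch_rhadoop` on each node; doing it manually is mainly useful for checking that the tarballs are still reachable.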
  • 13. Test - Wordcount.R
      ● Test the deployment using a wordcount program written in R - wordcount.R
      ● SSH into head node, pull out wordcount.R file - tar zxf rmr2_2.0.2.tar.gz rmr2/tests/wordcount.R
      ● Execute it - Rscript rmr2/tests/wordcount.R
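The two commands on slide 13 can be wrapped as follows. This is a small sketch, with `extract_wordcount` and `run_wordcount` as hypothetical helper names; it assumes the rmr2 tarball is already on the head node and that R, rmr2, and Hadoop are installed there.

```shell
# Sketch only: both function names are hypothetical helpers.

# extract_wordcount TARBALL pulls just rmr2/tests/wordcount.R out of the
# archive into the current directory, leaving the rest of the package packed.
extract_wordcount() {
  tar zxf "$1" rmr2/tests/wordcount.R
}

# run_wordcount executes the extracted test script
# (needs Rscript, rmr2, and a working Hadoop setup on the node).
run_wordcount() {
  Rscript rmr2/tests/wordcount.R
}
```

On the head node this would be `extract_wordcount rmr2_2.0.2.tar.gz && run_wordcount`. Naming the member path after the archive is what keeps tar from unpacking the whole package.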
  • 14. Results
  • 15. Contact Info
      ● AppScale - hannah@appscale.com
      ● Eucalyptus - harold.spencer.jr@eucalyptus.com
  • 16. Questions? Demo