Install Sqoop on Amazon EMR (Elastic Map Reduce)

Slides used in the Video "Sqoop on EMR" - https://www.youtube.com/watch?v=3YJwDJOyDE0

Published in: Technology

  1. 1. Installing Sqoop on AWS Elastic Map Reduce, by Rohit Ghatol, Director of Engineering @ Synerzip. http://www.linkedin.com/in/rohitghatol, @rohitghatol, http://rohitghatol.com
  2. 2. Software Stack: Apache Sqoop, Amazon EMR
  3. 3. Step 1 – Set S3 Buckets
      synerzip-sqoop-scripts (S3 bucket with Sqoop scripts)
        • install-sqoop.sh
        • sqoop-import-all.sh
        • mysql-connector-java-5.1.33.tar.gz
        • sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
      synerzip-emr-logs (S3 bucket with EMR logs)
        • j-2SL51VFFUEVZT/
          • daemons
          • node
          • steps
      synerzip-imported-data (S3 bucket with Sqoop imported data)
        • User_Profile-12-12-12_10:10:10
          • part-m-00000
          • part-m-00001
          • part-m-00002
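The bucket layout on slide 3 could be prepared from a workstation with the AWS CLI, roughly as sketched here. This is not part of the original slides: it assumes the `aws` command is installed and configured, that the four files sit in the current directory, and that the region is us-west-1 (inferred from the URLs on later slides). The `run` and `setup_buckets` functions and the `DRY_RUN` switch are hypothetical names used for illustration.

```shell
#!/bin/bash
# Sketch: create the three buckets from slide 3 and upload the Sqoop files.
# With DRY_RUN=1 (the default here) the commands are only printed, not executed.
REGION="us-west-1"   # assumption, inferred from the RDS and script-runner URLs
BUCKETS="synerzip-sqoop-scripts synerzip-emr-logs synerzip-imported-data"
FILES="install-sqoop.sh sqoop-import-all.sh mysql-connector-java-5.1.33.tar.gz sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz"

# Print the command in dry-run mode, execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_buckets() {
  local b f
  # One bucket each for scripts, EMR logs, and imported data.
  for b in $BUCKETS; do
    run aws s3 mb "s3://$b" --region "$REGION"
  done
  # Upload the two scripts and the two archives to the scripts bucket.
  for f in $FILES; do
    run aws s3 cp "$f" "s3://synerzip-sqoop-scripts/$f"
  done
}

setup_buckets
```

Running the script as is prints seven `aws s3` commands; setting `DRY_RUN=0` would execute them instead. S3 bucket names are global, so a different account would need its own names.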
  4. 4. S3 Buckets
  5. 5. install-sqoop.sh
      #!/bin/bash
      cd /home/hadoop
      # Fetch and unpack Sqoop from the scripts bucket
      hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
      tar -xzf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
      # Fetch and unpack the MySQL JDBC connector
      hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/mysql-connector-java-5.1.33.tar.gz mysql-connector-java-5.1.33.tar.gz
      tar -xzf mysql-connector-java-5.1.33.tar.gz
      # Put the connector jar on Sqoop's classpath
      cp mysql-connector-java-5.1.33/mysql-connector-java-5.1.33-bin.jar sqoop-1.4.4.bin__hadoop-2.0.4-alpha/lib/
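A quick way to confirm that the install script from slide 5 did its job is to check for the two files it should leave behind. The sketch below is an addition, not from the slides; `check_sqoop_install` is a hypothetical helper, and the default path simply mirrors the directory names used in the script.

```shell
#!/bin/bash
# Sketch: verify the layout that install-sqoop.sh is expected to produce.
check_sqoop_install() {
  # Default location assumed from the script: unpacked under /home/hadoop.
  local sqoop_home="${1:-/home/hadoop/sqoop-1.4.4.bin__hadoop-2.0.4-alpha}"
  local jar="$sqoop_home/lib/mysql-connector-java-5.1.33-bin.jar"
  if [ ! -f "$sqoop_home/bin/sqoop" ]; then
    echo "missing: $sqoop_home/bin/sqoop" >&2
    return 1
  fi
  if [ ! -f "$jar" ]; then
    echo "missing: $jar" >&2
    return 1
  fi
  echo "ok: Sqoop and the MySQL connector are in place"
}
```

On the master node, `check_sqoop_install` with no argument checks the default path; a non-zero exit status means one of the two files is missing.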
  6. 6. sqoop-import-all.sh
      #!/bin/bash
      cd /home/hadoop/sqoop-1.4.4.bin__hadoop-2.0.4-alpha/bin
      # Import the User_Profile table into a timestamped folder on S3
      ./sqoop import --connect jdbc:mysql://db.c5zzejm1gdnx.us-west-1.rds.amazonaws.com/test --username root --password password --table User_Profile --target-dir s3://synerzip-imported-data/User_Profile-`date +"%m-%d-%y_%T"`
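The `date` expression in the import script on slide 6 is what gives every run its own folder. A minimal sketch of that naming scheme in isolation follows; `target_dir` is a hypothetical helper, not something the slides define.

```shell
#!/bin/bash
# Sketch: build the timestamped --target-dir used by the import script.
target_dir() {
  local table="$1"
  local bucket="${2:-synerzip-imported-data}"
  # %m-%d-%y_%T expands to month-day-year_HH:MM:SS, e.g. 12-12-12_10:10:10
  echo "s3://$bucket/$table-$(date +"%m-%d-%y_%T")"
}

target_dir User_Profile
```

Because the timestamp has one-second resolution, two imports of the same table started within the same second would be given the same target folder.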
  7. 7. Step 2 – MySQL Database
  8. 8. User_Profile Table
  9. 9. Step 3 – Start EMR Cluster
      Install-Sqoop Step: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar s3://synerzip-sqoop-scripts/install-sqoop.sh
      Import Sqoop Step: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar s3://synerzip-sqoop-scripts/sqoop-import-all.sh
  10. 10. Install Sqoop Step. Jar location: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar
  11. 11. Import Sqoop Step. Jar location: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar
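The slides configure both steps in the EMR console. The same two steps could probably be submitted from the command line with `aws emr add-steps`, as sketched here. This is an assumption-laden illustration rather than the presenter's method: it assumes a cluster is already running, that the import script is stored under the name listed in the scripts bucket on slide 3 (`sqoop-import-all.sh`), and that `CANCEL_AND_WAIT` is an acceptable failure policy. `script_step` and `submit_steps` are hypothetical helper names.

```shell
#!/bin/bash
# Sketch: express the two EMR steps as AWS CLI step definitions.
SCRIPT_RUNNER="s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar"
SCRIPTS_BUCKET="s3://synerzip-sqoop-scripts"

# Print one step in the CLI shorthand syntax:
# script-runner.jar is the custom jar, the shell script is its only argument.
script_step() {
  local name="$1" script="$2"
  echo "Type=CUSTOM_JAR,Name=$name,ActionOnFailure=CANCEL_AND_WAIT,Jar=$SCRIPT_RUNNER,Args=[$SCRIPTS_BUCKET/$script]"
}

# Add both steps, in order, to a running cluster. The caller supplies the
# cluster id. Defined here but not invoked.
submit_steps() {
  local cluster_id="$1"
  aws emr add-steps --cluster-id "$cluster_id" --steps \
    "$(script_step Install-Sqoop install-sqoop.sh)" \
    "$(script_step Import-Sqoop sqoop-import-all.sh)"
}

# Show the two step definitions without contacting AWS.
script_step Install-Sqoop install-sqoop.sh
script_step Import-Sqoop sqoop-import-all.sh
```

Calling `submit_steps` with the id of a running cluster would queue the install step first and the import step second, matching the order shown on the slides.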
  12. 12. EMR Steps
  13. 13. Step 4 – See Imported Data
  14. 14. part-m-00000
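Slides 13 and 14 inspect the imported data in the S3 console. As a rough command-line alternative, the part files can be picked out of a recursive bucket listing. The sketch below is an addition under stated assumptions: it relies on `aws s3 ls --recursive` printing date, time, size, and key on each line, and `list_part_files` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: pick the Sqoop part files out of an `aws s3 ls --recursive` listing.
# Each listing line is assumed to have the form: DATE TIME SIZE KEY
list_part_files() {
  # Keep only keys whose last path component looks like part-m-00000.
  awk '$4 ~ /(^|\/)part-m-[0-9]+$/ { print $4 }'
}

# Usage against the real bucket (needs AWS credentials), left commented out:
#   aws s3 ls s3://synerzip-imported-data/ --recursive | list_part_files
#   aws s3 cp "s3://synerzip-imported-data/<key>" -    # print one part file to stdout
```

The helper reads the listing from standard input, so it can be tried on saved listing output without touching the bucket.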
