Apache Hadoop & Hive installation with movie rating exercise


  1. Hadoop & Hive Installation With Exercise
     by Shiva Dasharathi, shivaramakrishnad@mail.com
  2. Contents
     Section-1:
       • Prerequisites
       • Set environment variables
       • Configure Hadoop for pseudo-distributed mode
       • Install SSH
       • Format HDFS & start the Hadoop cluster
     Section-2:
       • Start the Hive metastore
     Section-3:
       • Sample data (user, movie, rating) format
       • Exercise
  3. Section-1 > Prerequisites:
     1. A Unix environment; use Cygwin if you are on Windows.
     2. Java: JDK 1.6 or above.
     3. Hadoop <hadoop-1.0.4.tar.gz>
        link: http://archive.apache.org/dist/hadoop/core/hadoop-1.0.4/hadoop-1.0.4.tar.gz
     4. Hive <hive-0.12.0.tar.gz>
        link: http://download.nextag.com/apache/hive/hive-0.12.0/hive-0.12.0.tar.gz
     Say you downloaded the tars into /home/hadoop/.
     To untar hadoop-1.0.4.tar.gz and hive-0.12.0.tar.gz, go to a terminal and run:
     $cd /home/hadoop/
     $tar xzf hadoop-1.0.4.tar.gz
     $tar xzf hive-0.12.0.tar.gz
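     As a quick sanity check before continuing (a sketch, assuming the JDK is already on your PATH and the tars extracted into /home/hadoop/):
     $java -version                    # should report 1.6 or above
     $ls /home/hadoop/hadoop-1.0.4     # the extracted Hadoop directory
     $ls /home/hadoop/hive-0.12.0      # the extracted Hive directory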
  4. Section-1 > Set environment variables:
     Update your .bashrc file with the variables below:
     export JAVA_HOME=<your-jdk-location-which-has-a-bin-folder-in-it>
     export HADOOP_HOME=<your-hadoop-distribution-base-directory>
     export HIVE_HOME=<your-hive-installation-base-directory>
     export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin
     Example:
     export JAVA_HOME=/usr/lib/jvm/jdk_1.6
     export HADOOP_HOME=/home/hadoop/hadoop-1.0.4
     export HIVE_HOME=/home/hadoop/hive-0.12.0
     export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin
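     After editing .bashrc, reload it in the current shell and verify the variables took effect (a quick check, assuming the example paths above):
     $source ~/.bashrc
     $echo $HADOOP_HOME       # should print /home/hadoop/hadoop-1.0.4
     $hadoop version          # confirms the hadoop binary is on the PATH
     $which hive              # confirms the hive binary is on the PATH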
  5. Section-1 > Configure Hadoop for pseudo-distributed mode:
     Go to the $HADOOP_HOME/conf/ folder and add the properties below to the respective files.
     core-site.xml:
     <property>
       <name>fs.default.name</name>
       <value>hdfs://localhost:9100</value>
     </property>
     hdfs-site.xml:
     <property>
       <name>dfs.replication</name>
       <value>1</value>
     </property>
     mapred-site.xml:
     <property>
       <name>mapred.job.tracker</name>
       <value>localhost:9001</value>
     </property>
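     Each property must sit inside the file's <configuration> element; as a sketch, a complete core-site.xml for this setup would look roughly like:
     <?xml version="1.0"?>
     <configuration>
       <property>
         <name>fs.default.name</name>
         <value>hdfs://localhost:9100</value>
       </property>
     </configuration>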
  6. Section-1 > Install SSH
     Step 1: check whether ssh is installed on your machine:
     $ssh localhost
     should work. Otherwise, install ssh:
     $sudo apt-get install ssh
     Create a .ssh folder in /home/hadoop/:
     $mkdir /home/hadoop/.ssh
     Generate an RSA key for the machine:
     $ssh-keygen -t rsa
     Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
     Enter passphrase (empty for no passphrase):
     Enter same passphrase again:
     $cat /home/hadoop/.ssh/id_rsa.pub >> /home/hadoop/.ssh/authorized_keys
     $ssh localhost
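     If ssh localhost still prompts for a password, the usual culprit is directory permissions; a common fix (assuming the /home/hadoop/ paths used above, adjust as needed):
     $chmod 700 /home/hadoop/.ssh
     $chmod 600 /home/hadoop/.ssh/authorized_keys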
  7. Section-1 > Format HDFS & start the Hadoop cluster:
     To format HDFS:
     $hadoop namenode -format
     To start the cluster:
     $start-all.sh
     To test the services:
     $jps
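     On a healthy pseudo-distributed Hadoop 1.x setup, jps would typically list something like the following (process IDs will differ):
     $jps
     4825 NameNode
     4931 DataNode
     5042 SecondaryNameNode
     5121 JobTracker
     5230 TaskTracker
     5317 Jps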
  8. Section-2 > Start the Hive service
     $cd $HIVE_HOME
     $hive
     This will start your Hive metastore service and open the Hive CLI:
     hive>
     Try the command:
     hive> show databases;
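     On a fresh install this should list only the built-in default database; roughly:
     hive> show databases;
     OK
     default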
  9. Section-3 > Sample data
     movie_rating.csv -- user,name,rating
     Dasha,Geethanjali,5
     Dasha,17Again,5
     Buddi,Geethanjali,4.5
     Sam,Blood Diamond,4.5
     Dasha,Apocalypto,4.5
     Dasha,Matrix,3
     Sam,Inception,4
     Buddi,Matrix,5
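     Save those rows into a local file and verify it (the path is an assumption, matching the /home/hadoop/ directory used earlier):
     $cat > /home/hadoop/movie_rating.csv     # paste the rows above, then press Ctrl+D
     $wc -l /home/hadoop/movie_rating.csv     # should report 8 lines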
  10. Section-3 > Exercise
      hive> CREATE TABLE movie_ratings(user string, mname string, rating string)
            ROW FORMAT DELIMITED
            FIELDS TERMINATED BY ','
            LINES TERMINATED BY '\n';
      hive> LOAD DATA LOCAL INPATH '/home/hadoop/movie_rating.csv' INTO TABLE movie_ratings;
      hive> SELECT user, '--', mname, '--', rating FROM movie_ratings;
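      As a follow-up exercise (a sketch, not part of the original deck), the same table can answer questions like the average rating per movie; rating is stored as a string, so cast it before aggregating:
      hive> SELECT mname, AVG(CAST(rating AS double)) AS avg_rating
            FROM movie_ratings
            GROUP BY mname;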
