Apache Hadoop 2 Installation

Apache Hadoop 2 Installation in Pseudo Mode

Download URLs
1. Hadoop: https://archive.apache.org/dist/hadoop/core/stable/
2. Hive: http://people.apache.org/~hashutosh/hive-0.10.0-rc0/
3. Pig: http://ftp.udc.es/apache/pig/pig-0.12.0/
4. HBase: http://archive.apache.org/dist/hbase/hbase-0.94.10/

(Note: the Hive and HBase links above point to 0.10.0-rc0 and 0.94.10, while the steps below install hive-0.11.0 and hbase-0.96.0-hadoop2; download the versions the steps actually use.)

Step 1: Generate an ssh key
$ ssh-keygen -t rsa -P ""

Step 2: Copy id_rsa.pub to authorized_keys
$ cd .ssh
$ cp id_rsa.pub authorized_keys
$ chmod 644 authorized_keys

Step 3: Verify passwordless ssh to localhost
$ cd ~
$ ssh localhost
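If the key setup worked, ssh should log you into localhost without a password prompt. A quick non-interactive check (a minimal sketch; the echoed hostname varies by machine):

$ ssh -o BatchMode=yes localhost 'echo passwordless ssh OK on $(hostname)'
# BatchMode=yes makes ssh fail immediately instead of prompting, so a
# misconfigured key shows up as an error rather than a password prompt.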
Step 4: Untar the Hadoop tarball
$ tar -xvzf hadoop-2.2.0.tar.gz

Step 5: Edit the configuration files
$ cd hadoop-2.2.0/etc/hadoop/
$ vim core-site.xml

Add the following properties to core-site.xml (fs.defaultFS should point at the host running the NameNode; with no port given, the HDFS default of 8020 is used):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://172.17.196.14</value>
</property>
<property>
  <name>io.native.lib.available</name>
  <value>true</value>
</property>

$ vim hdfs-site.xml

Add the following properties to hdfs-site.xml:

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hadoop/hadoop-2.2.0/pseudo/dfs/data</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hadoop/hadoop-2.2.0/pseudo/dfs/name</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
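These values are only picked up once HADOOP_CONF_DIR points at this directory (Step 6 below). After that, a minimal check that Hadoop reads them rather than its built-in defaults:

$ hdfs getconf -confKey fs.defaultFS       # expect hdfs://172.17.196.14
$ hdfs getconf -confKey dfs.replication    # expect 1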
$ vim mapred-site.xml

Add the following properties to mapred-site.xml:

<property>
  <name>mapreduce.cluster.temp.dir</name>
  <value>/home/hadoop/hadoop-2.2.0/temp</value>
  <final>true</final>
</property>
<property>
  <name>mapreduce.cluster.local.dir</name>
  <value>/home/hadoop/hadoop-2.2.0/local</value>
  <final>true</final>
</property>
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

$ vim yarn-site.xml

Add the following properties to yarn-site.xml:

<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>localhost:6000</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>localhost:6001</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>localhost:6002</value>
</property>
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/home/hadoop/hadoop-2.2.0/yarn_nodemanager</value>
</property>
<property>
  <name>yarn.nodemanager.address</name>
  <value>0.0.0.0:6003</value>
</property>
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>10240</value>
</property>
<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/home/hadoop/hadoop-2.2.0/app-logs</value>
</property>
<property>
  <name>yarn.nodemanager.log-dirs</name>
  <value>/home/hadoop/hadoop-2.2.0/logs</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
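Once the daemons are running (Step 9 below), you can confirm the NodeManager registered with the ResourceManager at the addresses configured above. A minimal check:

$ yarn node -list    # expect one node in RUNNING state, reported at the
                     # yarn.nodemanager.address port (6003)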
$ vim slaves

Add localhost to the slaves file.

Step 6: Set up .bashrc
$ cd ~
$ vim .bashrc

Add the following exports (substitute your own home directory consistently; this guide mixes /home/hadoop and /home/ahadoop2):

export JAVA_HOME=/usr/
export HADOOP_HOME=/home/ahadoop2/hadoop-2.2.0
export HADOOP_CONF_DIR=/home/ahadoop2/hadoop-2.2.0/etc/hadoop
export PIG_HOME=/home/ahadoop2/pig-0.12.0
export HBASE_HOME=/home/ahadoop2/hbase-0.96.0-hadoop2
export HIVE_HOME=/home/ahadoop2/hive-0.11.0
export PIG_CLASSPATH=/home/ahadoop2/hadoop-2.2.0/etc/hadoop
export CLASSPATH=$PIG_HOME/pig-withouthadoop.jar:\
$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar:\
$HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar:\
$HBASE_HOME/lib/hbase-client-0.96.0-hadoop2.jar:\
$HBASE_HOME/lib/hbase-common-0.96.0-hadoop2.jar:\
$HBASE_HOME/lib/hbase-server-0.96.0-hadoop2.jar:\
$HBASE_HOME/lib/commons-httpclient-3.1.jar:\
$HBASE_HOME/lib/commons-collections-3.2.1.jar:\
$HBASE_HOME/lib/commons-lang-2.6.jar:\
$HBASE_HOME/lib/jackson-mapper-asl-1.8.8.jar:\
$HBASE_HOME/lib/jackson-core-asl-1.8.8.jar:\
$HBASE_HOME/lib/guava-12.0.1.jar:\
$HBASE_HOME/lib/protobuf-java-2.5.0.jar:\
$HBASE_HOME/lib/commons-codec-1.7.jar:\
$HBASE_HOME/lib/zookeeper-3.4.5.jar:\
$HIVE_HOME/lib/hive-jdbc-0.11.0.jar:\
$HIVE_HOME/lib/hive-metastore-0.11.0.jar:\
$HIVE_HOME/lib/hive-serde-0.11.0.jar:\
$HIVE_HOME/lib/hive-common-0.11.0.jar:\
$HIVE_HOME/lib/hive-service-0.11.0.jar:\
$HIVE_HOME/lib/libfb303-0.9.0.jar:\
$HIVE_HOME/lib/postgresql-9.2-1003.jdbc3.jar:\
$HIVE_HOME/lib/libthrift-0.9.0.jar:\
$HIVE_HOME/lib/slf4j-api-1.6.1.jar:\
$HIVE_HOME/lib/commons-logging-1.0.4.jar:\
/home/ahadoop2/Hadoop2Training.jar
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PIG_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:/usr/lib64/qt-3.3/bin:/usr/kerberos/sbin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin
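After reloading .bashrc (Step 7 below), a quick way to confirm the environment is wired up (a minimal check):

$ echo $HADOOP_HOME       # expect /home/ahadoop2/hadoop-2.2.0
$ hadoop version          # expect Hadoop 2.2.0
$ which pig hbase hive    # should all resolve under the directories exported above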
Step 7: Load .bashrc
$ cd ~
$ . .bashrc

Step 8: Format the NameNode
$ cd ~
$ hdfs namenode -format

(The older "hadoop namenode -format" still works but is deprecated in Hadoop 2.)

Step 9: Start the cluster
$ cd ~/hadoop-2.2.0/sbin
$ ./start-all.sh

(start-all.sh is also deprecated in Hadoop 2; it simply runs start-dfs.sh followed by start-yarn.sh, and either form works here.)

To view the started daemons:
$ jps

This should show the started daemons:
NameNode
DataNode
SecondaryNameNode
NodeManager
ResourceManager
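Before moving on to HBase, it is worth confirming that HDFS actually came up healthy. A minimal check (web UI ports are the Hadoop 2.2 defaults):

$ hdfs dfsadmin -report          # expect one live DataNode
$ hdfs dfs -mkdir -p /tmp/smoketest
$ hdfs dfs -ls /tmp              # the new directory should be listed

The NameNode web UI is at http://localhost:50070 and the ResourceManager web UI at http://localhost:8088.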
Apache HBase Installation in Pseudo Mode

Step 1: Untar the tarball
$ tar -xvzf hbase-0.96.0-hadoop2.tar.gz

Step 2: Edit the configuration files
$ cd hbase-0.96.0-hadoop2/conf
$ vim hbase-site.xml

Add the following properties to hbase-site.xml (hbase.rootdir must match the fs.defaultFS host and port from core-site.xml; HDFS listens on port 8020 by default):

<property>
  <name>hbase.rootdir</name>
  <value>hdfs://localhost:8020/hbase</value>
  <description>The directory shared by RegionServers</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>

$ vim regionservers

Add localhost to the regionservers file.

Step 3: Copy the Hadoop common jar from the Hadoop directory into the HBase lib directory
$ cd /home/hadoop/hadoop-2.2.0/share/hadoop/common/
$ cp hadoop-common-2.2.0.jar /home/hadoop/hbase-0.96.0-hadoop2/lib/

Step 4: Start HBase
$ cd ~
$ start-hbase.sh

Step 5: View the started daemons
$ jps
HMaster
HRegionServer
HQuorumPeer

Step 6: Open the HBase shell
$ hbase shell
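A quick smoke test from the HBase shell (a minimal sketch; 'testtable' and 'cf' are hypothetical names used only for this check):

hbase(main):001:0> create 'testtable', 'cf'
hbase(main):002:0> put 'testtable', 'row1', 'cf:msg', 'hello'
hbase(main):003:0> scan 'testtable'
hbase(main):004:0> disable 'testtable'
hbase(main):005:0> drop 'testtable'

A successful scan confirms HBase can write to its hbase.rootdir on HDFS.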
Step 7: Before connecting to HBase from Java, start the HBase REST service:
$ hbase-daemon.sh start rest -p 8090

Apache Hive Installation

Step 1: Untar the tarball
$ tar -xvzf hive-0.11.0.tar.gz

Step 2: Configure a remote PostgreSQL database for the Hive metastore

Before you can run the Hive metastore against a remote PostgreSQL database, you must configure a connector to the remote PostgreSQL database, set up the initial database schema, and configure a PostgreSQL user account for the Hive user.

Install and start PostgreSQL if you have not already done so. Edit postgresql.conf and set the listen_addresses property to '*' so the server accepts connections from other hosts, then configure authentication for your network in pg_hba.conf by adding a line for the hosts that will connect to the metastore, as sketched below.
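A minimal sketch of the two PostgreSQL edits (the 172.17.196.0/24 network below is an assumption based on the NameNode address used earlier; substitute your own network and tighten the range as appropriate):

# in postgresql.conf: accept connections on all interfaces
listen_addresses = '*'

# in pg_hba.conf: let hiveuser reach the metastore database with md5 password auth
host    metastore    hiveuser    172.17.196.0/24    md5

Restart (or reload) PostgreSQL after editing so the changes take effect.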
Start the PostgreSQL server:
$ su postgres
$ cd $postgres_home/bin
$ ./pg_ctl start -D path_to_data_dir

Install the PostgreSQL JDBC driver: copy the postgresql-jdbc jar into $HIVE_HOME/lib/.

Create the metastore database and user account. Proceed as in the following example:

$ sudo -u postgres psql
postgres=# CREATE USER hiveuser WITH PASSWORD 'mypassword';
postgres=# CREATE DATABASE metastore;
postgres=# \q

$ psql -U hiveuser -d metastore
You are now connected to database 'metastore' as user 'hiveuser'.
metastore=# \i /home/hadoop/hive-0.11.0/scripts/metastore/upgrade/postgres/hive-schema-0.10.0.postgres.sql

Step 3: Edit the configuration files
$ cd hive-0.11.0/conf
$ vim hive-site.xml

(Replace <postgresql instance ip> and <namenode ip> below with your actual hosts.)

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://<postgresql instance ip>:5432/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>mypassword</value>
  </property>
  <property>
    <name>datanucleus.autoCreateSchema</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://<namenode ip>:9083</value>
    <description>IP address (or fully-qualified domain name) and port of the metastore host</description>
  </property>
  <property>
    <name>datanucleus.autoStartMechanism</name>
    <value>SchemaTable</value>
  </property>
</configuration>
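Before starting the metastore service, it is worth confirming the schema script actually populated the database. A minimal check (table names come from the Hive schema script; expect entries such as "TBLS" and "DBS"):

$ psql -U hiveuser -d metastore -c '\dt'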
Step 4: Start the Hive metastore
$ hive --service metastore

Step 5: Open the Hive console
$ hive
hive> show tables;
OK

Step 6: Before connecting to Hive from Java, start HiveServer:
$ hive --service hiveserver

Apache Pig Installation

Step 1: Untar the tarball
$ tar -xvzf pig-0.12.0.tar.gz

Step 2: Delete the two jars (pig.jar and pig-withouthadoop.jar) from the Pig home directory and add the Hadoop-2-compatible pig-withouthadoop.jar to the Pig installation directory (uploaded in Knowmax at the same path).

Step 3: Open the Pig grunt shell
$ pig
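With the cluster running, a quick end-to-end smoke test from the grunt shell (a minimal sketch; /tmp/passwd is a hypothetical input file copied into HDFS first):

$ hdfs dfs -put /etc/passwd /tmp/passwd
$ pig
grunt> users = LOAD '/tmp/passwd' USING PigStorage(':') AS (name:chararray);
grunt> firstfew = LIMIT users 5;
grunt> DUMP firstfew;

DUMP launches a MapReduce job on YARN, so a successful dump confirms Pig, HDFS, and YARN are wired together correctly.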
