Hive - II
Installing Hive, internal and external tables, import-export and more.
Installing Hive
$ wget http://www-us.apache.org/dist/hive/hive-1.2.2/apache-hive-1.2.2-bin.tar.gz
#extract
$ tar -zxvf apache-hive-1.2.2-bin.tar.gz
#rename the extracted directory
$ mv apache-hive-1.2.2-bin hive
(#optionally move hive to the folder where we keep the other Apache products for easier
maintenance: $ mv hive /usr/local)
#now specify hive in .bashrc
$ cd ..
$ cd hduser
hduser@localhost$ vi .bashrc
export HIVE_HOME=/home/hduser/hive # or /usr/local/hive; the location where Hive is installed
#update the PATH
export PATH=$PATH:$HADOOP_PREFIX/bin:$SQOOP_HOME/bin:$HIVE_HOME/bin
#update the shell environment with the recent changes:
hduser@localhost$ source ~/.bashrc
To see the list of command-line options, type:
hduser@localhost$ hive --help
To see the list of available services, type:
hduser@localhost$ hive --service help
Now to start Hive, simply type:
 hduser@localhost$ hive
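As a quick sanity check of the installation (a sketch, assuming the paths set in .bashrc above):
hduser@localhost$ echo $HIVE_HOME
hduser@localhost$ hive --version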
Schema
 Hive does its best to read the data according to the table
schema (schema on read). If the HDFS file has more
columns than the columns described while creating the
Hive table, Hive will not take the extra columns; and if the
HDFS file has fewer columns than the columns described,
it will return the missing columns as NULL.
 Also, if a column's declared data type differs from the
original data type, Hive will return NULL values. For
example, describing an integer data type for a string
data type will show NULL values.
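A minimal sketch of this schema-on-read behaviour (the file name, path and values below are
made-up examples, not part of the course data):
$ cat /home/hduser/datasets/schema_demo
1#Alice
2#Bob
hive> create table schema_demo (id int, name string, location string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '#';
hive> load data local INPATH '/home/hduser/datasets/schema_demo' INTO table schema_demo;
hive> select * from schema_demo;
output: 1 Alice NULL (the missing third column is read as NULL)
2 Bob NULL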
Hive Databases
 A database is a namespace or a collection of tables.
 To create a database in Hive use
CREATE DATABASE|SCHEMA [IF NOT EXISTS] <database name>
[IF NOT EXISTS] checks whether a database with the same name already exists.
Example:
hive> CREATE DATABASE IF NOT EXISTS db_1;
Or hive> CREATE SCHEMA db_1;
 To view the list of databases:
hive> SHOW DATABASES;
 We can also check and set the warehouse directory using:
CHECK
hive> SET hive.metastore.warehouse.dir;
output: hive.metastore.warehouse.dir=/user/hive/warehouse
SET
hive> SET hive.metastore.warehouse.dir=/user/hive1/warehouse1;
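A database can also carry a comment and custom properties, which DESCRIBE DATABASE will show.
A small sketch (the name, comment and property below are made-up examples):
hive> CREATE DATABASE IF NOT EXISTS db_demo
COMMENT 'demo database'
WITH DBPROPERTIES ('owner'='hduser');
hive> DESCRIBE DATABASE EXTENDED db_demo;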
 To delete a hive database use:
DROP DATABASE <database_name>
Example:
hive> DROP DATABASE db_1;
Remember: we can't DROP a database if it contains tables; this will throw an error.
So use: hive> DROP DATABASE db_1 CASCADE;
 Now if we want to store a database in a different location/directory, use:
hive> CREATE DATABASE IF NOT EXISTS db_2
LOCATION '/user/cloudera/hive/';
 To use a newly created database:
hive> use db_2;
 Let's create a table to demonstrate DROP and the CASCADE option.
hive> create table tb_1 ( Name STRING, ID INT);
hive> drop database db_2;
-------error -----database is not empty.
hive> drop database db_2 CASCADE;
------no error---database is deleted.
 In real life it is difficult to remember or keep track of the current
database we are working with every time. So it is better to enable
hive.cli.print.current.db = true to display the current working
database in the prompt.
hive> create database db_2;
hive> SET hive.cli.print.current.db;
hive> SET hive.cli.print.current.db = true;
hive(db_2) > quit;
Now, we can set this feature permanently so that each time we run hive
we don’t have to go through the above commands again and again.
#browse to the home directory of the user (hduser)
$ cd /home/hduser
hduser@localhost~$ vi .hiverc
SET hive.cli.print.current.db=true;
Next time when we start Hive, it will look for a file named .hiverc in
our home directory and will execute the commands in it automatically.
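The .hiverc file can hold any number of such start-up commands, one per line. A small sketch of
/home/hduser/.hiverc (the header setting and the default database below are optional, made-up
examples, not required by these slides):
SET hive.cli.print.current.db=true;
SET hive.cli.print.header=true;
USE db_2;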
To create a Hive table
hive> Create table IF NOT EXISTS employee (ID int, name string, location string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '#'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
hive> describe employee;
We can also prefix the database name
where we want to create the table, even if
we are already working in another database.
Example:
hive> use db_3;
hive> create table IF NOT EXISTS db_1.employee (ID int, name string, location string)
Row format delimited
Fields terminated by '\t'
Lines terminated by '\n'
Stored as textfile;
(Resulting table columns: ID | Name | Location)
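For reference, a data file matching the first, '#'-delimited employee table above would look
like this (the values are made-up):
$ cat /home/hduser/datasets/employee_sample
101#Amit#Kolkata
102#Maria#Delhi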
Alternative Way
Copy the structure, that is the schema, of an existing table:
hive> create database db_2;
hive> create table db_2.emp LIKE db_1.employee;
To get more details about a table use:
hive> describe db_2.emp;
hive> describe extended db_2.emp;
hive> describe FORMATTED db_2.emp;
We can also create a table & insert data into it at the same time using a single query
(CREATE TABLE AS SELECT):
hive(db_2)> create table tb_9 AS
Select * from emp;
Also, we can create a new table by subsetting columns:
hive(db_2)> create table tb_10 AS
Select ID, location from emp;
Hive tables come in two different types:
1) Internal Table: whenever we drop the table, Hive also deletes the physical
file (the table data) from the directory created by Hive. This is why it is not
convenient when it comes to sharing the same data with other applications.
hive(db_2)> Create table IF NOT EXISTS emp (ID int, name string, location string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE;
hive(db_2)> load data local INPATH '/home/hduser/datasets/htable'
OVERWRITE INTO table emp;
$ hadoop fs -cat /user/cloudera/hive/emp/htable
Note: if we want to perform the same steps again just use
hive> truncate table emp;
it will delete all the rows in the table and not the table itself.
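After the DROP TABLE emp shown in the next section, the data directory disappears along with
the table; a quick check (a sketch, using the db_2 location from earlier):
$ hadoop fs -ls /user/cloudera/hive/emp
output: no such file or directory - the data was deleted together with the internal table.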
2) External Table: is the opposite of the internal table. Whenever the table is
dropped, only the metadata is removed, not the physical file (the data).
hive(db_2)> Create EXTERNAL TABLE empExternal
(ID int, name string, location string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '#'
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/user/cloudera/hiveExternal';
hive(db_2)> select * from empExternal;
-------------------empty----- data is not copied -------------
$ hadoop fs -put datasets/file1 /user/cloudera/hiveExternal
hive(db_2)> select * from empExternal;
#now drop the tables
hive(db_2)> DROP table emp;
hive(db_2)> DROP table empExternal;
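In contrast, after dropping the external table only the metadata is removed; the file we put
into HDFS is still there. A quick check (a sketch, using the location above):
$ hadoop fs -ls /user/cloudera/hiveExternal
output: file1 is still listed - only the table definition was removed.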
Load data
 Load data locally:
hive(db_1)> LOAD DATA LOCAL INPATH 'datasets/file1'
OVERWRITE INTO TABLE employees;
 Load data from HDFS:
hive(db_1)> LOAD DATA INPATH 'hivedata/file1' OVERWRITE
INTO TABLE employees;
 Load all the data from a folder instead of a single file:
hive(db_1)> LOAD DATA LOCAL INPATH 'datasets/' OVERWRITE
INTO TABLE employees;
Note: whenever we load data from HDFS, the data is moved
from its source directory to the destination directory,
preventing duplicate copies inside HDFS; however, in local
mode the file is copied, so another copy of it is created in HDFS.
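A quick way to see this move-vs-copy behaviour (a sketch; the warehouse path below assumes
db_1 uses the default warehouse location):
$ hadoop fs -ls hivedata/
output: after the HDFS load, file1 has been moved out of hivedata/
$ hadoop fs -ls /user/hive/warehouse/db_1.db/employees/
output: the loaded file(s) now sit in the table's warehouse directory, while a locally
loaded file also remains untouched on the local disk.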
Export data from Hive
hive> INSERT OVERWRITE LOCAL DIRECTORY
'/home/hduser/datasets/' select * from emp;
(Careful: OVERWRITE replaces the existing contents of the target directory, and directory
exports support only OVERWRITE, not INSERT INTO.)
OR inside HDFS:
hive> INSERT OVERWRITE DIRECTORY '<directory in the HDFS>' select * from emp;
Or manually copy the file from HDFS, like:
$ hadoop fs -get /user/cloudera/emp /home/hduser/datasets
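By default the exported files use Hive's default field delimiter (Ctrl-A). Since Hive 0.11 a
row format can be specified on the export itself; a sketch (the target path is a made-up
example and the exact output file name, e.g. 000000_0, may vary):
hive> INSERT OVERWRITE LOCAL DIRECTORY '/home/hduser/datasets/emp_export'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
select * from emp;
$ cat /home/hduser/datasets/emp_export/000000_0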
Next
 Partitioned Table, HQL
Rupak Roy