Apache Hive
Prepared by : Abhishek Gautam
 Hive is a data warehouse application which is used for
summarizing, querying, and analyzing large amounts of
data stored on HDFS.
 It runs batch queries on structured data using a
language similar to SQL.
 Its query language is non-procedural (declarative).
 It sits on top of Hadoop to summarize Big Data and
makes querying and analysis easier.
 It is typically used by data analysts.
 The language is called HQL (HiveQL).
 External Table
Use this type of table when you want the data files to
remain in HDFS even after the table is dropped.
Dropping an external table removes only the metadata.
Hive no longer knows about the data, but it does not
touch the data files themselves.
 Syntax:
CREATE EXTERNAL TABLE Table_Name
(Column_Name Datatype)
[ROW FORMAT row_format]
[STORED AS file_format];
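As a rough illustration of the syntax above, the following sketch creates an external table over files that already sit in HDFS and then drops it. The table name, columns, delimiter, and HDFS path are all hypothetical; the LOCATION clause is optional and is included here only to show where the data would live.

```sql
-- Hypothetical table, columns, and HDFS path, for illustration only.
CREATE EXTERNAL TABLE web_logs (
  ip  STRING,
  url STRING,
  ts  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/web_logs';

-- Removes only the table definition from the metastore;
-- with default settings the files under /data/web_logs stay in HDFS.
DROP TABLE web_logs;
```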
 Internal Table
Use this type of table when you do not need to keep the
data files in HDFS after the table is dropped.
Dropping an internal (managed) table removes both the
data and the metadata.
 Syntax:
CREATE TABLE Table_Name(Column_Name Datatype)
[ROW FORMAT row_format]
[STORED AS file_format];
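A minimal sketch of the internal table life cycle, assuming a hypothetical staging_orders table and a hypothetical comma-delimited input file at /tmp/orders.csv in HDFS. LOAD DATA INPATH moves the file into the table's warehouse directory, so the later DROP takes the data with it.

```sql
-- Hypothetical table and input path, for illustration only.
CREATE TABLE staging_orders (
  order_id INT,
  amount   DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Moves the HDFS file into the directory Hive manages for this table.
LOAD DATA INPATH '/tmp/orders.csv' INTO TABLE staging_orders;

-- Removes the metadata and the table's data files.
DROP TABLE staging_orders;
```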
 Partitioned Table:
Big Data workloads deal with large datasets, which take
a long time to process and query. To speed up queries,
Hive organizes tables into partitions. Partitioning
divides a table into related parts based on the values
of partition columns such as date, city, or department.
With partitions, a query can read only the relevant
portion of the data.
 Example:
CREATE TABLE Client(id INT,Name STRING, City STRING)
PARTITIONED BY(country STRING)
STORED AS TEXTFILE;
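To show how the partition column is used, here is a small sketch against the Client table defined above. The row values are made up, and INSERT ... VALUES assumes a Hive release that supports it (0.14 or later). The filter on country lets Hive read only the matching partition.

```sql
-- Sample row is invented for illustration.
INSERT INTO TABLE Client PARTITION (country = 'IN')
VALUES (1, 'Asha', 'Pune');

-- Lists the partitions Hive knows about for this table.
SHOW PARTITIONS Client;

-- Filtering on the partition column restricts the scan to that partition.
SELECT id, Name, City
FROM Client
WHERE country = 'IN';
```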
 Bucket Table:
Partitioning can produce many small partitions when the
partition column has many distinct values. Bucketing
instead fixes the number of buckets used to store the
data, and rows are assigned to buckets by hashing the
clustering column. The number of buckets is defined
when the table is created.
 Example:
CREATE TABLE Client(id INT, Name STRING, City STRING)
PARTITIONED BY(country STRING)
CLUSTERED BY (City) INTO 32 BUCKETS
STORED AS TEXTFILE;
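A possible way to populate and sample the bucketed Client table above is sketched here. The row values are invented, and the note about hive.enforce.bucketing applies only to older Hive releases, so treat it as an assumption about your environment rather than a required step.

```sql
-- On Hive releases before 2.0, bucketing is typically enabled first with:
--   SET hive.enforce.bucketing = true;

-- Sample rows are invented for illustration.
INSERT INTO TABLE Client PARTITION (country = 'IN')
VALUES (1, 'Asha', 'Pune'), (2, 'Ravi', 'Delhi');

-- Reads one of the 32 buckets, chosen by hashing City.
SELECT id, Name, City
FROM Client TABLESAMPLE(BUCKET 1 OUT OF 32 ON City)
WHERE country = 'IN';
```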
 Create:
Syntax : CREATE TABLE
Table_Name(Column_Name DATATYPE);
 ALTER:
Syntax : ALTER TABLE Table_Name Operations;
We can add a partition to an existing partitioned table
with the ALTER operation. The partition is identified by
a value of the partition column, not by a data type.
Syntax : ALTER TABLE Table_Name ADD PARTITION
(Partition_Col = 'value');
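A few concrete ALTER statements, assuming the partitioned Client table from the earlier example. The Email column and the new table name Customer are hypothetical and are here only to show the shape of each operation.

```sql
-- Add a partition for a specific value of the partition column.
ALTER TABLE Client ADD IF NOT EXISTS PARTITION (country = 'US');

-- Add a new column (hypothetical) to the table definition.
ALTER TABLE Client ADD COLUMNS (Email STRING);

-- Rename the table (hypothetical new name).
ALTER TABLE Client RENAME TO Customer;
```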
 DROP :
Syntax : DROP TABLE Table_Name;
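As a small example, the statement below drops the Client table used in the earlier slides. IF EXISTS is optional and simply avoids an error when the table is not there.

```sql
DROP TABLE IF EXISTS Client;
```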
 SELECT Query:
Syntax : SELECT [ALL | DISTINCT] * FROM
Table_Name WHERE where_condition ;
We can limit the number of rows returned by a SELECT
statement with the LIMIT option.
Syntax : SELECT [ALL | DISTINCT] * FROM
Table_Name WHERE where_condition
LIMIT 10;
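For instance, assuming the Client table from the earlier slides, a query combining DISTINCT, WHERE, and LIMIT might look like this sketch. The filter value is made up.

```sql
-- Returns at most 10 distinct cities for one country.
SELECT DISTINCT City
FROM Client
WHERE country = 'IN'
LIMIT 10;
```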
 GROUP BY:
The GROUP BY clause groups result rows by the columns
named in it, usually so that an aggregate function can
be applied to each group.
Syntax : SELECT Col_list FROM Table_Name
WHERE where_condition
GROUP BY col_list ;
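A short sketch of GROUP BY with an aggregate, again assuming the Client table from the earlier slides. The alias client_count is just an illustrative name.

```sql
-- Counts clients per city within one country.
SELECT City, COUNT(*) AS client_count
FROM Client
WHERE country = 'IN'
GROUP BY City;
```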
 ORDER BY:
The ORDER BY clause sorts the result in ascending or
descending order based on the column named in the
clause. Ascending order is the default. To sort the
result in descending order, add the DESC keyword.
Syntax : SELECT Col_list FROM Table_Name
WHERE where_condition
GROUP BY col_list
ORDER BY col_name DESC ;
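Putting the clauses together, the sketch below ranks cities by client count, assuming the Client table from the earlier slides. The LIMIT is optional in general, though some Hive configurations (strict mode) expect one whenever ORDER BY is used.

```sql
-- Top 5 cities by number of clients, largest first.
SELECT City, COUNT(*) AS client_count
FROM Client
WHERE country = 'IN'
GROUP BY City
ORDER BY client_count DESC
LIMIT 5;
```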
 JOIN:
The JOIN clause combines rows from two or more tables
that share a common column.
Syntax: SELECT a.col_name,b.col_name
FROM Table1_Name a
JOIN Table2_Name b
ON a.col_Name=b.col_Name;
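As an illustration, the sketch below joins the Client table from the earlier slides to a hypothetical Orders table. The Orders table and its columns (order_id, client_id, amount) are assumptions made for this example only.

```sql
-- Orders is hypothetical: Orders(order_id INT, client_id INT, amount DOUBLE).
SELECT c.Name, o.order_id, o.amount
FROM Client c
JOIN Orders o
ON c.id = o.client_id;
```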
