Unit-5
Apache Hive
Introduction
 Most data scientists use SQL queries to explore data and
draw insights from it. As data volumes grow rapidly, we need
dedicated tools that can handle big data.
 Hadoop emerged as one of the most popular tools for storing
and processing big data, but developers had to write complex
map-reduce code to work with it. This is where Apache Hive,
originally developed at Facebook, came to the rescue. It is a
tool designed to work with Hadoop: we write SQL-like queries
in Hive, and in the backend it converts them into map-reduce
jobs.
What is Apache Hive?
 Apache Hive is a data warehouse system developed by Facebook to
process large amounts of structured data in Hadoop. To process
data with Hadoop directly, we need to write complex map-reduce
functions, which is not an easy task for most developers. Hive
makes this work much easier.
 It uses a query language called HiveQL, which is very similar to
SQL. We only have to write SQL-like commands, and Hive
automatically converts them into map-reduce jobs in the backend.
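To give a feel for what "SQL-like" means here, the following is a minimal sketch of a HiveQL aggregation. The table page_views and its columns are hypothetical names chosen for illustration. On a cluster that uses the map-reduce execution engine, Hive would typically compile a query like this into one or more map-reduce jobs.

```sql
-- Hypothetical table: page_views(user_id BIGINT, url STRING, view_date STRING)
-- Count views per URL for a single day and list the ten most viewed URLs.
SELECT url, COUNT(*) AS views
FROM page_views
WHERE view_date = '2024-01-01'
GROUP BY url
ORDER BY views DESC
LIMIT 10;
```

Written directly as map-reduce code, the same grouping and sorting would need hand-written mapper and reducer classes.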
Apache Hive Architecture
 Hive Clients: Hive supports applications written against
different types of clients, such as the Thrift server, the JDBC
driver for Java applications, and the ODBC driver for
applications that use the ODBC protocol.
 Hive Services: To process data, we use Hive services such as
the Hive CLI (command-line interface). Hive also provides a
web-based interface for running Hive applications.
 Hive Driver: It receives queries from multiple sources, such
as Thrift, JDBC, and ODBC clients through the Hive server, and
directly from the Hive CLI and the web-based UI. After
receiving a query, it passes it to the compiler.
 HiveQL Engine: It receives the compiled query and converts
the SQL-like query into map-reduce jobs.
 Meta Store: Hive stores meta-information about the databases
here, such as table schemas, column data types, and the
location of the data in HDFS.
 HDFS: The Hadoop Distributed File System, used to store the
data.
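The information kept in the meta store can be inspected from HiveQL itself. The sketch below uses standard Hive statements; the database sales_db and the table orders are hypothetical names used only for illustration.

```sql
-- List the databases and tables registered in the meta store.
SHOW DATABASES;
SHOW TABLES IN sales_db;             -- sales_db is a hypothetical database

-- Show the column names and data types of a table.
DESCRIBE sales_db.orders;            -- orders is a hypothetical table

-- Show extended metadata, including the table's storage location in HDFS.
DESCRIBE FORMATTED sales_db.orders;
```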
Working of Apache Hive
1. We write a query using the web interface or the command-line interface of Hive. The interface sends it to the
driver for execution.
2. The driver sends the query to the compiler, which verifies the syntax.
3. Once the syntax is verified, the compiler requests metadata from the meta store.
4. The meta store responds to the compiler with the metadata, such as the database, the tables, and the data types of
the columns.
5. The compiler checks the query against the metadata received from the meta store and sends the execution plan to
the driver.
6. The driver sends the execution plan to the HiveQL process engine, which converts the query into a map-reduce
job.
7. The engine sends the task information to Hadoop, where processing of the query begins. At the same time, it
updates the metadata about the map-reduce job in the meta store.
8. Once processing is done, the execution engine receives the results of the query.
9. The execution engine transfers the results to the driver, which finally sends them to the Hive user interface, where
we can see them.
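The execution plan produced in steps 2 to 5 can be viewed without running the query by prefixing it with EXPLAIN. This is a small sketch; the table and column names are hypothetical, and the exact layout of the plan output depends on the Hive version and the execution engine in use.

```sql
-- EXPLAIN prints the plan Hive would run (its stages and operators)
-- without executing the query itself.
EXPLAIN
SELECT customer_id, SUM(amount) AS total
FROM sales_db.orders                 -- hypothetical table
GROUP BY customer_id;
```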
Data Types in Apache Hive
Hive data types are divided into the following 5
categories:
 Numeric Types: TINYINT, SMALLINT, INT,
BIGINT, FLOAT, DOUBLE, DECIMAL
 Date/Time Types: TIMESTAMP, DATE,
INTERVAL
 String Types: STRING, VARCHAR, CHAR
 Complex Types: STRUCT, MAP, UNIONTYPE, ARRAY
 Misc Types: BOOLEAN, BINARY
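As an illustration of how these types appear in practice, the sketch below declares a table that mixes primitive and complex types and then reads individual elements from the complex columns. The table employees and all of its columns are hypothetical.

```sql
-- Hypothetical table mixing primitive and complex types.
CREATE TABLE IF NOT EXISTS employees (
  id         INT,
  name       STRING,
  hired_on   DATE,
  is_active  BOOLEAN,
  skills     ARRAY<STRING>,                      -- ordered list of values
  phone      MAP<STRING, STRING>,                -- key/value pairs
  address    STRUCT<street:STRING, city:STRING>  -- named fields
);

-- Array elements are read by index, map values by key,
-- and struct fields with dot notation.
SELECT name,
       skills[0]     AS first_skill,
       phone['home'] AS home_phone,
       address.city  AS city
FROM employees;
```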
Create and Drop Database
 Creating and dropping a database is simple and
similar to SQL. Each database in Hive needs a
unique name. If a database with that name already
exists, Hive reports an error; to avoid it, add the
keywords IF NOT EXISTS after the DATABASE
keyword.
CREATE DATABASE <<database_name>> ;
 Dropping a database is just as simple: write DROP
DATABASE followed by the name of the database to
drop. If the database does not exist, Hive raises a
SemanticException error.
DROP DATABASE <<database_name>> ;
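A short sketch of both statements with their guard clauses follows. The database name sales_db is hypothetical.

```sql
-- Create the database only if it is not already present,
-- so re-running the script does not fail.
CREATE DATABASE IF NOT EXISTS sales_db;

-- Drop the database only if it exists, which avoids the SemanticException.
-- CASCADE also drops any tables inside it; without CASCADE, dropping a
-- database that still contains tables fails.
DROP DATABASE IF EXISTS sales_db CASCADE;
```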
Create Table
We use the CREATE TABLE statement to create a table, giving the
table name, the columns with their data types, and optionally the
row format and the storage format.
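The following is a minimal sketch of a CREATE TABLE statement for a comma-separated text file, not the full grammar. The database, table, and column names are hypothetical.

```sql
-- Hypothetical orders table backed by comma-separated text files.
CREATE TABLE IF NOT EXISTS sales_db.orders (
  order_id     BIGINT,
  customer_id  BIGINT,
  amount       DECIMAL(10,2),
  order_date   DATE
)
COMMENT 'Hypothetical orders table used for illustration'
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','   -- one record per line, fields split on commas
STORED AS TEXTFILE;
```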
Load Data into Table
Now that the table has been created, it is time to load
data into it. We can load data from a local file on
our system using the following syntax.
LOAD DATA LOCAL INPATH '<<path of file on your local system>>' INTO
TABLE <<database_name.>><<table_name>> ;
When we work with a huge amount of data, some rows may contain values that
do not match the declared column types. Hive does not validate the data against
the table schema at load time and does not throw an error; values that cannot be
read as the column type are returned as NULL when queried. This is a very useful
feature, as loading big data files into Hive is an expensive process and we
do not want to reload the entire dataset just because of a few bad rows.
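A sketch of loading a file and then checking for values that could not be parsed is shown below. The file path and the table are hypothetical. Note that a NULL in the result may also come from a field that was genuinely empty in the source file, so the count is an upper bound on type mismatches.

```sql
-- Copy a local file into the table's storage directory.
-- The path must be given as a quoted string.
LOAD DATA LOCAL INPATH '/tmp/orders.csv'      -- hypothetical path
INTO TABLE sales_db.orders;                   -- hypothetical table

-- Count rows whose amount is NULL, for example because the text
-- in that field could not be read as a DECIMAL.
SELECT COUNT(*) AS null_amount_rows
FROM sales_db.orders
WHERE amount IS NULL;
```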
Alter Table
 In Hive, we can modify existing tables in several
ways, such as renaming a table or adding columns
to it. The commands to alter a table are very
similar to their SQL counterparts.
 Here is the syntax to rename a table:
ALTER TABLE <<table_name>> RENAME TO <<new_name>> ;
 Syntax to add more columns to the table:
-- to add more columns
ALTER TABLE <<table_name>> ADD COLUMNS (new_column_name_1 data_type_1, new_column_name_2 data_type_2, ... new_column_name_n data_type_n) ;
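Applied to a concrete table, the two statements might look like the sketch below. The table and column names are hypothetical.

```sql
-- Rename a hypothetical table.
ALTER TABLE orders RENAME TO customer_orders;

-- Append two columns. New columns are added after the existing ones,
-- and rows loaded earlier return NULL for them.
ALTER TABLE customer_orders ADD COLUMNS (
  channel   STRING COMMENT 'hypothetical sales channel',
  discount  DECIMAL(5,2)
);

-- Confirm the new schema.
DESCRIBE customer_orders;
```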
Advantages/Disadvantages of Apache Hive
Advantages:
1. Uses an SQL-like query language that is already
familiar to most developers, which makes it easy
to use.
2. It is highly scalable, so it can process very large
datasets.
3. Supports multiple databases, such as MySQL, Derby,
Postgres, and Oracle, for its metastore.
4. Supports multiple data formats and allows
partitioning and bucketing (and, in Hive versions
before 3.0, indexing) for query optimization.
Disadvantages:
5. Can only deal with cold data and is not suited to
processing real-time data.
6. It is comparatively slower than some of its
competitors. If your use case is mostly batch
processing, Hive is a good fit.
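Partitioning and bucketing, mentioned in point 4, are declared when the table is created. The sketch below assumes a hypothetical table and an arbitrary bucket count of 16; suitable partition columns and bucket counts depend on the data.

```sql
-- Hypothetical table partitioned by date and bucketed by customer.
CREATE TABLE IF NOT EXISTS sales_db.orders_part (
  order_id     BIGINT,
  customer_id  BIGINT,
  amount       DECIMAL(10,2)
)
PARTITIONED BY (order_date STRING)               -- one directory per date
CLUSTERED BY (customer_id) INTO 16 BUCKETS       -- rows hashed into 16 files
STORED AS ORC;

-- A filter on the partition column lets Hive read only the
-- matching partition instead of scanning the whole table.
SELECT SUM(amount) AS daily_total
FROM sales_db.orders_part
WHERE order_date = '2024-01-01';
```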
