Introduction to Hive - SQL for Hadoop Data

Agenda
• Origin – The Making of Hive Story
• What is Hive?
• Why use Hive?
• Hive Architecture
• Hive Metastore
• Configuring Hive
• Important metastore configuration properties
• Comparison with Traditional Databases
• Hive Data Types
• Hive Table Types
• Store Hive table to HDFS file

Origin – The Making of Hive Story

What is Hive?
• Hive is a data warehouse infrastructure built on top of Hadoop that compiles SQL queries into MapReduce jobs and runs them on the cluster
• Associates structure with a variety of data formats
  • Logical table -> physical location
  • Logical table -> physical data format handler (SerDe)
• Integrates with HDFS, HBase, MongoDB, etc.
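
As a minimal sketch of how a logical table is tied to a physical location and a format handler, the HiveQL below declares a table over tab-separated text files. The table name, columns, and HDFS path are hypothetical and chosen only for illustration.

```sql
-- Hypothetical table: name, columns, and path are illustrative only.
-- ROW FORMAT DELIMITED selects Hive's built-in delimited-text handling (tab-separated here).
-- LOCATION names the HDFS directory that holds the data files.
CREATE EXTERNAL TABLE page_views (
  view_time STRING,
  user_id   BIGINT,
  url       STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';
```

The statement only records metadata; the files under the given directory are not read or rewritten when the table is created.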

Why use Hive?
• Raw MapReduce caters to developers who write jobs in code
• Hive runs SQL-like queries that are compiled and run as MapReduce jobs
• Data in Hadoop, although generally unstructured, usually has some structure associated with it
• We get the benefits of MapReduce + HDFS (Hadoop):
  • Fault tolerant
  • Robust
  • Scalable
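
To make the "SQL-like queries" point concrete, here is a small sketch of an aggregation that Hive would compile into one or more cluster jobs. It assumes a hypothetical table page_views(view_time STRING, user_id BIGINT, url STRING); nothing about this table comes from the slides.

```sql
-- Assumes a hypothetical table page_views(view_time STRING, user_id BIGINT, url STRING).
-- Ten most viewed URLs: the grouping and sorting run as jobs on the cluster.
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10;
```

Prefixing the query with EXPLAIN prints the plan of stages Hive intends to run instead of executing it, which is a quick way to see how the SQL maps onto jobs.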

Configuring Hive
To point Hive at a configuration directory containing hive-site.xml:
% hive --config /Users/tom/dev/hive-conf
To set individual properties for a single session on the command line:
% hive -hiveconf fs.defaultFS=hdfs://localhost \
  -hiveconf mapreduce.framework.name=yarn \
  -hiveconf yarn.resourcemanager.address=localhost:8032
To change a property from within the Hive shell:
hive> SET hive.execution.engine=tez;
To set the logging directory:
% hive -hiveconf hive.log.dir='/tmp/${user.name}'
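
When several of these mechanisms are combined, it can help to check which value actually took effect. The sketch below uses forms of the SET command that can be typed in the Hive shell; the property name is the one used in the examples above.

```sql
-- Print the current value of a single property.
SET hive.execution.engine;
-- List the properties that Hive has set.
SET;
-- List every property, including the Hadoop defaults.
SET -v;
```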

Important metastore configuration properties

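Comparison with Traditional Databases
• Schema on read versus schema on write: a traditional database checks data against the schema when it is loaded, whereas Hive does not verify data at load time and applies the schema only when a query reads it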
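
A short sketch of schema on read in practice follows. The table name, columns, and file path are hypothetical; the point is only where the schema gets applied.

```sql
-- Hypothetical table and file path, for illustration only.
CREATE TABLE readings (sensor_id INT, temperature DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- The load is a file move; no row is parsed or checked against the schema here.
LOAD DATA INPATH '/user/tom/readings.csv' INTO TABLE readings;

-- The schema is applied at query time. A text field that cannot be parsed as the
-- declared type (for example 'n/a' in the temperature column) is returned as NULL.
SELECT sensor_id, temperature
FROM readings
WHERE temperature IS NULL;
```

The trade-off is a very fast initial load in exchange for discovering malformed data only when it is queried.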

Hive Table Types
• Managed Tables
  • CREATE TABLE managed_table (dummy STRING);
  • LOAD DATA INPATH '/user/tom/data.txt' INTO TABLE managed_table;
  • This moves the HDFS file /user/tom/data.txt into Hive's warehouse directory for the managed_table table, which is /user/hive/warehouse/managed_table.
• External Tables
  • CREATE EXTERNAL TABLE external_table (dummy STRING)
    LOCATION '/user/tom/external_table';
  • LOAD DATA INPATH '/user/tom/data.txt' INTO TABLE external_table;
  • Hive does not manage this data, so it stays at the given LOCATION rather than in the warehouse directory.
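
The practical difference between the two table types shows up when a table is dropped. The sketch below reuses the managed_table and external_table definitions from this slide; the comments describe the usual Hive behavior.

```sql
-- Managed table: removes the metadata and deletes the data in the warehouse directory.
DROP TABLE managed_table;

-- External table: removes only the metadata; the files under
-- /user/tom/external_table are left in place.
DROP TABLE external_table;
```

As a rule of thumb, an external table suits data that other tools also read, since dropping the table cannot delete it.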
