WDABT 2016 – BHARATHIAR UNIVERSITY
Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
Why HIVE
• Structured Data
• Large Data Set
• MapReduce
• Parallel Distribution
• Query Data
Features of hive
HIVE Architecture
• User Interface: Web UI, Hive Command Line, HD Insight
• Meta Store
• Hive QL Process Engine
• Execution Engine
• HDFS or HBASE Storage System
Embedded Metastore
Local Metastore
Remote Metastore
Hive File formats
• Text Files - Delimited by Parameters
• Sequence Files - Less Data
• RC Files - Analytic Processing
• ORC Files - Optimized Row Columnar, a binary file format
HIVE Query Language (HQL)
Hive query language offers:
• Create databases
• Create, manage and partition tables
• Supports various operators (relational, arithmetic and logical) to evaluate expressions
• Supports DDL and DML statements
DDL (Data Definition Language) Statements
The DDL commands are listed below:
Create, Alter, Drop database
Create, Alter, Drop, Truncate table
Create, Alter with Partitioning and Bucketing
Create Views
Show
Describe
DML (Data Manipulation Language) Statements
Loading files
Inserting data into Hive tables from queries
Database Operations
Syntax
CREATE DATABASE IF NOT EXISTS db_name
COMMENT 'db_name Details'
WITH DBPROPERTIES ('creator' = 'name');
Example
CREATE DATABASE IF NOT EXISTS LIBDETS
COMMENT 'LIBRARY DETAILS'
WITH DBPROPERTIES ('creator' = 'KIRUTHI');
Database Operations
Syntax
-- displays the available databases
SHOW DATABASES;
Example
SHOW DATABASES;
Syntax
-- displays the schema of the database
DESCRIBE DATABASE db_name;
DESCRIBE DATABASE EXTENDED db_name;
Example
DESCRIBE DATABASE LIBDETS;
DESCRIBE DATABASE EXTENDED LIBDETS;
ALTER Database
Syntax
-- alters the database properties
ALTER DATABASE db_name
SET DBPROPERTIES ('edited-by' = 'name');
Example
ALTER DATABASE LIBDETS
SET DBPROPERTIES ('edited-by' = 'KANI');
USE, DROP Database
Syntax
-- sets the database as the current working database
USE db_name;
Example
USE LIBDETS;
Syntax
-- deletes the database
DROP DATABASE db_name;
Example
DROP DATABASE LIBDETS;
TABLES
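Hive supports two types of tables
Managed Table – Data is stored in the Hive warehouse folder; dropping the
table deletes both the data and the metadata
External Table – Data stays in the specified location even when the table
is dropped; only the metadata (schema) is removed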
Managed Table
Creating Managed Table
Syntax
CREATE TABLE IF NOT EXISTS tb_name (column_name data_type,
column_name data_type, column_name data_type)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
Example
CREATE TABLE IF NOT EXISTS LIBTBL (Member_Code INT,
Member_Name STRING, Designation STRING, Dept_code INT,
dept_name STRING, group_name STRING, course_name STRING,
title STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
External Table
Creating External Table
Syntax
CREATE EXTERNAL TABLE IF NOT EXISTS tb_name
(column_name data_type, column_name data_type,
column_name data_type)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/home/usr/filename.format';
Example
CREATE EXTERNAL TABLE IF NOT EXISTS LIBTBL
(Member_Code INT, Member_Name STRING, Designation
STRING, Dept_code INT, course_code INT, dept_name STRING,
group_name STRING, course_name STRING, title STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/home/livrith/Desktop/Book2.csv';
Loading Data into Table
Syntax
LOAD DATA LOCAL INPATH
'local_file_or_directory_path'
OVERWRITE INTO TABLE tb_name;
Example
LOAD DATA LOCAL INPATH
'/home/kiruthika/Documents/Book2.csv'
OVERWRITE INTO TABLE LIBTBL;
Select clause
Syntax
SELECT [ALL | DISTINCT] select_expr, select_expr, . . .
FROM tb_name
[WHERE where_condition]
[GROUP BY column_name]
[HAVING having_condition]
[ORDER BY column_name]
[DISTRIBUTE BY column_name]
[LIMIT number];
Example:1
SELECT * FROM LIBTBL;
Example:2
SELECT Member_Name, Designation FROM LIBTBL;
Select – where
Example
SELECT * FROM LIBUDET WHERE group_name = 'TEACHING'
OR group_name = 'student'
AND Dept_name >= '18';
Select - pattern matching (LIKE)
Syntax
SELECT column1, column2, column3 FROM tb_name WHERE
column_name LIKE '%alp%';
Example
SELECT PRODUCT, STATE, CITY FROM SALESDETS
WHERE City LIKE '%O%';
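The LIKE patterns above rely on two wildcards: % matches any run of characters (including none) and _ matches exactly one character. A minimal Python sketch of that matching rule follows. It is not Hive code; the helper names like_to_regex and like_match and the sample city names are assumptions made for illustration.

```python
import re

def like_to_regex(pattern: str) -> str:
    """Translate a SQL LIKE pattern into an equivalent regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # % matches any sequence, including empty
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    return "".join(parts)

def like_match(value: str, pattern: str) -> bool:
    # The whole value must match, and matching is case-sensitive.
    return re.fullmatch(like_to_regex(pattern), value, flags=re.DOTALL) is not None

cities = ["LONDON", "Paris", "OSLO", "Rome"]  # made-up sample values
matches = [c for c in cities if like_match(c, "%O%")]
print(matches)
```

Only values containing an upper-case O are kept, which mirrors what the City LIKE '%O%' filter would select.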
Group by
Example
SELECT PRODUCT, COUNT(PRODUCT) AS C1, STATE,
COUNTRY FROM SALESDETS GROUP BY PRODUCT,
STATE, COUNTRY;
Order by // Sorts the complete result; uses only one reducer
Example
SELECT PRODUCT, STATE, PRICE, COUNTRY FROM
SALESDETS
ORDER BY COUNTRY;
Sort by // Sorts the rows within each reducer, so the overall output is only partially ordered
Example
SELECT PRODUCT, STATE, COUNTRY FROM SALESDETS
SORT BY COUNTRY
LIMIT 10;
Having // Filters groups produced by Group By
Example
SELECT PRODUCT, COUNT(PRODUCT) AS
C1, STATE, COUNTRY FROM SALESDETS
GROUP BY PRODUCT, STATE, COUNTRY
HAVING COUNT(PRODUCT) > 5;
Limit
Example
SELECT PRODUCT, STATE, PRICE, COUNTRY FROM
SALESDETS LIMIT 10;
Distribute by // Distributes rows among reducers
Syntax
SELECT column_name1, column_name2, column_name3 FROM
tb_name DISTRIBUTE BY column_name SORT BY column_name
ASC, column_name ASC LIMIT count;
Example
SELECT PRODUCT,PRICE,STATE FROM SALESDETS
DISTRIBUTE BY STATE
SORT BY STATE ASC, PRODUCT ASC
LIMIT 50;
Cluster by // Does the job of both distribute by and sort by on the same column
Example
SELECT PRODUCT, PRICE, STATE FROM SALESDETS
CLUSTER BY STATE LIMIT 50;
Difference in Execution of Order By, Sort By, Distribute By, Cluster By
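The difference between the four clauses can be sketched with a small Python simulation. This is an illustration only, not Hive's implementation: the reducer count, the sample rows, and the use of crc32 as a stand-in for Hive's hash partitioner are all assumptions.

```python
import zlib
from collections import defaultdict

NUM_REDUCERS = 2  # assumed reducer count for the illustration

rows = [  # (product, state), made-up sample rows
    ("Pen", "TN"), ("Ink", "KA"), ("Pad", "TN"),
    ("Cap", "AP"), ("Bag", "KA"), ("Cup", "AP"),
]

def reducer_for(key: str) -> int:
    # Stand-in for the hash partitioner; crc32 is deterministic across runs.
    return zlib.crc32(key.encode()) % NUM_REDUCERS

def order_by(rows, key):
    # ORDER BY: a single reducer sees every row, so the output is totally ordered.
    return [sorted(rows, key=key)]

def sort_by(rows, key):
    # SORT BY: rows reach reducers without regard to the key (round-robin here);
    # each reducer sorts only its own share.
    shares = [rows[i::NUM_REDUCERS] for i in range(NUM_REDUCERS)]
    return [sorted(share, key=key) for share in shares]

def distribute_by(rows, key):
    # DISTRIBUTE BY: rows with equal keys go to the same reducer, unsorted.
    shares = defaultdict(list)
    for row in rows:
        shares[reducer_for(key(row))].append(row)
    return [shares[i] for i in range(NUM_REDUCERS)]

def cluster_by(rows, key):
    # CLUSTER BY: DISTRIBUTE BY plus SORT BY on the same column.
    return [sorted(share, key=key) for share in distribute_by(rows, key)]

def state(row):
    return row[1]

for name, fn in [("ORDER BY", order_by), ("SORT BY", sort_by),
                 ("DISTRIBUTE BY", distribute_by), ("CLUSTER BY", cluster_by)]:
    print(name, fn(rows, state))
```

Each inner list stands for the output of one reducer. Only ORDER BY yields a single, globally sorted list; SORT BY can place the same state in several reducers, while DISTRIBUTE BY and CLUSTER BY keep each state in exactly one.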
Data Aggregation
COUNT
AVG, DISTINCT(AVG)
MIN, DISTINCT(MIN)
MAX, DISTINCT(MAX)
Partitions
Without partitions, Hive reads the entire dataset from the warehouse even
when a filter condition on a particular column is specified. This becomes a
bottleneck in MapReduce jobs and involves a large amount of I/O.
Partitioning breaks a large dataset into smaller chunks based on the values
of one or more columns.
Hive supports two types of partition
• Static partition
• Dynamic partition
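The I/O saving described above can be sketched in Python. This is a simplified model, not Hive itself: the sample rows and helper names are assumptions, and the dictionary keys imitate the column=value layout of one directory per partition value.

```python
from collections import defaultdict

# Made-up rows: (member_name, designation)
rows = [
    ("Anu", "Professor"), ("Bala", "Student"),
    ("Chitra", "Student"), ("Dev", "Professor"), ("Esha", "Staff"),
]

def scan_unpartitioned(rows, designation):
    """Full scan: every row is read, then filtered."""
    rows_read = len(rows)
    hits = [r for r in rows if r[1] == designation]
    return hits, rows_read

def build_partitions(rows):
    """Group rows by the partition column value, one chunk per value."""
    parts = defaultdict(list)
    for name, designation in rows:
        parts[f"designation={designation}"].append((name, designation))
    return parts

def scan_partitioned(parts, designation):
    """Partition pruning: only the chunk for the requested value is read."""
    chunk = parts.get(f"designation={designation}", [])
    return list(chunk), len(chunk)

parts = build_partitions(rows)
full_hits, full_read = scan_unpartitioned(rows, "Student")
part_hits, part_read = scan_partitioned(parts, "Student")
print(full_read, part_read)  # rows read without and with partitions
```

Both scans return the same rows, but the partitioned scan touches only the chunk for the requested value.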
Creating partition table
Syntax
CREATE TABLE tb_name (column1 datatype, column2
datatype, column3 datatype)
COMMENT 'Details of the dataset'
PARTITIONED BY (column_name STRING) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ',';
Example
CREATE TABLE MY_TABLE1 (Member_Name STRING, dept_name
STRING, group_name STRING, course_name STRING, title STRING)
COMMENT 'User information' PARTITIONED BY (Designation
STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY
',';
Load data into static partition table
Syntax
LOAD DATA LOCAL INPATH 'file_path' OVERWRITE
INTO TABLE tb_name [PARTITION (column_name = 'value')];
Example
LOAD DATA LOCAL INPATH
'/home/livrith/Desktop/mytab.csv' OVERWRITE INTO
TABLE MY_TABLE2;
Set dynamic partition
The following settings have to be enabled to execute
dynamic partition inserts.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
Example
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
Insert data - Dynamic partition table
Syntax
-- the partition field should be the last attribute when inserting data
INSERT OVERWRITE TABLE first_tb_name
PARTITION (column_name) SELECT
column_name1, column_name2, column_name3 FROM
second_tb_name;
Example
INSERT OVERWRITE TABLE MY_TABLE1
PARTITION (Designation)
SELECT Member_Name, dept_name, group_name,
course_name, title, Designation FROM MY_TABLE2;
Bucketing
Bucketing is similar to partitioning.
Each bucket is stored as a file.
Bucketing distributes rows into a fixed number of buckets based on the
hash of a specified column's value, whereas partitioning divides data
into small blocks by column value.
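How a row ends up in a bucket can be sketched as a hash of the bucketing column taken modulo the number of buckets. The Python below is a minimal illustration under stated assumptions: three buckets, made-up price values, and an integer column whose hash is taken to be the value itself.

```python
NUM_BUCKETS = 3  # assumed bucket count for the illustration

def bucket_for(value: int, num_buckets: int = NUM_BUCKETS) -> int:
    # Assumption: for an integer column the hash is the value itself.
    # Masking keeps the result non-negative before taking the modulus.
    return (value & 0x7FFFFFFF) % num_buckets

prices = [1200, 3600, 7500, 1200, 800, 3601]  # made-up sample values
buckets = {b: [] for b in range(NUM_BUCKETS)}
for price in prices:
    buckets[bucket_for(price)].append(price)
print(buckets)
```

Equal values always land in the same bucket, and the number of buckets stays fixed no matter how many distinct values the column holds, which is the main contrast with partitioning.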
Table creation
Syntax
CREATE TABLE IF NOT EXISTS tb_name (column1
datatype, column2 datatype, column3 datatype) CLUSTERED
BY (column_name) INTO 3 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY
'\t';
Example
CREATE TABLE SALES_BUC1 (Transaction_date
TIMESTAMP, Product STRING, Price INT, Payment_Type
STRING, Name STRING, City STRING, State STRING, Country
STRING, Account_Created TIMESTAMP) CLUSTERED BY
(Price) INTO 3 BUCKETS ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
Load data into table
Syntax
FROM first_tb_name INSERT OVERWRITE TABLE
second_tb_name
SELECT column_name1, column_name2, column_name3;
Example
FROM SALESDETS INSERT OVERWRITE TABLE
SALES_BUC1 SELECT
Transaction_date, Product, Price, Payment_Type, Name, City,
State, Country, Account_Created;
Select from bucket table
Syntax:1
SELECT DISTINCT column_name FROM tb_name
TABLESAMPLE (BUCKET 1 OUT OF 3 ON column_name);
Example
SELECT DISTINCT Price FROM SALES_BUC1
TABLESAMPLE (BUCKET 1 OUT OF 3 ON Price);
Syntax:2
SELECT DISTINCT column_name FROM tb_name
TABLESAMPLE (BUCKET 1 OUT OF 2 ON column_name);
Example
SELECT DISTINCT Price FROM SALES_BUC1
TABLESAMPLE (BUCKET 1 OUT OF 2 ON Price);
Sampling
• Sampling is used in Hive to draw a small dataset from an
existing large dataset. Sampling selects records
randomly to create the small dataset.
Syntax
SELECT COUNT(*) FROM tb_name TABLESAMPLE
(BUCKET 1 OUT OF 3 ON column_name);
Example
In the example given below, a sample is created from the table
sales_buc using one of the 3 available buckets.
SELECT COUNT(*) FROM SALES_BUC TABLESAMPLE
(BUCKET 1 OUT OF 3 ON Price);
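The selection rule behind TABLESAMPLE (BUCKET x OUT OF y ON column) can be sketched in Python: a row is kept when the hash of the column value modulo y equals x - 1. The function name, the sample prices, and the treatment of an integer's hash as the value itself are assumptions for illustration, not Hive's actual code.

```python
def tablesample_bucket(rows, x, y, key):
    """Keep the rows that fall into bucket x (1-based) out of y buckets."""
    if not 1 <= x <= y:
        raise ValueError("x must be between 1 and y")
    # Assumption: the hash of an integer key is the integer itself.
    return [r for r in rows if (key(r) & 0x7FFFFFFF) % y == x - 1]

prices = [1200, 3600, 7500, 800, 3601, 1300, 45, 46]  # made-up sample values
sample = tablesample_bucket(prices, 1, 3, key=lambda p: p)
print(len(sample), sample)
```

Taking buckets 1, 2 and 3 out of 3 in turn covers every row exactly once, so each bucket is a disjoint slice of the table.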
• Apache HBase is an open-source, distributed, versioned,
non-relational database modeled after Google's Bigtable
• Apache HBase provides Bigtable-like capabilities on top
of Hadoop and HDFS.
NoSQL Databases
• NoSQL – Not only SQL, Non-Relational/Non-SQL
Databases
• Schema-less
• Ideology
• BASE – Basically Available, Soft state, Eventual
consistency
• Only two properties can be supported:
availability, replication
NoSQL Types
• Key value store - Amazon S3, Riak
• Document based store – CouchDB, MongoDB
• Column based store - HBase, Cassandra
• Graph based store - Neo4j, OrientDB
HBASE is Not
• Table with one primary key (row key)
• No Join Operations
• Limited atomicity and transaction support
• Manipulated by SQL
HBase components
• Master - Manages load balancing and scripting
• Regionserver – Serves the range of tables assigned by the master
• Region server uses Memstore, similar to cache
memory
ZooKeeper –
• Clients communicate via ZooKeeper for read and write
operations in region servers; it stores node details
• Provides services for synchronization and maintenance
43
Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
44
Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016

More Related Content

What's hot

The Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture DataThe Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture DataPaul Groth
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...Carole Goble
 
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...Amanda Whitmire
 
End-to-End Learning for Answering Structured Queries Directly over Text
End-to-End Learning for  Answering Structured Queries Directly over Text End-to-End Learning for  Answering Structured Queries Directly over Text
End-to-End Learning for Answering Structured Queries Directly over Text Paul Groth
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data ManagementAmanda Whitmire
 
From Data Search to Data Showcasing
From Data Search to Data ShowcasingFrom Data Search to Data Showcasing
From Data Search to Data ShowcasingPaul Groth
 
Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...ijtsrd
 
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsCombining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsPaul Groth
 
Indexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record DeduplicationIndexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record Deduplicationidescitation
 
Machines are people too
Machines are people tooMachines are people too
Machines are people tooPaul Groth
 
On community-standards, data curation and scholarly communication - BITS, Ita...
On community-standards, data curation and scholarly communication - BITS, Ita...On community-standards, data curation and scholarly communication - BITS, Ita...
On community-standards, data curation and scholarly communication - BITS, Ita...Susanna-Assunta Sansone
 
Research Data Sharing: A Basic Framework
Research Data Sharing: A Basic FrameworkResearch Data Sharing: A Basic Framework
Research Data Sharing: A Basic FrameworkPaul Groth
 
Data management (1)
Data management (1)Data management (1)
Data management (1)SM Lalon
 
Modern association rule mining methods
Modern association rule mining methodsModern association rule mining methods
Modern association rule mining methodsijcsity
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationRinke Hoekstra
 
The need for a transparent data supply chain
The need for a transparent data supply chainThe need for a transparent data supply chain
The need for a transparent data supply chainPaul Groth
 

What's hot (18)

Ceis 1
Ceis 1Ceis 1
Ceis 1
 
The Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture DataThe Roots: Linked data and the foundations of successful Agriculture Data
The Roots: Linked data and the foundations of successful Agriculture Data
 
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
ISMB/ECCB 2013 Keynote Goble Results may vary: what is reproducible? why do o...
 
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
IDCC Workshop: Analysing DMPs to inform research data services: lessons from ...
 
End-to-End Learning for Answering Structured Queries Directly over Text
End-to-End Learning for  Answering Structured Queries Directly over Text End-to-End Learning for  Answering Structured Queries Directly over Text
End-to-End Learning for Answering Structured Queries Directly over Text
 
Introduction to Data Management
Introduction to Data ManagementIntroduction to Data Management
Introduction to Data Management
 
From Data Search to Data Showcasing
From Data Search to Data ShowcasingFrom Data Search to Data Showcasing
From Data Search to Data Showcasing
 
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
NISO Virtual Conference Scientific Data Management: Caring for Your Instituti...
 
Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...Experimental Result Analysis of Text Categorization using Clustering and Clas...
Experimental Result Analysis of Text Categorization using Clustering and Clas...
 
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge GraphsCombining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
Combining Explicit and Latent Web Semantics for Maintaining Knowledge Graphs
 
Indexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record DeduplicationIndexing based Genetic Programming Approach to Record Deduplication
Indexing based Genetic Programming Approach to Record Deduplication
 
Machines are people too
Machines are people tooMachines are people too
Machines are people too
 
On community-standards, data curation and scholarly communication - BITS, Ita...
On community-standards, data curation and scholarly communication - BITS, Ita...On community-standards, data curation and scholarly communication - BITS, Ita...
On community-standards, data curation and scholarly communication - BITS, Ita...
 
Research Data Sharing: A Basic Framework
Research Data Sharing: A Basic FrameworkResearch Data Sharing: A Basic Framework
Research Data Sharing: A Basic Framework
 
Data management (1)
Data management (1)Data management (1)
Data management (1)
 
Modern association rule mining methods
Modern association rule mining methodsModern association rule mining methods
Modern association rule mining methods
 
Prov-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance VisualizationProv-O-Viz: Interactive Provenance Visualization
Prov-O-Viz: Interactive Provenance Visualization
 
The need for a transparent data supply chain
The need for a transparent data supply chainThe need for a transparent data supply chain
The need for a transparent data supply chain
 

Viewers also liked

On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)Stéphane Fréchette
 
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Cloudera, Inc.
 
Programming Hive Reading #4
Programming Hive Reading #4Programming Hive Reading #4
Programming Hive Reading #4moai kids
 
Hiveハンズオン
HiveハンズオンHiveハンズオン
HiveハンズオンSatoshi Noto
 
Programming Hive Reading #3
Programming Hive Reading #3Programming Hive Reading #3
Programming Hive Reading #3moai kids
 
Hive Object Model
Hive Object ModelHive Object Model
Hive Object ModelZheng Shao
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...DataWorks Summit/Hadoop Summit
 

Viewers also liked (10)

On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
On the move with Big Data (Hadoop, Pig, Sqoop, SSIS...)
 
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
Hadoop World 2011: Replacing RDB/DW with Hadoop and Hive for Telco Big Data -...
 
Data analytics
Data analyticsData analytics
Data analytics
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
6.hive
6.hive6.hive
6.hive
 
Programming Hive Reading #4
Programming Hive Reading #4Programming Hive Reading #4
Programming Hive Reading #4
 
Hiveハンズオン
HiveハンズオンHiveハンズオン
Hiveハンズオン
 
Programming Hive Reading #3
Programming Hive Reading #3Programming Hive Reading #3
Programming Hive Reading #3
 
Hive Object Model
Hive Object ModelHive Object Model
Hive Object Model
 
How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...How to understand and analyze Apache Hive query execution plan for performanc...
How to understand and analyze Apache Hive query execution plan for performanc...
 

Similar to Big data analytics -hive

Data Discovery on Hadoop - Realizing the Full Potential of your Data
Data Discovery on Hadoop - Realizing the Full Potential of your DataData Discovery on Hadoop - Realizing the Full Potential of your Data
Data Discovery on Hadoop - Realizing the Full Potential of your DataDataWorks Summit
 
Introduction to DBMS.pptx
Introduction to DBMS.pptxIntroduction to DBMS.pptx
Introduction to DBMS.pptxShwetha Ch
 
Hadoop Summit San Jose 2014: Data Discovery on Hadoop
Hadoop Summit San Jose 2014: Data Discovery on Hadoop Hadoop Summit San Jose 2014: Data Discovery on Hadoop
Hadoop Summit San Jose 2014: Data Discovery on Hadoop Sumeet Singh
 
Data discoveryonhadoop@yahoo! hadoopsummit2014
Data discoveryonhadoop@yahoo! hadoopsummit2014Data discoveryonhadoop@yahoo! hadoopsummit2014
Data discoveryonhadoop@yahoo! hadoopsummit2014thiruvel
 

Similar to Big data analytics -hive (6)

Introduction to pig
Introduction to pigIntroduction to pig
Introduction to pig
 
Data Discovery on Hadoop - Realizing the Full Potential of your Data
Data Discovery on Hadoop - Realizing the Full Potential of your DataData Discovery on Hadoop - Realizing the Full Potential of your Data
Data Discovery on Hadoop - Realizing the Full Potential of your Data
 
Introduction to DBMS.pptx
Introduction to DBMS.pptxIntroduction to DBMS.pptx
Introduction to DBMS.pptx
 
Hadoop Summit San Jose 2014: Data Discovery on Hadoop
Hadoop Summit San Jose 2014: Data Discovery on Hadoop Hadoop Summit San Jose 2014: Data Discovery on Hadoop
Hadoop Summit San Jose 2014: Data Discovery on Hadoop
 
Data discoveryonhadoop@yahoo! hadoopsummit2014
Data discoveryonhadoop@yahoo! hadoopsummit2014Data discoveryonhadoop@yahoo! hadoopsummit2014
Data discoveryonhadoop@yahoo! hadoopsummit2014
 
Pig
PigPig
Pig
 

Recently uploaded

BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Valters Lauzums
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Edukaciniai dropshipping via API with DroFx
Big data analytics - Hive

  • 1. WDABT 2016 – BHARATHIAR UNIVERSITY Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Comp. Appll., Bharathiar University,- WDABT 2016
  • 3. component of
  • 4. Why Hive: Structured Data, Large Data Set, MapReduce, Parallel Distribution, Query Data
  • 5. Features of Hive
  • 6. Hive Architecture: User Interface (Web UI, Hive Command Line, HD Insight), Meta Store, Hive QL Process Engine, Execution Engine, HDFS or HBase Storage System
  • 7. Embedded Metastore, Local Metastore, Remote Metastore
  • 8. Hive File Formats
       • Text Files - delimited by parameters
       • Sequence Files - less data
       • RC Files - analytic processing
       • ORC Files - optimized binary file format
  • 9. Hive Query Language (HQL)
       Hive query language offers:
       • Create database
       • Create, manage and partition tables
       • Supports various operators (relational, arithmetic and logical) to evaluate functions
       • Supports DDL and DML
  • 10. DDL (Data Definition Language) Statements
       The DDL commands are listed below:
       • Create, Alter, Drop database
       • Create, Alter, Drop, Truncate table
       • Create, Alter with partitioning and bucketing
       • Create views
       • Show
       • Describe
  • 11. DML (Data Manipulation Language) Statements
       • Loading files
       • Inserting data into Hive tables from queries
  • 12. Database Operations
       Syntax
         CREATE DATABASE IF NOT EXISTS db_name COMMENT 'db_name Details' WITH DBPROPERTIES ('creator' = 'name');
       Example
         CREATE DATABASE IF NOT EXISTS LIBDETS COMMENT 'LIBRARY DETAILS' WITH DBPROPERTIES ('creator' = 'KIRUTHI');
  • 13. Database Operations
       Syntax
         SHOW DATABASES; -- displays the available databases
       Example
         SHOW DATABASES;
       Syntax
         DESCRIBE DATABASE db_name; -- displays the schema of the database
         DESCRIBE DATABASE EXTENDED db_name;
       Example
         DESCRIBE DATABASE LIBDETS;
         DESCRIBE DATABASE EXTENDED LIBDETS;
  • 14. ALTER Database
       Syntax
         -- alters the database properties
         ALTER DATABASE db_name SET DBPROPERTIES ('edited-by' = 'name');
       Example
         ALTER DATABASE LIBDETS SET DBPROPERTIES ('edited-by' = 'KANI');
  • 15. USE, DROP Database
       Syntax
         USE db_name; -- assigns the database as the current working database
       Example
         USE LIBDETS;
       Syntax
         DROP DATABASE db_name; -- deletes the database
       Example
         DROP DATABASE LIBDETS;
  • 16. Tables
       Hive supports two types of tables:
       • Managed Table - stored in the Hive warehouse folder; dropping the table deletes both its data and its metadata
       • External Table - the data stays in the specified location even after the table is dropped; only the metadata is removed
  • 17. Managed Table: Creating Managed Table
       Syntax
         CREATE TABLE IF NOT EXISTS tb_name (column_name data_type, column_name data_type, column_name data_type) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
       Example
         CREATE TABLE IF NOT EXISTS LIBTBL (Member_Code INT, Member_Name STRING, Designation STRING, Dept_code INT, dept_name STRING, group_name STRING, course_name STRING, title STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
  • 18. External Table: Creating External Table
       Syntax
         CREATE EXTERNAL TABLE IF NOT EXISTS tb_name (column_name data_type, column_name data_type, column_name data_type) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/home/usr/filename.format';
       Example
         CREATE EXTERNAL TABLE IF NOT EXISTS LIBTBL (Member_Code INT, Member_Name STRING, Designation STRING, Dept_code INT, course_code INT, dept_name STRING, group_name STRING, course_name STRING, title STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/home/livrith/Desktop/Book2.csv';
       Hive expects LOCATION to name a directory that contains the data files.
  • 19. Loading Data into Table
       Syntax
         -- with LOCAL, the path refers to the local file system rather than HDFS
         LOAD DATA LOCAL INPATH 'local_file_or_directory_path' OVERWRITE INTO TABLE tb_name;
       Example
         LOAD DATA LOCAL INPATH '/home/kiruthika/Documents/Book2.csv' OVERWRITE INTO TABLE LIBTBL;
  • 20. Select clause
       Syntax
         SELECT [ALL | DISTINCT] select_expr, select_expr, ... FROM tb_name [WHERE where_condition] [GROUP BY column_name] [HAVING having_condition] [ORDER BY column_name] [DISTRIBUTE BY column_name] [LIMIT number];
       Example 1
         SELECT * FROM LIBTBL;
       Example 2
         SELECT Member_Name, Designation FROM LIBTBL;
  • 21. Select - where
       Example
         SELECT * FROM LIBUDET WHERE group_name = 'TEACHING' OR group_name = 'student' AND Dept_name >= '18';
       Select - pattern matching with LIKE
       Syntax
         SELECT column1, column2, column3 FROM tb_name WHERE column_name LIKE '%alp%';
       Example
         SELECT PRODUCT, STATE, CITY FROM SALESDETS WHERE City LIKE '%O%';
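The WHERE example above mixes OR and AND without parentheses. In HiveQL, as in standard SQL, AND binds more tightly than OR, so the condition is read as `group_name = 'TEACHING' OR (group_name = 'student' AND Dept_name >= '18')`. The following is a minimal Python sketch of that grouping; the rows are invented for illustration, and Python's `and`/`or` happen to follow the same precedence.

```python
# Hypothetical rows; column names follow the slide, values are invented.
rows = [
    {"group_name": "TEACHING", "Dept_name": "05"},
    {"group_name": "student", "Dept_name": "20"},
    {"group_name": "student", "Dept_name": "10"},
]


def implicit(r):
    # No parentheses: AND is evaluated before OR.
    return (
        r["group_name"] == "TEACHING"
        or r["group_name"] == "student" and r["Dept_name"] >= "18"
    )


def or_first(r):
    # Parentheses force the OR to be evaluated first.
    return (
        r["group_name"] == "TEACHING" or r["group_name"] == "student"
    ) and r["Dept_name"] >= "18"


implicit_hits = [r for r in rows if implicit(r)]
or_first_hits = [r for r in rows if or_first(r)]
```

The two filters return different rows, so adding explicit parentheses to the query is the safer habit whenever OR and AND appear together.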
  • 22. Group by
       Example
         SELECT PRODUCT, COUNT(PRODUCT) AS C1, STATE, COUNTRY FROM SALESDETS GROUP BY PRODUCT, STATE, COUNTRY;
       Order by -- sorts globally, using only one reducer
       Example
         SELECT PRODUCT, STATE, PRICE, COUNTRY FROM SALESDETS ORDER BY COUNTRY;
  • 23. Sort by -- sorts the data within each reducer
       Example
         SELECT PRODUCT, STATE, COUNTRY FROM SALESDETS SORT BY COUNTRY LIMIT 10;
       Having -- filters groups produced by Group By
       Example
         SELECT PRODUCT, COUNT(PRODUCT) AS C1, STATE, COUNTRY FROM SALESDETS GROUP BY PRODUCT, STATE, COUNTRY HAVING C1 > 5;
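As a rough illustration of how GROUP BY and HAVING interact in the example above, the sketch below counts rows per (product, state, country) group and then filters the groups by their count. The rows and the threshold of 2 are invented so that the data stays small; the slide uses `C1 > 5`.

```python
from collections import Counter

# Invented (product, state, country) rows standing in for SALESDETS.
sales = [
    ("Product1", "TX", "US"),
    ("Product1", "TX", "US"),
    ("Product1", "TX", "US"),
    ("Product2", "TX", "US"),
    ("Product1", "ON", "CA"),
]

# GROUP BY PRODUCT, STATE, COUNTRY with COUNT(PRODUCT) AS C1:
# one counter entry per distinct group.
counts = Counter(sales)

# HAVING C1 > threshold: applied to groups, after the counts exist.
threshold = 2
having = {group: c1 for group, c1 in counts.items() if c1 > threshold}
```

WHERE would filter individual rows before grouping, whereas HAVING only sees the aggregated groups.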
  • 24. Limit
       Example
         SELECT PRODUCT, STATE, PRICE, COUNTRY FROM SALESDETS LIMIT 10;
       Distribute by -- distributes rows among reducers
       Syntax
         SELECT column_name1, column_name2, column_name3 FROM tb_name DISTRIBUTE BY column_name SORT BY column_name ASC, column_name ASC LIMIT count;
       Example
         SELECT PRODUCT, PRICE, STATE FROM SALESDETS DISTRIBUTE BY STATE SORT BY STATE ASC, PRODUCT ASC LIMIT 50;
  • 25. Cluster by -- does the job of both Distribute By and Sort By
       Example
         SELECT PRODUCT, PRICE, STATE FROM SALESDETS CLUSTER BY STATE LIMIT 50;
       Difference in Execution of Order By, Sort By, Distribute By, Cluster By
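The difference between the four clauses can be sketched by simulating reducers as Python lists. This is only an illustration under stated assumptions: two reducers, invented rows, round-robin placement for SORT BY, and a toy hash (sum of character codes) in place of whatever hash Hive actually applies.

```python
# Invented (product, state) rows; two simulated reducers.
rows = [("P1", "TX"), ("P2", "CA"), ("P3", "NY"),
        ("P4", "CA"), ("P5", "TX"), ("P6", "NY")]
NUM_REDUCERS = 2


def order_by(rows, key):
    # ORDER BY: every row goes to a single reducer, giving a total order.
    return [sorted(rows, key=key)]


def sort_by(rows, key):
    # SORT BY: rows are spread over reducers with no guarantee about where
    # a given key lands (round-robin here, purely for illustration);
    # each reducer sorts only its own rows.
    parts = [rows[i::NUM_REDUCERS] for i in range(NUM_REDUCERS)]
    return [sorted(p, key=key) for p in parts]


def distribute_by(rows, key):
    # DISTRIBUTE BY: rows with the same key reach the same reducer, unsorted.
    parts = [[] for _ in range(NUM_REDUCERS)]
    for r in rows:
        parts[sum(map(ord, key(r))) % NUM_REDUCERS].append(r)
    return parts


def cluster_by(rows, key):
    # CLUSTER BY: DISTRIBUTE BY plus SORT BY on the same key.
    return [sorted(p, key=key) for p in distribute_by(rows, key)]


def state(r):
    return r[1]


total = order_by(rows, state)
per_reducer = sort_by(rows, state)
clustered = cluster_by(rows, state)
```

In this run `per_reducer` has "NY" rows in both reducers, each sorted locally, while `clustered` keeps every state on exactly one reducer.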
  • 26. Data Aggregation: COUNT, AVG, DISTINCT (AVG), MIN, DISTINCT (MIN), MAX, DISTINCT (MAX)
  • 27. Partitions
       Without partitions, Hive reads the entire dataset from the warehouse even when a filter condition is specified to fetch only particular records. This becomes a bottleneck in MapReduce jobs and involves a high degree of I/O. Partitioning breaks a larger dataset into smaller chunks based on column values. Hive supports two types of partition:
       • Static partition
       • Dynamic partition
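The I/O saving described above can be sketched in a few lines of Python. Hive stores each partition as a subdirectory named `column=value` under the table directory; here a dictionary keyed by that name stands in for the directories. The member names and the choice of `designation` as the partition column are invented for illustration.

```python
# Invented rows: (member_name, designation).
rows = [
    ("Asha", "Professor"),
    ("Ravi", "Student"),
    ("Mala", "Student"),
    ("Kumar", "Librarian"),
]

# Unpartitioned table: every query has to scan all rows.
unpartitioned = list(rows)

# Partitioned table: one "directory" per partition value, named column=value.
partitioned = {}
for name, designation in rows:
    partitioned.setdefault(f"designation={designation}", []).append(name)


def scan_unpartitioned(designation):
    scanned = len(unpartitioned)
    hits = [n for n, d in unpartitioned if d == designation]
    return hits, scanned


def scan_partitioned(designation):
    # Only the matching directory is read; the others are skipped.
    part = partitioned.get(f"designation={designation}", [])
    return list(part), len(part)


full_hits, full_scanned = scan_unpartitioned("Student")
pruned_hits, pruned_scanned = scan_partitioned("Student")
```

Both scans return the same members, but the partitioned one touches two rows instead of four.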
  • 28. Creating partition table
       Syntax
         CREATE TABLE tb_name (column1 data_type, column2 data_type, column3 data_type) COMMENT 'Details of the dataset' PARTITIONED BY (column_name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
       Example
         CREATE TABLE MY_TABLE1 (Member_Name STRING, dept_name STRING, group_name STRING, course_name STRING, title STRING) COMMENT 'User information' PARTITIONED BY (Designation STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
  • 29. Load data into static partition table
       Syntax
         LOAD DATA LOCAL INPATH 'file_path' OVERWRITE INTO TABLE tb_name;
       Example
         LOAD DATA LOCAL INPATH '/home/livrith/Desktop/mytab.csv' OVERWRITE INTO TABLE MY_TABLE2;
  • 30. Set dynamic partition
       The following settings have to be modified to execute dynamic partitions.
       Example
         SET hive.exec.dynamic.partition = true;
         SET hive.exec.dynamic.partition.mode = nonstrict;
  • 31. Insert data - Dynamic partition table
       Syntax
         -- the partition field should be the last attribute when inserting data
         INSERT OVERWRITE TABLE 1st_tb_name PARTITION(column_name) SELECT column_name1, column_name2, column_name3 FROM 2nd_tb_name;
       Example
         INSERT OVERWRITE TABLE MY_TABLE1 PARTITION(Designation) SELECT Member_Name, dept_name, group_name, course_name, title, Designation FROM MY_TABLE2;
  • 32. Bucketing
  • 33. Bucketing
       Bucketing is similar to partitioning. Each bucket is a file. Buckets divide the data by a hash of the specified column's values, whereas partitioning divides the data into smaller blocks by column value.
  • 34. Table creation
       Syntax
         CREATE TABLE IF NOT EXISTS tb_name (column1 data_type, column2 data_type, column3 data_type) CLUSTERED BY (column_name) INTO 3 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
       Example
         CREATE TABLE SALES_BUC1 (Transaction_date TIMESTAMP, Product STRING, Price INT, Payment_Type STRING, Name STRING, City STRING, State STRING, Country STRING, Account_Created TIMESTAMP) CLUSTERED BY (Price) INTO 3 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
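How rows end up in one of the 3 buckets can be sketched as a hash followed by a modulo. The sketch assumes Hive's original bucketing scheme, in which the hash of an INT column is the integer itself and the bucket index is `(hash & Integer.MAX_VALUE) % num_buckets`; newer Hive versions may use a different hash function, and the prices below are invented.

```python
NUM_BUCKETS = 3
INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def bucket_of(price, num_buckets=NUM_BUCKETS):
    # Assumption: the hash of an INT value is the value itself.
    return (price & INT_MAX) % num_buckets


# Invented Price values for a table CLUSTERED BY (Price) INTO 3 BUCKETS.
prices = [1200, 3601, 7502, 1200, 250, 800]

buckets = {b: [] for b in range(NUM_BUCKETS)}
for p in prices:
    buckets[bucket_of(p)].append(p)
```

Equal prices always land in the same bucket file, which is what later allows a query to read one bucket instead of the whole table.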
  • 35. Load data into table
       Syntax
         FROM 1st_tb_name INSERT OVERWRITE TABLE 2nd_tb_name SELECT column_name1, column_name2, column_name3;
       Example
         FROM SALESDETS INSERT OVERWRITE TABLE SALES_BUC1 SELECT Transaction_date, Product, Price, Payment_Type, Name, City, State, Country, Account_Created;
  • 36. Select from bucket table
       Syntax 1
         SELECT DISTINCT column_name FROM tb_name TABLESAMPLE (BUCKET 1 OUT OF 3 ON column_name);
       Example
         SELECT DISTINCT Price FROM SALES_BUC1 TABLESAMPLE (BUCKET 1 OUT OF 3 ON Price);
       Syntax 2
         SELECT DISTINCT column_name FROM tb_name TABLESAMPLE (BUCKET 1 OUT OF 2 ON column_name);
       Example
         SELECT DISTINCT Price FROM SALES_BUC1 TABLESAMPLE (BUCKET 1 OUT OF 2 ON Price);
  • 37. Sampling
       • Sampling is used in Hive to populate a small dataset from an existing large dataset. Sampling selects records randomly to create small datasets.
       Syntax
         SELECT COUNT(*) FROM tb_name TABLESAMPLE (BUCKET 1 OUT OF 3 ON column_name);
       Example
       In the example given below, a sample is created from the table sales_buc using one of the available 3 buckets.
         SELECT COUNT(*) FROM SALES_BUC TABLESAMPLE (BUCKET 1 OUT OF 3 ON Price);
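A rough sketch of what `TABLESAMPLE (BUCKET x OUT OF y ON column)` selects: rows whose hashed column value falls into slot x of y. As an assumption, the hash of an INT is taken to be the integer itself (Hive's original bucketing scheme), and the prices 1 to 12 are invented. Sampling on a column is therefore hash-based and repeatable; Hive also accepts `ON rand()` when a random sample is wanted.

```python
INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def tablesample(rows, x, y, column):
    # BUCKET x OUT OF y ON column: keep rows whose hash lands in slot x,
    # where slots are numbered from 1.
    return [r for r in rows if (r[column] & INT_MAX) % y == x - 1]


# Invented rows with Price values 1..12.
rows = [{"Price": p} for p in range(1, 13)]

sample = tablesample(rows, 1, 3, "Price")
sample_count = len(sample)  # plays the role of SELECT COUNT(*)
```

With three slots, roughly a third of the rows are returned; `BUCKET 1 OUT OF 2` would return about half.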
  • 38. HBase
       • Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable
       • Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS
  • 39. NoSQL Databases
       • NoSQL - Not only SQL, Non-Relational/Non-SQL databases
       • Schema-less
       • Ideology
       • BASE - Basically Available, Eventual Consistency - can support only two; availability, replication
  • 40. NoSQL Types
       • Key-value store - Amazon S3, Riak
       • Document-based store - CouchDB, MongoDB
       • Column-based store - HBase, Cassandra
       • Graph-based stores - Neo4j, OrientDB
  • 41. HBase is Not
       • Table with one primary key (row key)
       • No join operations
       • Limited atomicity and transaction support
       • Manipulated by SQL
  • 42. HBase components
       • Master - manages load balancing and scripting
       • Region server - range of tables assigned by the master
       • ZooKeeper
         • Clients communicate via ZooKeeper for read and write operations in region servers, and for storing node details
         • Region server uses Memstore, similar to cache memory
         • Provides services for synchronization, maintenance