- 1. BIG DATA ANALYSIS USING HADOOP SEMINAR: HIVE QL BY: SHREYA JAISWAL [ENG20CA0042] NANDINI GARG [ENG20CA0023] 2022-2023
- 2. Content Overview
  › Introduction
  › Difference between Hive and RDBMS
  › HiveQL
  › Difference between SQL and HiveQL
  › HiveQL built-in operators
- 3. INTRODUCTION HIVE: Hive is data warehouse software built on Hadoop that provides data query and analysis. It offers an SQL-like interface for querying data stored in the various databases and file systems that integrate with Hadoop, which makes it practical to query and manage large datasets. It is commonly used as an ETL tool in the Hadoop ecosystem.
- 4. Difference between RDBMS and Hive

  | RDBMS | Hive |
  | --- | --- |
  | It is used to maintain a database. | It is used to maintain a data warehouse. |
  | It uses SQL (Structured Query Language). | It uses HQL (Hive Query Language). |
  | Schema is fixed. | Schema varies. |
  | Normalized data is stored. | Both normalized and de-normalized data are stored. |
  | Tables in an RDBMS are sparse. | Tables in Hive are dense. |
- 5. HIVE QUERY LANGUAGE (HIVE QL): HiveQL is the query language of Apache Hive, used for processing and analyzing structured data. It is a mixture of SQL-92, MySQL, and Oracle's SQL. It is very similar to SQL, and it scales to large datasets because queries run on the Hadoop cluster. It reuses familiar concepts from the relational database world, such as tables, rows, columns and schema, to ease learning.
- 6.
  › Hive provides a CLI for writing queries in Hive Query Language (HiveQL).
  › DDL and DML are both parts of HiveQL.
  › Data Definition Language (DDL) is used for creating, altering and dropping databases, tables, views, functions and, in Hive releases before 3.0, indexes.
  › Most interactions take place over the command line interface (CLI). HiveQL syntax is generally similar to the SQL syntax that most data analysts already know.
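The DDL and DML split described on this slide can be sketched as follows. This is only an illustration: Python's built-in SQLite stands in for Hive so the snippet runs without a Hadoop cluster, and the `employees` table and its rows are hypothetical. The statements use plain SQL forms that HiveQL also offers, but they were not run against a real Hive installation.

```python
import sqlite3

# SQLite is a stand-in for Hive here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: define a table. Table and column names are hypothetical.
cur.execute("CREATE TABLE employees (id INT, name STRING, salary DOUBLE)")

# DML: insert rows. The ? placeholders belong to Python's sqlite3 module.
cur.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Asha", 52000.0), (2, "Ravi", 48000.0)],
)

# Query: read the data back with a filter.
rows = cur.execute(
    "SELECT name, salary FROM employees WHERE salary > 50000"
).fetchall()
print(rows)

# DDL: drop the table again, then list the tables that are left.
cur.execute("DROP TABLE employees")
remaining = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)
```

On Hive the same statements would normally be typed at the CLI rather than sent through a Python driver.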
- 8. Difference between SQL and HiveQL

  | On the basis of | SQL | HiveQL |
  | --- | --- | --- |
  | Update commands in table structure | Update, delete, insert | Update, delete, insert |
  | Manages | Relational data | Data structures |
  | Transactions | Supported | Limited support |
  | Indexes | Supported | Supported |
  | Data types | Five data types in total: integral, floating-point, fixed-point, text and binary strings, temporal | Boolean, integral, floating-point, fixed-point, timestamp (nanosecond precision), date, text and binary strings, temporal, array, map, struct, union |
  | Functions | Hundreds of built-in functions | Hundreds of built-in functions |
  | MapReduce | Not supported | Supported |
- 9. HiveQL Built-in Operators
  › Hive provides built-in operators for data operations on the tables in the Hive warehouse.
  › These operators act on their operands and return a value according to the logic applied.
  › The main types of built-in operators in HiveQL are:
    • Relational Operators
    • Arithmetic Operators
    • Logical Operators
    • Operators on Complex Types
- 10. RELATIONAL OPERATORS IN HIVEQL We use relational operators to compare two operands: equals, not equals, less than, greater than, and so on. Most of these operators accept operands of any primitive type, not only number types.
- 11. The following table gives details about relational operators and their usage in HiveQL:

  | Built-in Operator | Description | Operand |
  | --- | --- | --- |
  | X = Y | TRUE if expression X is equivalent to expression Y; otherwise FALSE. | All primitive types |
  | X != Y | TRUE if expression X is not equivalent to expression Y; otherwise FALSE. | All primitive types |
  | X < Y | TRUE if expression X is less than expression Y; otherwise FALSE. | All primitive types |
  | X <= Y | TRUE if expression X is less than or equal to expression Y; otherwise FALSE. | All primitive types |
  | X > Y | TRUE if expression X is greater than expression Y; otherwise FALSE. | All primitive types |
  | X >= Y | TRUE if expression X is greater than or equal to expression Y; otherwise FALSE. | All primitive types |
  | X IS NULL | TRUE if expression X evaluates to NULL; otherwise FALSE. | All types |
  | X IS NOT NULL | FALSE if expression X evaluates to NULL; otherwise TRUE. | All types |
  | X REGEXP Y | Same as RLIKE: TRUE if any substring of X matches the Java regular expression Y. | Strings only |
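A small sketch of how these relational operators behave in a WHERE clause. SQLite (from Python's standard library) is used as a stand-in so the example runs anywhere; it is not Hive. The `people` table is hypothetical, and the `REGEXP` function is registered by hand with Python's `re.search`, which only approximates the Java regular expressions Hive uses.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for a Hive session (assumption)

def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X); a NULL input stays NULL.
    if value is None:
        return None
    return int(re.search(pattern, value) is not None)

conn.create_function("REGEXP", 2, regexp)

# Hypothetical sample data; Ravi has no city (NULL).
conn.execute("CREATE TABLE people (name TEXT, age INT, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Asha", 31, "Bengaluru"), ("Ravi", 25, None), ("Meena", 31, "Mumbai")],
)

def names(condition):
    """Return the names of the rows matching the given WHERE condition."""
    sql = f"SELECT name FROM people WHERE {condition} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

conditions = [
    "age = 31", "age != 31", "age < 31", "age >= 31",
    "city IS NULL", "city IS NOT NULL", "name REGEXP '^M'",
]
results = {condition: names(condition) for condition in conditions}
for condition, matched in results.items():
    print(f"{condition:20} -> {matched}")
```

Each condition keeps only the rows for which the operator evaluates to TRUE, which is how the operators in the table are typically used.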
- 12. HiveQL Arithmetic Operators We use arithmetic operators to perform arithmetic operations, such as addition, subtraction, multiplication and division, on operands. The operands of these operators are all number types. Sample example: 2 + 3 gives the result 5. In this example, '+' is the operator, 2 and 3 are the operands, and the return value is 5.
- 13. The following table gives details about arithmetic operators in Hive Query Language:

  | Built-in Operator | Description | Operand |
  | --- | --- | --- |
  | X + Y | Returns the result of adding X and Y. | All number types |
  | X - Y | Returns the result of subtracting Y from X. | All number types |
  | X * Y | Returns the result of multiplying X and Y. | All number types |
  | X / Y | Returns the result of dividing X by Y. | All number types |
  | X % Y | Returns the remainder from dividing X by Y. | All number types |
  | X & Y | Returns the bitwise AND of X and Y. | All number types |
  | X \| Y | Returns the bitwise OR of X and Y. | All number types |
  | X ^ Y | Returns the bitwise XOR of X and Y. | All number types |
  | ~X | Returns the bitwise NOT of X. | All number types |
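To make the table concrete, here is a worked example in plain Python, whose arithmetic and bitwise operators happen to use the same symbols. This is not HiveQL. The operands 7 and 2 are arbitrary positive integers chosen for illustration, since results for negative operands (for example with `%`) may differ between languages.

```python
# Arbitrary sample operands (assumption for illustration only).
x, y = 7, 2

results = {
    "X + Y": x + y,  # addition
    "X - Y": x - y,  # subtraction
    "X * Y": x * y,  # multiplication
    "X / Y": x / y,  # division (true division in Python)
    "X % Y": x % y,  # remainder of X divided by Y
    "X & Y": x & y,  # bitwise AND: 0b111 & 0b010
    "X | Y": x | y,  # bitwise OR:  0b111 | 0b010
    "X ^ Y": x ^ y,  # bitwise XOR: 0b111 ^ 0b010
    "~X": ~x,        # bitwise NOT of X
}

for expression, value in results.items():
    print(f"{expression:6} = {value}")
```

With these operands the values are 9, 5, 14, 3.5, 1, 2, 7, 5 and -8 respectively.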
- 14. HiveQL Logical Operators We use logical operators to perform logical operations, such as AND, OR and NOT, on operands. The operands of these operators are all of BOOLEAN type.
- 15. The following table gives details about logical operators in HiveQL:

  | Operators | Description | Operands |
  | --- | --- | --- |
  | X AND Y | TRUE if both X and Y are TRUE; otherwise FALSE. | Boolean types only |
  | X && Y | Same as X AND Y, written with the && symbol. | Boolean types only |
  | X OR Y | TRUE if either X or Y or both are TRUE; otherwise FALSE. | Boolean types only |
  | X \|\| Y | Same as X OR Y, written with the \|\| symbol. | Boolean types only |
  | NOT X | TRUE if X is FALSE; otherwise FALSE. | Boolean types only |
  | !X | Same as NOT X, written with the ! symbol. | Boolean types only |

  Note: newer Hive releases use \|\| for string concatenation, so the keyword forms AND, OR and NOT are the safer spelling.
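The descriptions in the table amount to the usual truth table, which can be enumerated with a few lines of Python. This sketch covers only TRUE and FALSE inputs and uses Python's own `and`, `or` and `not`; it is not HiveQL.

```python
from itertools import product

# Every combination of X and Y, with the results of AND, OR and NOT X.
truth_table = [
    (x, y, x and y, x or y, not x)
    for x, y in product([True, False], repeat=2)
]

print(f"{'X':6}{'Y':6}{'X AND Y':9}{'X OR Y':8}{'NOT X':6}")
for x, y, both, either, negated in truth_table:
    print(f"{str(x):6}{str(y):6}{str(both):9}{str(either):8}{str(negated):6}")
```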
- 16. OPERATORS ON COMPLEX TYPES The following table gives details about complex type operators, which provide a mechanism for accessing elements in complex types.

  | Operators | Operands | Description |
  | --- | --- | --- |
  | A[n] | A is an Array and n is an integer type. | Returns the nth element of the array A. The first element has index 0. |
  | M[key] | M is a Map<K, V> and key has type K. | Returns the value that belongs to the key in the map. |
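As a rough analogy, a Python list and dict are accessed with the same bracket notation and the same zero-based indexing. The sketch below uses hypothetical sample values and is not HiveQL; how Hive handles a missing key or an out-of-range index is not covered here.

```python
# A plays the role of an Array, M the role of a Map<K, V> (sample values are hypothetical).
A = ["red", "green", "blue"]
M = {"IN": "India", "JP": "Japan"}

first = A[0]       # index 0 is the first element
third = A[2]       # index 2 is the third element
country = M["IN"]  # the value stored under the key "IN"

print(first, third, country)
```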
- 17. THANK YOU.

RDBMS stands for Relational Database Management System. An RDBMS is a type of database management system designed specifically for relational databases, so RDBMS is a subset of DBMS. A relational database stores data in a structured format using rows and columns, and that structured form is known as a table.
SQL-92 was the third revision of the SQL database query language. MySQL is an open-source relational database management system. Oracle's procedural language extension to SQL is called PL/SQL.
HiveQL is the query language Hive uses to process and analyze structured data whose table metadata is held in the metastore. SQL is a domain-specific language designed for managing data held in a relational database management system (RDBMS). It is also useful for handling structured data, i.e., data incorporating relations among entities and variables. SQL is a standard language for storing, manipulating, and retrieving data in databases. MapReduce is a programming paradigm that enables massive scalability across hundreds or thousands of servers in a Hadoop cluster.
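The MapReduce paradigm mentioned above can be sketched on a single machine with the classic word-count example. This is a toy illustration in plain Python, not Hadoop code: the function names and the sample lines are hypothetical, and a real cluster would run the map and reduce steps in parallel on many servers.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)

# Hypothetical input; on a cluster each line could live on a different server.
lines = [
    "hive runs on hadoop",
    "hadoop stores big data",
    "hive queries big data",
]

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each map call only looks at its own line and each reduce call only at its own key, the work can be spread over many machines, which is where the scalability comes from.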
- nandini
- shreya