Hive hcatalog

Hive hcatalog Presentation Transcript

  • 1. @alepoletto
  • 2. Hive
  • 3. Hive – What is it?
      • A data warehouse layer built on top of Hadoop
      • Defines structure for your unstructured big data
      • Lets you query that data using an SQL-like language, HiveQL
  • 4. Hive – is not… a relational database
      • It uses a relational database to store metadata
      • The data that Hive processes is stored in HDFS
  • 5. Hive – is not… designed for online transactions
      • Runs on Hadoop (a batch processing system)
      • Jobs can have high latency and overhead
  • 6. Hive – is not… for real-time queries and row updates
      • Suited to batch jobs over large sets of immutable data
  • 7. Hive – What it does
      • Hadoop was built to organize and store massive amounts of data.
      • A Hadoop cluster is a reservoir of heterogeneous data, from multiple sources and in different formats.
      • Hive allows the user to explore and structure that data, analyze it, and then turn it into business insight.
  • 8. Hive – Architecture
  • 9. Hive – Tables
      • Hive tables
      • Data: in files in HDFS
      • Schema: metadata stored in relational tables
      • Schema and data are kept separate
      • Hive needs a schema for existing HDFS data
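The separation of schema and data on the Tables slide can be illustrated with a small Python sketch. It is only an analogy, not how Hive is implemented: `RAW_DATA` stands in for a file already stored in HDFS, `SCHEMA` stands in for the metadata kept in relational tables, and both names, like the sample values, are hypothetical.

```python
import csv
import io

# Hypothetical raw file contents, standing in for a file already in HDFS.
RAW_DATA = "1,alice,34.5\n2,bob,12.0\n"

# Hypothetical schema, standing in for the metadata kept apart from the data.
# It is applied only when the data is read.
SCHEMA = [("id", int), ("name", str), ("amount", float)]


def read_with_schema(raw, schema):
    """Apply a schema to raw comma-delimited text at read time."""
    rows = []
    for fields in csv.reader(io.StringIO(raw)):
        # Pair each raw field with its column name and convert it to the column type.
        rows.append({name: cast(value) for (name, cast), value in zip(schema, fields)})
    return rows


rows = read_with_schema(RAW_DATA, SCHEMA)
```

The raw text never changes; swapping in a different `SCHEMA` changes how the same bytes are interpreted, which is the sense in which existing data needs a schema before it can be queried.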
  • 10. @alepoletto
  • 11. Hive – Pig x Hive
      • Pig is good for:
          • ETL
          • Preparing data for easier analysis
          • Long series of steps to perform
      • Hive is for:
          • Querying data
          • Getting answers to specific questions
          • Users who are familiar with SQL
  • 12. Hive – HiveQL
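The transcript does not preserve the content of the HiveQL slide, so the following is only a rough sketch of what typical HiveQL statements look like. They are held in Python strings so they could be passed to whichever Hive client is in use. The table name `page_views`, its columns, and the HDFS path are hypothetical.

```python
# Hypothetical table name, columns, and HDFS path, chosen only for illustration.
# The first statement declares a schema over files that already exist in HDFS.
CREATE_TABLE = """
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id BIGINT,
  url STRING,
  view_time STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
STORED AS TEXTFILE
LOCATION '/data/page_views'
""".strip()

# The second statement is an SQL-like batch query over that table.
TOP_URLS = """
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10
""".strip()


def statements():
    """Return the statements in the order they would be run."""
    return [CREATE_TABLE, TOP_URLS]
```

Because the table is declared `EXTERNAL`, dropping it would remove only the metadata and leave the files at the `LOCATION` path in place.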
  • 13. @alepoletto
  • 14. HCatalog – What it does
      • Metadata and table management system for Hadoop
      • Shared schema and data type mechanism for different Hadoop tools such as Pig, Hive, and MapReduce
      • Interoperability across data processing tools
      • Table abstraction, so you don't need to worry about where and how the data is stored
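The idea of one shared table definition serving several tools can be modeled with a toy Python sketch. This is not the HCatalog API: `TableInfo`, `CATALOG`, and `describe_for_tool` are hypothetical names, and the table entry is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Toy stand-in for the metadata a shared catalog keeps about one table."""
    columns: tuple          # (name, type) pairs
    location: str           # where the data lives
    storage_format: str     # how the data is stored


# Hypothetical catalog with a single table entry.
CATALOG = {
    "page_views": TableInfo(
        columns=(("user_id", "bigint"), ("url", "string")),
        location="/data/page_views",
        storage_format="textfile",
    )
}


def describe_for_tool(tool, table):
    """Every tool looks the table up by name and gets the same schema."""
    info = CATALOG[table]
    cols = ", ".join(f"{name}:{col_type}" for name, col_type in info.columns)
    return f"{tool} reads {table}({cols})"
```

A caller passes only the table name. The location and storage format stay inside the catalog entry, which mirrors the table abstraction described on the slide.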
  • 15. HCatalog – Summary
      • “Takes Hive metadata and opens it to everybody else”
  • 16. HCatalog – Overview
      • Access data through HCatalog
  • 17. HCatalog – Architecture
  • 18. @alepoletto