Apache Pig

This presentation covers Apache Pig in the context of Big Data: its key features, anatomy, data types, running and execution modes, and when to use Pig and when not to.

What is Pig?
Scripting language
Alternative to MapReduce programming
Developed as a research project at Yahoo in 2006
Platform for data analysis

Key features of Pig
• It provides an engine for executing data flows.
• It provides a language called "Pig Latin".
• It contains operators for many of the traditional data operations.
• It supports user-defined functions (UDFs).

The Anatomy of Pig
Pig Latin
Grunt
Interpreter and execution engine

Pig on Hadoop
HDFS commands
UNIX shell commands
Relational operators
Positional parameters
Common mathematical functions
Custom functions
Complex data structures

Pig Philosophy
Pigs Eat Anything
Pigs Live Anywhere
Pigs are Domestic Animals
Pigs Fly

Pig Latin Overview
Pig Latin Statements
Ex: A = LOAD 'students' AS (rollno, name);
Pig Latin: Keywords
Pig Latin: Identifiers
Pig Latin: Comments
Pig Latin: Case Sensitivity
Operators in Pig Latin
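
The points above can be illustrated with a minimal sketch. It reuses the 'students' input and the alias A from the slide; the alias B is a hypothetical name added for illustration. Keywords such as LOAD and FOREACH are not case sensitive, while aliases and field names are.

```pig
-- A single-line comment starts with two hyphens.
/* A multi-line comment
   uses C-style delimiters. */

-- Every statement ends with a semicolon.
-- 'students' is the input name used on the slide; no schema types are declared here.
A = LOAD 'students' AS (rollno, name);

-- Lowercase keywords work too, but the alias A and the field name must match exactly.
B = foreach A generate name;   -- B is a hypothetical alias

DUMP B;
```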

Data Types in Pig
Simple data types
int
long
float
double
chararray
bytearray
datetime
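
As a small sketch of how the simple types appear in practice, the script below declares a typed schema and casts two fields. The path and field names are borrowed from the Piggy Bank example in this deck; the aliases students and wide are assumptions made for illustration.

```pig
-- Declare a schema using simple types (path and fields taken from the deck's own example).
students = LOAD '/pigdemo/student.tsv'
           AS (rollno:int, name:chararray, gpa:float);

-- Print the schema Pig has recorded for the relation.
DESCRIBE students;

-- Cast int to long and float to double with an explicit cast operator.
wide = FOREACH students GENERATE
           (long)rollno AS rollno_long,
           name,
           (double)gpa  AS gpa_double;

DESCRIBE wide;
```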

Running Pig
Interactive Mode
Grunt shell
Batch Mode
Create a "Pig script"
Save it with the .pig extension

Execution Modes of Pig
Local Mode
MapReduce Mode
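
A rough sketch of a batch script follows, with the usual command lines for each execution mode shown as comments. The file name student_names.pig, the relative input path, and the output directory are hypothetical.

```pig
-- student_names.pig  (hypothetical file name)
--
-- Batch mode, local execution:      pig -x local student_names.pig
-- Batch mode, MapReduce execution:  pig -x mapreduce student_names.pig
-- Interactive mode: run "pig -x local" (or "pig -x mapreduce") without a
-- script to get the grunt> prompt, then type the statements one by one.

-- In local mode the path refers to the local file system;
-- in MapReduce mode it refers to HDFS.
A = LOAD 'student.tsv' AS (rollno:int, name:chararray, gpa:float);
B = FOREACH A GENERATE name;

-- Write the result to a directory (it must not already exist).
STORE B INTO 'student_names';
```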

HDFS Commands
Work with all HDFS commands in the Grunt shell.
Create a directory (ex: fs -mkdir /piglatindemos;)
Sections: Objective, Input, Act, Outcome
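
For illustration, a short Grunt session might look like the sketch below. The directory name comes from the slide; the local file student.tsv is a hypothetical example.

```pig
-- Typed at the grunt> prompt.

-- Create the directory from the slide on HDFS.
fs -mkdir /piglatindemos

-- List the HDFS root to confirm the directory exists.
fs -ls /

-- Copy a local file (hypothetical name) into the new directory.
fs -copyFromLocal student.tsv /piglatindemos/student.tsv

-- UNIX shell commands are available through sh.
sh ls
```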

Relational Operators
FILTER
FOREACH
GROUP
DISTINCT
LIMIT
ORDER BY
JOIN
UNION
SPLIT
SAMPLE
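
The operators listed above can be sketched on the student data used elsewhere in this deck. The second input file, its fields, the GPA threshold, and all alias names are assumptions made for illustration only.

```pig
students = LOAD '/pigdemo/student.tsv'
           AS (rollno:int, name:chararray, gpa:float);
-- Hypothetical second input, used only to demonstrate JOIN.
depts    = LOAD '/pigdemo/dept.tsv'
           AS (rollno:int, dept:chararray);

good    = FILTER students BY gpa >= 3.0;            -- keep matching rows
names   = FOREACH good GENERATE rollno, name;       -- project columns
by_gpa  = GROUP students BY gpa;                    -- one bag of rows per gpa value
uniq    = DISTINCT names;                           -- drop duplicate rows
sorted  = ORDER students BY gpa DESC;               -- sort
top3    = LIMIT sorted 3;                           -- first three rows
joined  = JOIN students BY rollno, depts BY rollno; -- inner join on rollno

-- Route rows into two relations, then put them back together.
SPLIT students INTO low IF gpa < 3.0, high IF gpa >= 3.0;
both    = UNION low, high;

tenth   = SAMPLE students 0.1;                      -- roughly 10% random sample

DUMP top3;
```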

Piggy Bank
Users can share their functions through Piggy Bank.
Use the "register" keyword to make functions from the Piggy Bank jar available in a Pig script.
Ex:
• register '/root/pigdemos/piggybank-0.12.0.jar';
• A = load '/pigdemo/student.tsv' as (rollno:int, name:chararray, gpa:float);
• Upper = foreach A generate org.apache.pig.piggybank.evaluation.string.UPPER(name);
• DUMP Upper;

When to use Pig?
When your data loads are time sensitive.
When you want to process various data sources.
When you want to get analytical insights through sampling.

When not to use Pig?
When your data is completely unstructured, such as video, text, and audio.
When there is a time constraint, because Pig is slower than hand-written MapReduce jobs.

ETL processing

Complex data types
Tuple
Bag
Map
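
A minimal sketch of how each complex type can arise in a script is shown below, assuming the student file from the Piggy Bank example. The alias names are hypothetical; TOTUPLE and TOMAP are Pig built-in functions.

```pig
students = LOAD '/pigdemo/student.tsv'
           AS (rollno:int, name:chararray, gpa:float);

-- Tuple: an ordered set of fields, built here with TOTUPLE.
pairs   = FOREACH students GENERATE TOTUPLE(rollno, name) AS pair;
DESCRIBE pairs;

-- Bag: a collection of tuples. GROUP yields one bag of student rows per key.
grouped = GROUP students BY gpa;
DESCRIBE grouped;

-- Map: key/value pairs with chararray keys, built here with TOMAP.
lookup  = FOREACH students GENERATE TOMAP(name, gpa) AS gpa_by_name;
DESCRIBE lookup;
```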

EVAL Functions
AVG
MAX
COUNT
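
As a sketch, the three functions can be applied to the whole student relation by grouping it with GROUP ... ALL first. The input path and fields come from the Piggy Bank example; the alias names are assumptions. Unlike keywords, these function names are case sensitive and must be written in upper case.

```pig
students = LOAD '/pigdemo/student.tsv'
           AS (rollno:int, name:chararray, gpa:float);

-- Put every row into a single group so the functions see the whole relation.
all_rows = GROUP students ALL;

-- Each function takes a bag: the full bag for COUNT, one projected column for AVG and MAX.
stats = FOREACH all_rows GENERATE
            COUNT(students)   AS num_students,
            AVG(students.gpa) AS avg_gpa,
            MAX(students.gpa) AS max_gpa;

DUMP stats;
```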
