Pig
Casting, Reference
Casting
Casting converts data from one type to another, as long as the
conversion is supported. For example, suppose we have an
integer field (int) that we want to convert to a string. We can
cast this field from int to chararray by prefixing it with
(chararray).
For example:
grunt> select = foreach data generate $0, (chararray)$4, (chararray)$5;
grunt> dump select;
(ryan,67,57)
(Bob,77,75)
(Alica,68,)
(Bryan,81,79)
(Kate,66,69)
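As a minimal sketch of the same cast with a declared schema, the script below assumes a hypothetical comma-separated file `students.csv` and hypothetical field names that mirror the sample table; neither appears in the original slides. DESCRIBE prints the schema, which is a quick way to check that the cast fields are now chararray.

```pig
-- Hypothetical input file and field names (assumptions, not from the slides).
data = LOAD 'students.csv' USING PigStorage(',')
       AS (name:chararray, age:int, school:chararray,
           location:chararray, score1:int, score2:int);

-- Cast the two integer score fields to chararray and name the results.
scores_text = FOREACH data GENERATE
                  name,
                  (chararray)score1 AS score1_text,
                  (chararray)score2 AS score2_text;

-- Print the schema of the new relation to check the field types.
DESCRIBE scores_text;
```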
Rupak Roy
Reference field by position
We can refer to data fields by name as well as
by their positions ($0, $1, ...).
$0     $1   $2               $3          $4            $5
Name   Age  School           Location    Test Score 1  Test Score 2
Ryan   22   St.JohnsSchool   NewAvenue   67            57
Bob    23   St.EdumndSchool  Downtown    77            75
Alica  Na   Don Bosco        ParkAvenue  65            79
Bryan  24   St.JhonsSchool   NewAvenue   81            79
Kate   22   Don Bosco        ParkAvenue  66            69
-- filter the data by age >= 22
grunt> age = FILTER data by $1 >= 22;
grunt> dump age;
Here we reference the age column by its position, $1. If a schema
is defined, we can also reference the column directly by name:
grunt> age = FILTER data by age >= 22;
However, referencing columns by name can become tedious when
we are dealing with large datasets that have complex
column names.
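The two styles of reference can be compared side by side. This is a sketch only: the file name `students.csv` and the field names are hypothetical, chosen to match the sample table. With this schema, `$1` and `age` refer to the same field, so both filters should keep the same rows.

```pig
-- Hypothetical input file and field names (assumptions, not from the slides).
data = LOAD 'students.csv' USING PigStorage(',')
       AS (name:chararray, age:int, school:chararray,
           location:chararray, score1:int, score2:int);

-- Reference the second field by position.
by_position = FILTER data BY $1 >= 22;

-- Reference the same field by the name declared in the schema.
by_name = FILTER data BY age >= 22;

DUMP by_position;
DUMP by_name;
```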
-- filter the data by test score 1 <= 66
grunt> testscore = FILTER data by $4 <= 66;
grunt> dump testscore;
The output shows only one record,
(kate,22,Don bosco,ParkAvenue,66,69), even though
the original dataset has another record with
Test Score 1 <= 66: Alica's.
This is because the column values are separated
by commas (,) when the data is loaded, and Alica's
row has no value in the 2nd column. The next value
after the comma, Don Bosco, was therefore taken as
the value of column $1 ('age'), and every later
value moved one position to the left, so $4 in
Alica's row no longer holds Test Score 1.
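One way to spot rows like Alica's before filtering is to count the fields on each line. The sketch below is an assumption-laden illustration, not part of the original slides: it assumes a hypothetical file `students.csv`, reads each line as a single chararray with TextLoader, splits it on commas with STRSPLIT, and keeps the lines that do not have the expected six fields. The limit of -1 is passed so that trailing empty fields are still counted.

```pig
-- Hypothetical input file (assumption, not from the slides).
lines = LOAD 'students.csv' USING TextLoader() AS (line:chararray);

-- Split each line on commas and count the resulting fields.
counted = FOREACH lines GENERATE
              line,
              SIZE(STRSPLIT(line, ',', -1)) AS n_fields;

-- The sample table has six columns ($0 to $5); keep rows that differ.
short_rows = FILTER counted BY n_fields != 6;

DUMP short_rows;
```

Note that PigStorage loads an empty field between two delimiters as null, so a shift of this kind typically means the value and its comma are both absent from the row.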
Select columns based on their position
grunt> select = foreach data generate $0,$4,$5;
grunt> dump select;
(ryan,67,57)
(Bob,77,75)
(Alica,68,)
(Bryan,81,79)
(Kate,66,69)
Select columns using reference
grunt> select_all= foreach data generate *;
grunt> dump select_all;
grunt> select_range = foreach data generate $0..$1;
grunt> dump select_range;
(Name,age)
(Ryan,22)
(bob,23)
(Alica,Don Bosco)
(Bryan,24)
(kate,22)
The output shows Don Bosco instead of
an age because the 2nd value (age) in
Alica's row is missing, so the next
value is taken as the value of the 2nd
column, 'age'. It is advisable to mark
missing values as NA/NIL so that they
are not confused with the values of
other columns.
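A sketch of how a marked missing value can then be handled, assuming a hypothetical file `students.csv` in which Alica's age is written as NA and the age field is declared as int. Under those assumptions Pig cannot parse the placeholder as an integer and should load it as null, which can be tested with IS NULL and IS NOT NULL. The file name and field names are illustrative and do not come from the slides.

```pig
-- Hypothetical input file and field names (assumptions, not from the slides).
data = LOAD 'students.csv' USING PigStorage(',')
       AS (name:chararray, age:int, school:chararray,
           location:chararray, score1:int, score2:int);

-- Rows whose age could not be read as an int (for example, NA).
missing_age = FILTER data BY age IS NULL;

-- Rows with a usable age that also meet the age condition.
known_age = FILTER data BY age IS NOT NULL AND age >= 22;

DUMP missing_age;
DUMP known_age;
```

Because the placeholder keeps its own position between the commas, the remaining columns stay aligned and the test scores can still be filtered correctly.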
Reference range of columns/fields
grunt> leftsidedata = foreach data generate ..$1;
grunt> middle = foreach data generate $0 .. $2;
grunt> from_last= foreach data generate $2.. ;
grunt> random = foreach data generate $0, $4 .. $5;
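When a schema is declared, the same range expressions can be written with field names instead of positions. The following is a minimal sketch that assumes a hypothetical file `students.csv` and hypothetical field names matching the sample table.

```pig
-- Hypothetical input file and field names (assumptions, not from the slides).
data = LOAD 'students.csv' USING PigStorage(',')
       AS (name:chararray, age:int, school:chararray,
           location:chararray, score1:int, score2:int);

-- From the first field up to and including age.
left_side = FOREACH data GENERATE .. age;

-- From name through school.
middle = FOREACH data GENERATE name .. school;

-- From school to the last field.
from_school = FOREACH data GENERATE school ..;

-- A single field followed by a range.
name_and_scores = FOREACH data GENERATE name, score1 .. score2;

DESCRIBE name_and_scores;
```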
If a schema is not defined while loading the dataset, we can still assign
types to the fields in a query. For example:
grunt> random = foreach data generate (chararray)$0, (chararray)$3;
Alternatively, we can also assign an alias name to each field:
grunt> random = foreach data generate (chararray)$0 as FC, (chararray)$3 as LC;
grunt> describe random;
grunt> kate = FILTER random by FC == 'Kate';
Next
We will learn about Pig relational operators and
how to use them.