SlideShare a Scribd company logo
1 of 16
Pig’s Map Reduce Execution xiafei.qiu@PCA
Agenda Data type Data structure Pig-Latin to Map-Reduce job compilation Physical Plan Execution UDF Invocation
Data Type Tuple An ordered list of Data. DefaultTuple has List<Object> mFields DataBag A collection of Tuples. Memory Manager calls spill() to spill to disk Map – Java Type Integer, Double, etc.. – Java Type
Data Structure
Map-Reduce Compilation Pig-Latin to Logical Plan Parser invoke logicalPlanBuilder Logical Plan to Physical Plan LogToPhyTranslationVisitor  group, distinct:LR-GR-Pack Join: LR-GR-JoinPack(with inner foreach)
Map-Reduce Compilation Physical Plan to Map-Reduce Plan A MROperator stands for a MR job Traverse in topological order If POLoad or GlobalRearrnge, new MR operator/job
Map-Reduce Compilation
Map-Reduce Compilation
Map Execution protectedvoid map(Text key, Tuple inpTuple, Context context) throws IOException, InterruptedException  {      //........... for (PhysicalOperator root : roots) { if (inIllustrator) { if (root != null) {                     root.attachInput(inpTuple); }             } else {                 root.attachInput(tf.newTupleNoCopy(inpTuple.getAll())); } } 		runPipeline(leaf); }
Map Execution protectedvoid runPipeline(PhysicalOperator leaf) throws IOException, InterruptedException { while(true){             Result res = leaf.getNext(DUMMYTUPLE); if(res.returnStatus==POStatus.STATUS_OK){                 collect(outputCollector,(Tuple)res.result); continue; } } //........... }
Reduce Execution protectedvoid reduce(PigNullableWritable key, Iterable<NullableTuple> tupIter, Context context)  throws IOException, InterruptedException  { //........... if (packinstanceofPOJoinPackage) { pack.attachInput(key, tupIter.iterator()); 		while (true) 		{ 		    if (processOnePackageOutput(context)) 			break; } 	} 	else { pack.attachInput(key, tupIter.iterator()); processOnePackageOutput(context); }  }
Reduce Execution publicbooleanprocessOnePackageOutput(Context oc)  throws IOException, InterruptedException  {     Result res = pack.getNext(DUMMYTUPLE); if(res.returnStatus==POStatus.STATUS_OK) {         Tuple packRes = (Tuple)res.result; 	   //........... for (int i = 0; i < roots.length; i++) { roots[i].attachInput(packRes); } runPipeline(leaf);     } if(res.returnStatus==POStatus.STATUS_NULL) { returnfalse; }     //........... if(res.returnStatus==POStatus.STATUS_EOP) { returntrue; }     returnfalse; }
Physical Plan Execution PhysicalPlan extends OperatorPlan<PhysicalOperator> Operation on Graph PhysicalOperator as vertex Each vertex has a group of getNext() methods processInput() if necessary
Physical Plan Execution 	public Result getNext(Tuple t) throwsExecException { //........... 	     Result res = new Result(); try { res.result = loader.getNext(); if(res.result==null){ res.returnStatus = POStatus.STATUS_EOP; tearDown(); } else res.returnStatus = POStatus.STATUS_OK; if (res.returnStatus == POStatus.STATUS_OK) res.result = illustratorMarkup(res, res.result, 0);         } catch (IOException e) { log.error("Received error from loader function: " + e); return res; } return res; }
Physical Plan Execution public Result getNext(Tuple t) throwsExecException {         Result res = null;         Result inp = null; while (true) { inp = processInput(); if (inp.returnStatus == POStatus.STATUS_EOP                     || inp.returnStatus == POStatus.STATUS_ERR) break; illustratorMarkup(inp.result, null, 0); // illustrator ignore LIMIT before the post processing if ((illustrator == null || illustrator.getOriginalLimit() != -1) && soFar>=mLimit) inp.returnStatus = POStatus.STATUS_EOP; soFar++; break; } returninp; }
UDF/Built-In Invocation POUserFunc

More Related Content

What's hot

pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLMark Wong
 
Secuencias Recursivas, Sucesiones Recursivas & Progresiones con Geogebra
Secuencias Recursivas, Sucesiones Recursivas & Progresiones con GeogebraSecuencias Recursivas, Sucesiones Recursivas & Progresiones con Geogebra
Secuencias Recursivas, Sucesiones Recursivas & Progresiones con GeogebraJose Perez
 
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...Altinity Ltd
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenPostgresOpen
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxData
 

What's hot (6)

pg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQLpg_proctab: Accessing System Stats in PostgreSQL
pg_proctab: Accessing System Stats in PostgreSQL
 
Secuencias Recursivas, Sucesiones Recursivas & Progresiones con Geogebra
Secuencias Recursivas, Sucesiones Recursivas & Progresiones con GeogebraSecuencias Recursivas, Sucesiones Recursivas & Progresiones con Geogebra
Secuencias Recursivas, Sucesiones Recursivas & Progresiones con Geogebra
 
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
ClickHouse and the Magic of Materialized Views, By Robert Hodges and Altinity...
 
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres OpenBruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
Bruce Momjian - Inside PostgreSQL Shared Memory @ Postgres Open
 
Full Text Search in PostgreSQL
Full Text Search in PostgreSQLFull Text Search in PostgreSQL
Full Text Search in PostgreSQL
 
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
InfluxDB IOx Tech Talks: Intro to the InfluxDB IOx Read Buffer - A Read-Optim...
 

Similar to Pig Map Reduce Execution

Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...
Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...
Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...CloudxLab
 
Droidjam 2019 flutter isolates pdf
Droidjam 2019 flutter isolates pdfDroidjam 2019 flutter isolates pdf
Droidjam 2019 flutter isolates pdfAnvith Bhat
 
Hadoop本 輪読会 1章〜2章
Hadoop本 輪読会 1章〜2章Hadoop本 輪読会 1章〜2章
Hadoop本 輪読会 1章〜2章moai kids
 
05 pig user defined functions (udfs)
05 pig user defined functions (udfs)05 pig user defined functions (udfs)
05 pig user defined functions (udfs)Subhas Kumar Ghosh
 
Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)
Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)
Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)Kohei KaiGai
 
エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理maruyama097
 
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...julien.ponge
 
20140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture1120140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture11Computer Science Club
 
For problems 3 and 4, consider the following functions that implemen.pdf
For problems 3 and 4, consider the following functions that implemen.pdfFor problems 3 and 4, consider the following functions that implemen.pdf
For problems 3 and 4, consider the following functions that implemen.pdfanjandavid
 
Web Optimization Summit: Coding for Performance
Web Optimization Summit: Coding for PerformanceWeb Optimization Summit: Coding for Performance
Web Optimization Summit: Coding for Performancejohndaviddalton
 
オープンデータを使ったモバイルアプリ開発(応用編)
オープンデータを使ったモバイルアプリ開発(応用編)オープンデータを使ったモバイルアプリ開発(応用編)
オープンデータを使ったモバイルアプリ開発(応用編)Takayuki Goto
 
Gearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copyGearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copyBrian Aker
 
Gearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copyGearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copyBrian Aker
 
스위프트를 여행하는 히치하이커를 위한 스타일 안내
스위프트를 여행하는 히치하이커를 위한 스타일 안내스위프트를 여행하는 히치하이커를 위한 스타일 안내
스위프트를 여행하는 히치하이커를 위한 스타일 안내Jung Kim
 
Functional programming using underscorejs
Functional programming using underscorejsFunctional programming using underscorejs
Functional programming using underscorejs偉格 高
 
Столпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойСтолпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойSigma Software
 
Modern technologies in data science
Modern technologies in data science Modern technologies in data science
Modern technologies in data science Chucheng Hsieh
 

Similar to Pig Map Reduce Execution (20)

Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...
Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...
Writing MapReduce Programs using Java | Big Data Hadoop Spark Tutorial | Clou...
 
Txjs
TxjsTxjs
Txjs
 
Droidjam 2019 flutter isolates pdf
Droidjam 2019 flutter isolates pdfDroidjam 2019 flutter isolates pdf
Droidjam 2019 flutter isolates pdf
 
Hadoop本 輪読会 1章〜2章
Hadoop本 輪読会 1章〜2章Hadoop本 輪読会 1章〜2章
Hadoop本 輪読会 1章〜2章
 
Spark_Documentation_Template1
Spark_Documentation_Template1Spark_Documentation_Template1
Spark_Documentation_Template1
 
05 pig user defined functions (udfs)
05 pig user defined functions (udfs)05 pig user defined functions (udfs)
05 pig user defined functions (udfs)
 
Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)
Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)
Writable Foreign Data Wrapper (JPUG Unconference 16-Feb-2013)
 
エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理
 
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
Java 7 Launch Event at LyonJUG, Lyon France. Fork / Join framework and Projec...
 
20140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture1120140427 parallel programming_zlobin_lecture11
20140427 parallel programming_zlobin_lecture11
 
For problems 3 and 4, consider the following functions that implemen.pdf
For problems 3 and 4, consider the following functions that implemen.pdfFor problems 3 and 4, consider the following functions that implemen.pdf
For problems 3 and 4, consider the following functions that implemen.pdf
 
Web Optimization Summit: Coding for Performance
Web Optimization Summit: Coding for PerformanceWeb Optimization Summit: Coding for Performance
Web Optimization Summit: Coding for Performance
 
オープンデータを使ったモバイルアプリ開発(応用編)
オープンデータを使ったモバイルアプリ開発(応用編)オープンデータを使ったモバイルアプリ開発(応用編)
オープンデータを使ったモバイルアプリ開発(応用編)
 
Gearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copyGearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copy
 
Gearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copyGearmam, from the_worker's_perspective copy
Gearmam, from the_worker's_perspective copy
 
스위프트를 여행하는 히치하이커를 위한 스타일 안내
스위프트를 여행하는 히치하이커를 위한 스타일 안내스위프트를 여행하는 히치하이커를 위한 스타일 안내
스위프트를 여행하는 히치하이커를 위한 스타일 안내
 
Functional programming using underscorejs
Functional programming using underscorejsFunctional programming using underscorejs
Functional programming using underscorejs
 
Столпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай МозговойСтолпы функционального программирования для адептов ООП, Николай Мозговой
Столпы функционального программирования для адептов ООП, Николай Мозговой
 
In kor we Trust
In kor we TrustIn kor we Trust
In kor we Trust
 
Modern technologies in data science
Modern technologies in data science Modern technologies in data science
Modern technologies in data science
 

Recently uploaded

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 

Recently uploaded (20)

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 

Pig Map Reduce Execution

  • 1. Pig’s Map Reduce Execution xiafei.qiu@PCA
  • 2. Agenda Data type Data structure Pig-Latin to Map-Reduce job compilation Physical Plan Execution UDF Invocation
  • 3. Data Type Tuple An ordered list of Data. DefaultTuple has List<Object> mFields DataBag A collection of Tuples. Memory Manager calls spill() to spill to disk Map – Java Type Integer, Double, etc.. – Java Type
  • 5. Map-Reduce Compilation Pig-Latin to Logical Plan Parser invoke logicalPlanBuilder Logical Plan to Physical Plan LogToPhyTranslationVisitor group, distinct:LR-GR-Pack Join: LR-GR-JoinPack(with inner foreach)
  • 6. Map-Reduce Compilation Physical Plan to Map-Reduce Plan A MROperator stands for a MR job Traverse in topological order If POLoad or GlobalRearrnge, new MR operator/job
  • 9. Map Execution protectedvoid map(Text key, Tuple inpTuple, Context context) throws IOException, InterruptedException { //........... for (PhysicalOperator root : roots) { if (inIllustrator) { if (root != null) { root.attachInput(inpTuple); } } else { root.attachInput(tf.newTupleNoCopy(inpTuple.getAll())); } } runPipeline(leaf); }
  • 10. Map Execution protectedvoid runPipeline(PhysicalOperator leaf) throws IOException, InterruptedException { while(true){ Result res = leaf.getNext(DUMMYTUPLE); if(res.returnStatus==POStatus.STATUS_OK){ collect(outputCollector,(Tuple)res.result); continue; } } //........... }
  • 11. Reduce Execution protectedvoid reduce(PigNullableWritable key, Iterable<NullableTuple> tupIter, Context context) throws IOException, InterruptedException { //........... if (packinstanceofPOJoinPackage) { pack.attachInput(key, tupIter.iterator()); while (true) { if (processOnePackageOutput(context)) break; } } else { pack.attachInput(key, tupIter.iterator()); processOnePackageOutput(context); } }
  • 12. Reduce Execution publicbooleanprocessOnePackageOutput(Context oc) throws IOException, InterruptedException { Result res = pack.getNext(DUMMYTUPLE); if(res.returnStatus==POStatus.STATUS_OK) { Tuple packRes = (Tuple)res.result; //........... for (int i = 0; i < roots.length; i++) { roots[i].attachInput(packRes); } runPipeline(leaf); } if(res.returnStatus==POStatus.STATUS_NULL) { returnfalse; } //........... if(res.returnStatus==POStatus.STATUS_EOP) { returntrue; } returnfalse; }
  • 13. Physical Plan Execution PhysicalPlan extends OperatorPlan<PhysicalOperator> Operation on Graph PhysicalOperator as vertex Each vertex has a group of getNext() methods processInput() if necessary
  • 14. Physical Plan Execution public Result getNext(Tuple t) throwsExecException { //........... Result res = new Result(); try { res.result = loader.getNext(); if(res.result==null){ res.returnStatus = POStatus.STATUS_EOP; tearDown(); } else res.returnStatus = POStatus.STATUS_OK; if (res.returnStatus == POStatus.STATUS_OK) res.result = illustratorMarkup(res, res.result, 0); } catch (IOException e) { log.error("Received error from loader function: " + e); return res; } return res; }
  • 15. Physical Plan Execution public Result getNext(Tuple t) throwsExecException { Result res = null; Result inp = null; while (true) { inp = processInput(); if (inp.returnStatus == POStatus.STATUS_EOP || inp.returnStatus == POStatus.STATUS_ERR) break; illustratorMarkup(inp.result, null, 0); // illustrator ignore LIMIT before the post processing if ((illustrator == null || illustrator.getOriginalLimit() != -1) && soFar>=mLimit) inp.returnStatus = POStatus.STATUS_EOP; soFar++; break; } returninp; }