Agenda
  Data type
  Data structure
  Pig-Latin to Map-Reduce job compilation
  Physical Plan Execution
  UDF Invocation
Data Type
  Tuple: an ordered list of data fields. DefaultTuple stores them in List<Object> mFields.
  DataBag: a collection of Tuples. The Memory Manager calls spill() to spill a bag to disk.
  Map: backed by a Java type.
  Integer, Double, etc.: backed by Java types.
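The tuple and bag described above can be sketched roughly as follows. This is a minimal illustration, not Pig's actual classes: SketchTuple, SketchBag, and their methods are hypothetical names, and the spill format (one text line per tuple in a temp file) is an assumption made only to keep the example short.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a tuple: an ordered list of fields.
final class SketchTuple {
    private final List<Object> fields = new ArrayList<>();

    void append(Object value) { fields.add(value); }
    Object get(int index) { return fields.get(index); }
    int size() { return fields.size(); }

    @Override
    public String toString() { return fields.toString(); }
}

// Hypothetical stand-in for a bag: a collection of tuples that can be
// moved out of memory when something else asks it to spill.
final class SketchBag {
    private final List<SketchTuple> inMemory = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();
    private long spilledCount = 0;

    void add(SketchTuple tuple) { inMemory.add(tuple); }

    // What a memory manager would call under memory pressure.
    // Writes the in-memory tuples to a temp file and returns how many were spilled.
    long spill() {
        if (inMemory.isEmpty()) {
            return 0;
        }
        try {
            Path file = Files.createTempFile("bag-spill", ".txt");
            file.toFile().deleteOnExit();
            List<String> lines = new ArrayList<>();
            for (SketchTuple t : inMemory) {
                lines.add(t.toString());
            }
            Files.write(file, lines);
            spillFiles.add(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        long n = inMemory.size();
        spilledCount += n;
        inMemory.clear();
        return n;
    }

    // Total tuples in the bag, whether in memory or on disk.
    long size() { return spilledCount + inMemory.size(); }

    int inMemoryCount() { return inMemory.size(); }
}
```

The point of the design is that the bag keeps reporting the same logical size after a spill; only where the tuples live changes.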
Map-Reduce Compilation
  Pig-Latin to Logical Plan
    The parser invokes LogicalPlanBuilder.
  Logical Plan to Physical Plan
    LogToPhyTranslationVisitor
    group, distinct: LR-GR-Pack (local rearrange, global rearrange, package)
    join: LR-GR-JoinPack (with inner foreach)
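To make the LR-GR-Pack pattern for group more concrete, here is a small in-memory sketch of the three stages. It is only an illustration under simplifying assumptions: GroupSketch and its methods are hypothetical names, records are plain lists, the grouping key is assumed to be field 0, and the "global rearrange" is an ordinary map rather than a real shuffle.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupSketch {
    // Local rearrange: pair each record with its grouping key (assumed: field 0).
    static List<Map.Entry<Object, List<Object>>> localRearrange(List<List<Object>> records) {
        List<Map.Entry<Object, List<Object>>> keyed = new ArrayList<>();
        for (List<Object> record : records) {
            keyed.add(new AbstractMap.SimpleEntry<>(record.get(0), record));
        }
        return keyed;
    }

    // Global rearrange: bring together all records that share a key.
    // A LinkedHashMap stands in for the shuffle and keeps first-seen key order.
    static Map<Object, List<List<Object>>> globalRearrange(
            List<Map.Entry<Object, List<Object>>> keyed) {
        Map<Object, List<List<Object>>> byKey = new LinkedHashMap<>();
        for (Map.Entry<Object, List<Object>> e : keyed) {
            byKey.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return byKey;
    }

    // Package: emit one (key, bag-of-records) output row per key.
    static List<List<Object>> pack(Map<Object, List<List<Object>>> byKey) {
        List<List<Object>> out = new ArrayList<>();
        for (Map.Entry<Object, List<List<Object>>> e : byKey.entrySet()) {
            List<Object> row = new ArrayList<>();
            row.add(e.getKey());
            row.add(e.getValue());
            out.add(row);
        }
        return out;
    }

    // Convenience wrapper chaining the three stages.
    static List<List<Object>> group(List<List<Object>> records) {
        return pack(globalRearrange(localRearrange(records)));
    }
}
```

Calling GroupSketch.group on the records (a, 1), (b, 2), (a, 3) produces one row per key, each holding the key and the list of records that carried it.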
Map-Reduce Compilation
  Physical Plan to Map-Reduce Plan
    An MROperator stands for one MR job.
    The physical plan is traversed in topological order.
    On a POLoad or GlobalRearrange, a new MR operator (job) is created.
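The traversal rule on this slide can be sketched as below. This is a literal, simplified reading of the slide and not Pig's compiler: MrPlanSketch, Kind, topoOrder, and split are hypothetical names, operators are identified by array index, and the real compiler has to handle more cases than this single rule covers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class MrPlanSketch {
    // Hypothetical operator kinds for a small physical plan.
    enum Kind { LOAD, FILTER, LOCAL_REARRANGE, GLOBAL_REARRANGE, PACKAGE, STORE }

    // Kahn's algorithm: returns operator indices in topological order.
    // edges[i] = {from, to} means operator 'from' feeds operator 'to'.
    static List<Integer> topoOrder(int n, int[][] edges) {
        int[] inDegree = new int[n];
        List<List<Integer>> successors = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            successors.add(new ArrayList<>());
        }
        for (int[] e : edges) {
            successors.get(e[0]).add(e[1]);
            inDegree[e[1]]++;
        }
        Deque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < n; i++) {
            if (inDegree[i] == 0) ready.add(i);
        }
        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int op = ready.remove();
            order.add(op);
            for (int next : successors.get(op)) {
                if (--inDegree[next] == 0) ready.add(next);
            }
        }
        return order;
    }

    // Walk the plan in topological order and start a new segment whenever
    // a LOAD or GLOBAL_REARRANGE is seen, as the slide's rule states.
    static List<List<Kind>> split(Kind[] ops, int[][] edges) {
        List<List<Kind>> segments = new ArrayList<>();
        for (int i : topoOrder(ops.length, edges)) {
            Kind kind = ops[i];
            boolean startsNew = kind == Kind.LOAD || kind == Kind.GLOBAL_REARRANGE;
            if (startsNew || segments.isEmpty()) {
                segments.add(new ArrayList<>());
            }
            segments.get(segments.size() - 1).add(kind);
        }
        return segments;
    }
}
```

For a linear plan of load, filter, local rearrange, global rearrange, package, store, this yields two segments, with the second one beginning at the global rearrange.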