Pig is a platform for analyzing large data sets that operates on Hadoop. It provides a high-level language called Pig Latin for expressing data analysis programs, as well as infrastructure for evaluating these programs. Pig Latin scripts are compiled into sequences of MapReduce jobs for execution on Hadoop clusters.