This document discusses data and process scheduling in Hadoop. It provides examples of loading data into Pig from HDFS, Hive, and Avro sources and of querying that data. It also discusses switching file formats from ORC and includes a diagram of the data flow from raw to presented data. Finally, it mentions the Apache Falcon project for managing Hadoop data pipelines and notes its adoption and future enhancements.