MapReduce is a programming model for processing large datasets in a distributed environment. It originated at Google and enables parallel processing on large clusters of commodity hardware with built-in fault tolerance. A MapReduce job runs in three phases: Map, Shuffle, and Reduce. Mappers consume input key-value pairs and emit intermediate key-value pairs; the shuffle phase then groups all intermediate pairs by key; finally, reducers process each key together with its grouped values and emit the final results (see the word-count sketch below).

HDFS, the Hadoop Distributed File System, stores massive amounts of data reliably on large clusters of commodity hardware through data replication. A single master node (the NameNode) coordinates client access and holds the file-system metadata, while files are split into fixed-size blocks that are stored, with multiple replicas of each block, across the data nodes (DataNodes); a toy placement sketch follows the word-count example below.
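
To make the three phases concrete, here is a minimal single-process sketch of the classic word-count job in Python. The function names (word_count_map, word_count_reduce, run_local) are illustrative, not part of any real MapReduce API; run_local simply mimics Map, Shuffle, and Reduce in memory rather than across a cluster.

```python
from collections import defaultdict
from typing import Iterator, List, Tuple

def word_count_map(_key: str, line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit an intermediate (word, 1) pair for each word."""
    for word in line.split():
        yield (word.lower(), 1)

def word_count_reduce(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce phase: sum every count emitted for the same word."""
    return (word, sum(counts))

def run_local(lines: List[str]) -> List[Tuple[str, int]]:
    """Toy driver that mimics Map -> Shuffle -> Reduce in one process."""
    groups = defaultdict(list)
    for i, line in enumerate(lines):              # map each input record
        for key, value in word_count_map(str(i), line):
            groups[key].append(value)             # shuffle: group pairs by key
    return [word_count_reduce(k, v) for k, v in sorted(groups.items())]

if __name__ == "__main__":
    print(run_local(["the quick brown fox", "the lazy dog"]))
    # [('brown', 1), ('dog', 1), ('fox', 1), ('lazy', 1), ('quick', 1), ('the', 2)]
```

In a real deployment the framework handles the shuffle, partitions keys across many reducer machines, and reruns failed map or reduce tasks, which is where the fault tolerance comes from.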
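
The following sketch illustrates HDFS-style block placement under simplifying assumptions: the round-robin placement policy and the DataNode names are invented for illustration (real HDFS uses rack-aware placement), while the 128 MB block size and replication factor of 3 are the actual HDFS defaults. The returned mapping plays the role of the NameNode's block-location metadata.

```python
import itertools
from typing import Dict, List

BLOCK_SIZE = 128 * 1024 * 1024   # HDFS default block size (128 MB)
REPLICATION = 3                  # HDFS default replication factor
DATANODES = ["dn1", "dn2", "dn3", "dn4", "dn5"]  # hypothetical cluster

def place_blocks(file_size: int) -> Dict[int, List[str]]:
    """Mimic NameNode metadata: map each block index to its replica nodes."""
    num_blocks = -(-file_size // BLOCK_SIZE)      # ceiling division
    nodes = itertools.cycle(DATANODES)            # naive round-robin placement
    return {
        block: [next(nodes) for _ in range(REPLICATION)]
        for block in range(num_blocks)
    }

if __name__ == "__main__":
    # A 300 MB file splits into 3 blocks; each block gets 3 replicas.
    for block, replicas in place_blocks(300 * 1024 * 1024).items():
        print(f"block {block} -> {replicas}")
```

Because every block lives on several DataNodes, the loss of any single machine leaves all data readable, and the NameNode can re-replicate the affected blocks to restore the target replica count.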