This document discusses using Hadoop for large-scale data processing. It gives an overview of Hadoop and the MapReduce framework, and explains how they distribute work across many nodes so that large amounts of data can be processed efficiently in parallel. It also gives examples of how the British Library has used Hadoop for digital preservation tasks such as format migration and analysis.
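
To make the MapReduce idea concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop code and not taken from the British Library's workflows. It imitates the three stages of a job (map, shuffle, reduce) on a format-counting task of the kind a preservation analysis might run. The file names, the function names, and the use of the file extension as a stand-in for real format identification are all assumptions made for illustration.

```python
from collections import defaultdict
from os.path import splitext


def map_phase(filename):
    """Emit one (format, 1) pair per input record.

    Assumption: the file extension stands in for real format identification.
    """
    ext = splitext(filename)[1].lower().lstrip(".") or "unknown"
    yield ext, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def run_job(filenames):
    """Run map, shuffle, and reduce in sequence inside one process."""
    pairs = [pair for name in filenames for pair in map_phase(name)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical input; a real job would read its records from distributed storage.
FILES = ["a.tiff", "b.TIFF", "c.pdf", "d.jp2", "e.pdf", "README"]

counts = run_job(FILES)
print(counts)
```

On a cluster, the calls to `map_phase` would run in parallel on the nodes that hold the input data, and the framework would perform the shuffle over the network before the reducers run. The sketch keeps everything in one process only to show the data flow.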